TRAINING_GUIDE.md

# 🚜 Agricultural Photo Keyword Training Guide

## Overview

This guide explains how to train a custom agricultural keyword generation model on your dataset of 30,000 tagged photos.

## 📋 Prerequisites

### 1. Hardware Requirements
- **GPU**: NVIDIA GPU with 8GB+ VRAM (recommended)
- **RAM**: 16GB+ system RAM
- **Storage**: 50GB+ free space for model and data

### 2. Software Requirements
```bash
# Install additional training dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate
pip install scikit-learn tqdm
```
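
Once the packages above are installed, the hardware requirements can be checked before a long run. The following is a minimal sketch, not part of the project: the function names `check_requirements` and `detect_vram_bytes` are hypothetical, and the thresholds are simply the 8GB VRAM and 50GB storage figures from this guide.

```python
import shutil

GB = 10 ** 9  # decimal gigabytes, matching how GPU and disk sizes are advertised


def check_requirements(free_disk_bytes, vram_bytes, min_disk_gb=50, min_vram_gb=8):
    """Return a list of warnings; an empty list means every check passed."""
    warnings = []
    if free_disk_bytes < min_disk_gb * GB:
        warnings.append(
            f"Only {free_disk_bytes / GB:.1f} GB free disk; the guide suggests {min_disk_gb}+"
        )
    if vram_bytes is None:
        warnings.append("No CUDA GPU detected; training on CPU would be very slow")
    elif vram_bytes < min_vram_gb * GB:
        warnings.append(
            f"GPU has {vram_bytes / GB:.1f} GB VRAM; the guide suggests {min_vram_gb}+"
        )
    return warnings


def detect_vram_bytes():
    """Return total memory of GPU 0 in bytes, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    free = shutil.disk_usage(".").free
    for line in check_requirements(free, detect_vram_bytes()) or ["All checks passed"]:
        print(line)
```

System RAM is not checked here because the standard library has no portable call for it.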

## 📁 Data Preparation

### 1. Organize Your 30,000 Photos
```
data/training/
├── photo_001.jpg
├── photo_002.jpg
├── ...
├── photo_30000.jpg
└── metadata.csv
```

### 2. Create Metadata CSV
Your `metadata.csv` should have this format:
```csv
filename,keywords
photo_001.jpg,"farmer, corn, field, agriculture, male, tractor"
photo_002.jpg,"dairy cow, barn, livestock, farming, rural"
photo_003.jpg,"chicken, poultry, farm, feeding, outdoor"
...
```

**Required columns:**
- `filename`: Image filename (must exist in data/training/)
- `keywords`: Comma-separated keywords for the image
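
Problems in the metadata file are cheaper to catch before training than hours into it. The sketch below checks the two required columns and confirms that every listed image exists. It is an illustration only: `validate_metadata` is a hypothetical helper, and the real training script may validate its input differently.

```python
import csv
from pathlib import Path


def validate_metadata(data_dir, metadata_name="metadata.csv"):
    """Check the CSV header and each row; return (valid_rows, problems)."""
    data_dir = Path(data_dir)
    valid, problems = [], []
    with open(data_dir / metadata_name, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = {"filename", "keywords"} - set(reader.fieldnames or [])
        if missing:
            return [], [f"missing columns: {sorted(missing)}"]
        for row in reader:
            name = (row["filename"] or "").strip()
            # Split the quoted keyword field on commas and drop empty entries.
            keywords = [k.strip() for k in (row["keywords"] or "").split(",") if k.strip()]
            if not name or not (data_dir / name).is_file():
                problems.append(f"line {reader.line_num}: image not found: {name!r}")
            elif not keywords:
                problems.append(f"line {reader.line_num}: no keywords for {name}")
            else:
                valid.append((name, keywords))
    return valid, problems


if __name__ == "__main__":
    if Path("data/training/metadata.csv").is_file():
        rows, issues = validate_metadata("data/training")
        print(f"{len(rows)} valid rows, {len(issues)} problems")
        for issue in issues[:20]:
            print(issue)
```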

## 🚀 Training Process

### Step 1: Prepare Sample Data (Testing)
```bash
# Create sample data for testing the pipeline
python3 src/train_model.py --create-sample --data-dir data/training
```

### Step 2: Train on Your 30,000 Photos
```bash
# Basic training command
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --epochs 5 \
    --batch-size 8 \
    --learning-rate 5e-5

# Advanced training with custom settings
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --output-dir models/custom_agricultural_model \
    --epochs 10 \
    --batch-size 16 \
    --learning-rate 3e-5 \
    --val-split 0.15 \
    --num-workers 8
```
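
The `--val-split` flag sets the fraction of photos held out for validation. How `src/train_model.py` performs the split is not shown in this guide, so the following is only a sketch of one common approach: a seeded shuffle followed by a cut. The function name and the seed are assumptions.

```python
import random


def split_train_val(filenames, val_split=0.2, seed=42):
    """Shuffle deterministically, then hold out a fraction for validation."""
    items = list(filenames)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * val_split)
    return items[n_val:], items[:n_val]


photos = [f"photo_{i:05d}.jpg" for i in range(1, 30_001)]
train, val = split_train_val(photos, val_split=0.15)
print(len(train), len(val))
```

With 30,000 photos and `--val-split 0.15`, this holds out 4,500 images and trains on the remaining 25,500. A fixed seed keeps the split identical across runs, which makes validation losses comparable.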

### Step 3: Monitor Training
Training logs are saved to `models/agricultural_blip/training.log` (this path assumes the default output directory; adjust it if you passed `--output-dir`):
```bash
# Monitor training progress
tail -f models/agricultural_blip/training.log
```

### Step 4: Use Trained Model
```bash
# Use your custom trained model for inference
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```

## ⚙️ Training Parameters

### Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--epochs` | 5 | Number of training epochs |
| `--batch-size` | 8 | Training batch size (reduce if you run out of GPU memory) |
| `--learning-rate` | 5e-5 | Learning rate for optimization |
| `--val-split` | 0.2 | Fraction of data for validation |
| `--num-workers` | 4 | Data loading workers |
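
For readers who want to see how such flags are typically declared, here is a hypothetical `argparse` sketch that mirrors the table. It is not the actual parser in `src/train_model.py`; only the flag names and defaults come from this guide, and the range check on `--val-split` is an added assumption.

```python
import argparse


def fraction(value):
    """Parse a float strictly between 0 and 1 (used for --val-split)."""
    number = float(value)
    if not 0.0 < number < 1.0:
        raise argparse.ArgumentTypeError("must be between 0 and 1")
    return number


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the documented training flags")
    parser.add_argument("--epochs", type=int, default=5, help="Number of training epochs")
    parser.add_argument("--batch-size", type=int, default=8, help="Training batch size")
    parser.add_argument("--learning-rate", type=float, default=5e-5, help="Learning rate")
    parser.add_argument("--val-split", type=fraction, default=0.2, help="Validation fraction")
    parser.add_argument("--num-workers", type=int, default=4, help="Data loading workers")
    return parser


args = build_parser().parse_args(["--batch-size", "4"])
print(args.batch_size, args.epochs, args.learning_rate)
```

Note that `argparse` converts dashes to underscores, so `--batch-size` is read as `args.batch_size`.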

### GPU Memory Optimization
If you encounter GPU memory issues:
```bash
# Reduce batch size
python3 src/train_model.py --batch-size 4

# Use gradient accumulation (simulates larger batch)
# This is handled automatically in the training code
```
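
The training code's own accumulation logic is not shown in this guide. To illustrate the idea, the framework-free sketch below uses a one-parameter model `y_hat = w * x` with mean squared error, and shows that averaging the gradients of several small batches reproduces the gradient of one large batch. All names here are illustrative.

```python
def mse_grad(w, batch):
    """Gradient of mean squared error with respect to w for y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Accumulate scaled micro-batch gradients, as gradient accumulation does."""
    chunks = [data[i:i + micro_batch_size] for i in range(0, len(data), micro_batch_size)]
    total = 0.0
    for chunk in chunks:
        # Dividing by the number of chunks makes the sum equal the large-batch mean.
        total += mse_grad(w, chunk) / len(chunks)
    return total


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples generated with w = 3
full = mse_grad(0.5, data)                          # one batch of 8
accum = accumulated_grad(0.5, data, micro_batch_size=4)  # two batches of 4
print(full, accum)
```

The two values agree when the data divides evenly into micro-batches. In practice this means a batch size of 4 with 2 accumulation steps behaves like a batch size of 8 for the optimizer, while only 4 images need to fit in GPU memory at once.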

## 📊 Training Monitoring

### Training Metrics
The training script tracks:
- **Training Loss**: How well the model fits the training data
- **Validation Loss**: How well the model generalizes to unseen data
- **Learning Rate**: The current value of the learning rate schedule

### Expected Training Time
- **30,000 photos**: ~6-12 hours on a modern GPU
- **Batch size 8**: ~45 minutes per epoch
- **Early stopping**: Training stops early if the model stops improving
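
These figures can be related with some quick arithmetic. The sketch below assumes the default 0.2 validation split and that only the training split is iterated each epoch. The helper names are hypothetical, and the per-epoch time is the guide's estimate rather than a measurement.

```python
import math


def batches_per_epoch(num_images, val_split, batch_size):
    """Number of training batches per epoch after holding out the validation split."""
    train_images = round(num_images * (1 - val_split))
    return math.ceil(train_images / batch_size)


def estimated_hours(epochs, minutes_per_epoch=45):
    """Total training time if every epoch takes the same number of minutes."""
    return epochs * minutes_per_epoch / 60


batches = batches_per_epoch(30_000, val_split=0.2, batch_size=8)
seconds_per_batch = 45 * 60 / batches
print(f"{batches} batches per epoch, about {seconds_per_batch:.2f} s per batch")
for epochs in (5, 10):
    print(f"{epochs} epochs: about {estimated_hours(epochs):.2f} hours")
```

At 45 minutes per epoch, the default 5 epochs come to under 4 hours and 10 epochs to 7.5 hours of pure training time. The guide does not say what else the 6-12 hour range covers, so treat it as a planning budget rather than a prediction.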

### Model Checkpoints
Models are saved to `models/agricultural_blip/`:
- `best_model/`: Best performing model (lowest validation loss)
- `final_model/`: Model after all epochs
- `checkpoint_epoch_N/`: Intermediate checkpoints
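
When resuming or inspecting a run, it helps to locate the most recent intermediate checkpoint. The sketch below assumes that `N` in `checkpoint_epoch_N/` is a plain integer epoch number. `latest_checkpoint` is a hypothetical helper, not part of the training script.

```python
import re
from pathlib import Path

CHECKPOINT_NAME = re.compile(r"checkpoint_epoch_(\d+)")


def latest_checkpoint(model_dir):
    """Return the checkpoint directory with the highest epoch number, or None."""
    found = []
    for path in Path(model_dir).iterdir():
        match = CHECKPOINT_NAME.fullmatch(path.name)
        if path.is_dir() and match:
            found.append((int(match.group(1)), path))
    if not found:
        return None
    # Compare epoch numbers numerically so that epoch 10 sorts after epoch 2.
    return max(found, key=lambda item: item[0])[1]


if __name__ == "__main__":
    root = Path("models/agricultural_blip")
    if root.is_dir():
        print(latest_checkpoint(root))
```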

## 🎯 Training Data Quality

### Keyword Quality Guidelines
For best results, ensure your 30,000 photos have:

1. **Consistent Keywords**: Use standardized terms
   - ✅ "farmer" not "farm worker" or "agricultural worker"
   - ✅ "tractor" not "farm equipment" or "machinery"

2. **Specific Agricultural Terms**:
   - ✅ "dairy farmer" vs "rancher" vs "chicken farmer"
   - ✅ "corn field" vs "wheat field" vs "soybean field"

3. **5-10 Keywords per Image**: Optimal range for training

4. **Balanced Dataset**: Include a variety of:
   - Crops (corn, wheat, soy, etc.)
   - Livestock (cattle, pigs, chickens)
   - Equipment (tractors, harvesters)
   - People (farmers, ranchers, workers)
   - Settings (fields, barns, farms)
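
The guidelines above can be partly automated. The following sketch flags keyword lists outside the 5-10 range, suggests the preferred terms from the examples above, and counts keyword frequencies as a rough balance check. The `PREFERRED` mapping and both function names are illustrative assumptions. Suggestions are only reported, not applied, because a term like "machinery" does not always mean a tractor.

```python
from collections import Counter

# Discouraged term -> preferred term, taken from the examples in this guide.
PREFERRED = {
    "farm worker": "farmer",
    "agricultural worker": "farmer",
    "farm equipment": "tractor",
    "machinery": "tractor",
}


def audit_keywords(raw, min_count=5, max_count=10):
    """Normalize one comma-separated keyword string and list any issues."""
    # Lowercase, trim, and drop duplicates while preserving the original order.
    keywords = list(dict.fromkeys(k.strip().lower() for k in raw.split(",") if k.strip()))
    issues = []
    if not min_count <= len(keywords) <= max_count:
        issues.append(f"{len(keywords)} keywords (guide suggests {min_count}-{max_count})")
    for term in keywords:
        if term in PREFERRED:
            issues.append(f"consider '{PREFERRED[term]}' instead of '{term}'")
    return keywords, issues


def keyword_frequencies(keyword_lists):
    """Count how many images use each keyword, to spot under-represented terms."""
    return Counter(term for keywords in keyword_lists for term in set(keywords))


cleaned, problems = audit_keywords("Farm Worker, barn, barn")
print(cleaned, problems)
```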

### Data Analysis
Before training, analyze your dataset:
```bash
# The training script will show data analysis
python3 src/train_model.py --data-dir data/training --metadata-file data/training/metadata.csv
```

## 🔧 Troubleshooting

### Common Issues

**1. GPU Out of Memory**
```bash
# Solution: Reduce batch size
python3 src/train_model.py --batch-size 4
```

**2. Training Too Slow**
```bash
# Solution: Increase batch size and workers (if GPU allows)
python3 src/train_model.py --batch-size 16 --num-workers 8
```

**3. Poor Model Performance**
- Check keyword quality and consistency
- Increase training epochs
- Verify image quality and variety

**4. Model Not Loading**
```bash
# Check if model path exists
ls -la models/agricultural_blip/best_model/
```

## 📈 Performance Expectations

### After Training on 30,000 Photos
- **Keyword Accuracy**: 80-90% relevant keywords
- **Agricultural Distinctions**: Improved farmer vs rancher detection
- **Domain Specificity**: Better recognition of agricultural terms
- **Processing Speed**: Same as pre-trained model (~3 seconds/image)

### Validation Metrics
- **Training Loss**: Should decrease over epochs
- **Validation Loss**: Should decrease and stabilize
- **Early Stopping**: Prevents overfitting
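
Early stopping is usually implemented with a patience counter on the validation loss. The guide does not state which criterion the training script uses, so the class below is a generic sketch. The `patience` and `min_delta` parameters are assumptions.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate([0.90, 0.70, 0.65, 0.66, 0.67, 0.60], start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In this made-up loss sequence, training stops after epoch 5 because epochs 4 and 5 both fail to beat the best loss of 0.65. The lower loss at epoch 6 is never reached, which is the usual trade-off of a short patience.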

## 🚀 Production Deployment

### Using Trained Model
```bash
# Replace pre-trained model with your custom model
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```

### Model Sharing
Your trained model can be shared by copying:
```
models/agricultural_blip/best_model/
├── config.json
├── pytorch_model.bin
├── preprocessor_config.json
├── tokenizer.json
├── tokenizer_config.json
└── training_state.pt
```
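
Before sharing a model, or when it fails to load, it can help to confirm that the directory is complete. The sketch below checks for the files listed above. `missing_model_files` is a hypothetical helper, and depending on the library version the weights may be saved under a different filename than `pytorch_model.bin`, so adjust the list to match your run.

```python
from pathlib import Path

# File names copied from the directory listing in this guide.
EXPECTED_FILES = [
    "config.json",
    "pytorch_model.bin",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "training_state.pt",
]


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("models/agricultural_blip/best_model")
    print("Missing files:", missing if missing else "none")
```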

## 📋 Training Checklist

- [ ] **Hardware**: GPU with 8GB+ VRAM available
- [ ] **Data**: 30,000 photos organized in data/training/
- [ ] **Metadata**: CSV file with filename and keywords columns
- [ ] **Dependencies**: Training packages installed
- [ ] **Storage**: 50GB+ free space
- [ ] **Time**: 6-12 hours available for training
- [ ] **Monitoring**: Training logs being tracked

## 🎯 Next Steps

1. **Prepare your 30,000 photo dataset**
2. **Create metadata.csv with keywords**
3. **Run training script**
4. **Evaluate trained model performance**
5. **Deploy for production use**

---

**Ready to train?** Start with sample data to test the pipeline, then scale to your full 30,000-photo dataset!
🎯 FINAL 5% COMPLETED - Custom Training Pipeline for 30,000 Photos 2025-07-16 20:45:50 +01:00