# 🚜 Agricultural Photo Keyword Training Guide

## Overview

This guide explains how to train a custom agricultural keyword generation model using your 30,000 tagged photos dataset.

## 📋 Prerequisites

### 1. Hardware Requirements

- **GPU**: NVIDIA GPU with 8GB+ VRAM (recommended)
- **RAM**: 16GB+ system RAM
- **Storage**: 50GB+ free space for model and data

### 2. Software Requirements

```bash
# Install additional training dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate
pip install scikit-learn tqdm
```

## 📁 Data Preparation

### 1. Organize Your 30,000 Photos

```
data/training/
├── photo_001.jpg
├── photo_002.jpg
├── ...
├── photo_30000.jpg
└── metadata.csv
```

### 2. Create Metadata CSV

Your `metadata.csv` should have this format:

```csv
filename,keywords
photo_001.jpg,"farmer, corn, field, agriculture, male, tractor"
photo_002.jpg,"dairy cow, barn, livestock, farming, rural"
photo_003.jpg,"chicken, poultry, farm, feeding, outdoor"
...
```

**Required columns:**

- `filename`: Image filename (must exist in data/training/)
- `keywords`: Comma-separated keywords for the image

## 🚀 Training Process

### Step 1: Prepare Sample Data (Testing)

```bash
# Create sample data for testing the pipeline
python3 src/train_model.py --create-sample --data-dir data/training
```

### Step 2: Train on Your 30,000 Photos

```bash
# Basic training command
python3 src/train_model.py \
  --data-dir data/training \
  --metadata-file data/training/metadata.csv \
  --epochs 5 \
  --batch-size 8 \
  --learning-rate 5e-5

# Advanced training with custom settings
python3 src/train_model.py \
  --data-dir data/training \
  --metadata-file data/training/metadata.csv \
  --output-dir models/custom_agricultural_model \
  --epochs 10 \
  --batch-size 16 \
  --learning-rate 3e-5 \
  --val-split 0.15 \
  --num-workers 8
```

### Step 3: Monitor Training

Training logs are saved to `models/agricultural_blip/training.log` (adjust the path if you set a custom `--output-dir`):

```bash
# Monitor training progress
tail -f models/agricultural_blip/training.log
```

### Step 4: Use Trained Model

```bash
# Use your custom trained model for inference
python3 src/main.py \
  --input data/raw \
  --output outputs \
  --model-path models/agricultural_blip/best_model
```

## ⚙️ Training Parameters

### Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `--epochs` | 5 | Number of training epochs |
| `--batch-size` | 8 | Training batch size (reduce if you run out of GPU memory) |
| `--learning-rate` | 5e-5 | Learning rate for optimization |
| `--val-split` | 0.2 | Fraction of data held out for validation |
| `--num-workers` | 4 | Number of data loading workers |

### GPU Memory Optimization

If you encounter GPU memory issues:

```bash
# Reduce batch size
python3 src/train_model.py --batch-size 4

# Use gradient accumulation (simulates larger batch)
# This is handled automatically in the training code
```

## 📊 Training Monitoring

### Training Metrics

The training script tracks:

- **Training Loss**: How well the model fits the training
data
- **Validation Loss**: How well the model generalizes
- **Learning Rate**: Optimization parameter schedule

### Expected Training Time

- **30,000 photos**: ~6-12 hours on a modern GPU
- **Batch size 8**: ~45 minutes per epoch
- **Early stopping**: Training stops if validation shows no improvement

### Model Checkpoints

Models are saved to `models/agricultural_blip/`:

- `best_model/`: Best performing model (lowest validation loss)
- `final_model/`: Model after all epochs
- `checkpoint_epoch_N/`: Intermediate checkpoints

## 🎯 Training Data Quality

### Keyword Quality Guidelines

For best results, ensure your 30,000 photos have:

1. **Consistent Keywords**: Use standardized terms
   - ✅ "farmer" not "farm worker" or "agricultural worker"
   - ✅ "tractor" not "farm equipment" or "machinery"
2. **Specific Agricultural Terms**:
   - ✅ "dairy farmer" vs "rancher" vs "chicken farmer"
   - ✅ "corn field" vs "wheat field" vs "soybean field"
3. **5-10 Keywords per Image**: Optimal range for training
4. **Balanced Dataset**: Include a variety of:
   - Crops (corn, wheat, soy, etc.)
   - Livestock (cattle, pigs, chickens)
   - Equipment (tractors, harvesters)
   - People (farmers, ranchers, workers)
   - Settings (fields, barns, farms)

### Data Analysis

Before training, analyze your dataset:

```bash
# The training script will show data analysis
python3 src/train_model.py --data-dir data/training --metadata-file data/training/metadata.csv
```

## 🔧 Troubleshooting

### Common Issues

**1. GPU Out of Memory**

```bash
# Solution: Reduce batch size
python3 src/train_model.py --batch-size 4
```

**2. Training Too Slow**

```bash
# Solution: Increase batch size and workers (if GPU allows)
python3 src/train_model.py --batch-size 16 --num-workers 8
```

**3. Poor Model Performance**

- Check keyword quality and consistency
- Increase training epochs
- Verify image quality and variety

**4.
Model Not Loading**

```bash
# Check if model path exists
ls -la models/agricultural_blip/best_model/
```

## 📈 Performance Expectations

### After Training on 30,000 Photos

- **Keyword Accuracy**: 80-90% relevant keywords
- **Agricultural Distinctions**: Improved farmer vs rancher detection
- **Domain Specificity**: Better recognition of agricultural terms
- **Processing Speed**: Same as pre-trained model (~3 seconds/image)

### Validation Metrics

- **Training Loss**: Should decrease over epochs
- **Validation Loss**: Should decrease and stabilize
- **Early Stopping**: Prevents overfitting

## 🚀 Production Deployment

### Using Trained Model

```bash
# Replace pre-trained model with your custom model
python3 src/main.py \
  --input data/raw \
  --output outputs \
  --model-path models/agricultural_blip/best_model
```

### Model Sharing

Your trained model can be shared by copying:

```
models/agricultural_blip/best_model/
├── config.json
├── pytorch_model.bin
├── preprocessor_config.json
├── tokenizer.json
├── tokenizer_config.json
└── training_state.pt
```

## 📋 Training Checklist

- [ ] **Hardware**: GPU with 8GB+ VRAM available
- [ ] **Data**: 30,000 photos organized in data/training/
- [ ] **Metadata**: CSV file with filename and keywords columns
- [ ] **Dependencies**: Training packages installed
- [ ] **Storage**: 50GB+ free space
- [ ] **Time**: 6-12 hours available for training
- [ ] **Monitoring**: Training logs being tracked

## 🎯 Next Steps

1. **Prepare your 30,000 photo dataset**
2. **Create metadata.csv with keywords**
3. **Run training script**
4. **Evaluate trained model performance**
5. **Deploy for production use**

---

**Ready to train?** Start with sample data to test the pipeline, then scale to your full 30,000 photo dataset!
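
Before committing to a multi-hour run, it can help to check `metadata.csv` against the requirements in this guide: both columns present, every listed image present in `data/training/`, and 5-10 keywords per image. The following is a minimal sketch using only the Python standard library. It is not part of `src/train_model.py`; the function names `parse_keywords` and `check_metadata` are hypothetical, and lowercasing keywords is an assumption about how you may want to normalize them.

```python
import csv
from collections import Counter
from pathlib import Path


def parse_keywords(raw):
    """Split a comma-separated keyword string into clean, lowercase terms."""
    return [kw.strip().lower() for kw in raw.split(",") if kw.strip()]


def check_metadata(metadata_file, image_dir, min_keywords=5, max_keywords=10):
    """Return (problems, keyword_counts) for a CSV with filename,keywords columns.

    problems is a list of human-readable strings; keyword_counts is a Counter
    that shows how often each keyword appears across the dataset.
    """
    image_dir = Path(image_dir)
    problems = []
    counts = Counter()
    with open(metadata_file, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = {"filename", "keywords"} - set(reader.fieldnames or [])
        if missing:
            return [f"missing required columns: {sorted(missing)}"], counts
        # Row 1 is the header, so data rows start at 2.
        for row_number, row in enumerate(reader, start=2):
            filename = (row["filename"] or "").strip()
            keywords = parse_keywords(row["keywords"] or "")
            if not (image_dir / filename).is_file():
                problems.append(f"row {row_number}: image not found: {filename}")
            if not min_keywords <= len(keywords) <= max_keywords:
                problems.append(
                    f"row {row_number}: {filename} has {len(keywords)} keywords"
                )
            counts.update(keywords)
    return problems, counts
```

Calling `check_metadata("data/training/metadata.csv", "data/training")` returns the list of problems plus a keyword counter. Inspecting `counts.most_common(20)` is a quick way to spot near-duplicate terms (for example "farmer" and "farm worker") and to see whether crops, livestock, equipment, people, and settings are all reasonably represented.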