# 🚜 Agricultural Photo Keyword Training Guide
## Overview
This guide explains how to train a custom agricultural keyword generation model on your dataset of 30,000 tagged photos.
## 📋 Prerequisites
### 1. Hardware Requirements
- **GPU**: NVIDIA GPU with 8GB+ VRAM (recommended)
- **RAM**: 16GB+ system RAM
- **Storage**: 50GB+ free space for model and data
### 2. Software Requirements
```bash
# Install additional training dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate
pip install scikit-learn tqdm
```
## 📁 Data Preparation
### 1. Organize Your 30,000 Photos
```
data/training/
├── photo_001.jpg
├── photo_002.jpg
├── ...
├── photo_30000.jpg
└── metadata.csv
```
### 2. Create Metadata CSV
Your `metadata.csv` should have this format:
```csv
filename,keywords
photo_001.jpg,"farmer, corn, field, agriculture, male, tractor"
photo_002.jpg,"dairy cow, barn, livestock, farming, rural"
photo_003.jpg,"chicken, poultry, farm, feeding, outdoor"
...
```
**Required columns:**
- `filename`: Image filename (must exist in data/training/)
- `keywords`: Comma-separated keywords for the image, wrapped in double quotes so the commas stay inside a single CSV field
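Before a long training run, it can help to check the metadata file against the image folder. The following is a minimal sketch using only the Python standard library; `validate_metadata` is a hypothetical helper and not part of `src/train_model.py`.

```python
import csv
from pathlib import Path


def validate_metadata(data_dir, metadata_file):
    """Return a list of human-readable problems found in the metadata CSV."""
    data_dir = Path(data_dir)
    problems = []
    with open(metadata_file, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = {"filename", "keywords"} - set(reader.fieldnames or [])
        if missing:
            return [f"missing required column(s): {sorted(missing)}"]
        # Row 1 is the header, so data rows start at 2.
        for row_number, row in enumerate(reader, start=2):
            filename = (row["filename"] or "").strip()
            keywords = [k.strip() for k in (row["keywords"] or "").split(",") if k.strip()]
            if not filename:
                problems.append(f"row {row_number}: empty filename")
            elif not (data_dir / filename).is_file():
                problems.append(f"row {row_number}: {filename} not found in {data_dir}")
            if not keywords:
                problems.append(f"row {row_number}: no keywords")
    return problems
```

Calling `validate_metadata("data/training", "data/training/metadata.csv")` returns an empty list when every row has the required columns, an existing image, and at least one keyword.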
## 🚀 Training Process
### Step 1: Prepare Sample Data (Testing)
```bash
# Create sample data for testing the pipeline
python3 src/train_model.py --create-sample --data-dir data/training
```
### Step 2: Train on Your 30,000 Photos
```bash
# Basic training command
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --epochs 5 \
    --batch-size 8 \
    --learning-rate 5e-5

# Advanced training with custom settings
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --output-dir models/custom_agricultural_model \
    --epochs 10 \
    --batch-size 16 \
    --learning-rate 3e-5 \
    --val-split 0.15 \
    --num-workers 8
```
### Step 3: Monitor Training
Training logs are saved to `models/agricultural_blip/training.log` (if you set `--output-dir`, substitute that directory in the path):
```bash
# Monitor training progress
tail -f models/agricultural_blip/training.log
```
### Step 4: Use Trained Model
```bash
# Use your custom trained model for inference
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```
## ⚙️ Training Parameters
### Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--epochs` | 5 | Number of training epochs |
| `--batch-size` | 8 | Training batch size (reduce it if you run out of GPU memory) |
| `--learning-rate` | 5e-5 | Learning rate for optimization |
| `--val-split` | 0.2 | Fraction of data for validation |
| `--num-workers` | 4 | Data loading workers |
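For reference, the defaults in the table could be declared with `argparse` roughly as follows. This is an illustrative sketch, not the actual argument parser in `src/train_model.py`; the `fraction` validator and `build_parser` function are hypothetical names.

```python
import argparse


def fraction(value):
    """argparse type: accept a float strictly between 0 and 1."""
    number = float(value)
    if not 0.0 < number < 1.0:
        raise argparse.ArgumentTypeError("must be between 0 and 1 (exclusive)")
    return number


def build_parser():
    """Build a parser whose defaults mirror the parameter table above."""
    parser = argparse.ArgumentParser(description="Illustrative sketch, not the real CLI")
    parser.add_argument("--epochs", type=int, default=5, help="number of training epochs")
    parser.add_argument("--batch-size", type=int, default=8, help="training batch size")
    parser.add_argument("--learning-rate", type=float, default=5e-5, help="optimizer learning rate")
    parser.add_argument("--val-split", type=fraction, default=0.2, help="validation fraction")
    parser.add_argument("--num-workers", type=int, default=4, help="data loading workers")
    return parser
```

Validating `--val-split` at parse time means a typo such as `--val-split 15` fails immediately instead of after the dataset has been loaded.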
### GPU Memory Optimization
If you encounter GPU memory issues:
```bash
# Reduce batch size
python3 src/train_model.py --batch-size 4
# Use gradient accumulation (simulates larger batch)
# This is handled automatically in the training code
```
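To see why gradient accumulation can stand in for a larger batch, here is a toy illustration in plain Python. It is not the project's training code: it uses a one-parameter model `y = w * x` and assumes equal-sized micro-batches. Averaging the micro-batch gradients gives the same gradient as one large batch.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equal-sized micro-batches."""
    micro_batches = [
        data[i:i + micro_batch_size] for i in range(0, len(data), micro_batch_size)
    ]
    total = 0.0
    for batch in micro_batches:
        # Scale each micro-batch gradient by 1 / number of micro-batches before
        # summing, which mirrors dividing the loss by the accumulation steps.
        total += grad_mse(w, batch) / len(micro_batches)
    return total


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = grad_mse(0.5, data)              # one batch of 4 samples
accum = accumulated_grad(0.5, data, 2)  # two micro-batches of 2 samples
```

Here `full` and `accum` agree up to floating-point rounding, so only one micro-batch needs to fit in GPU memory at a time.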
## 📊 Training Monitoring
### Training Metrics
The training script tracks:
- **Training Loss**: How well the model fits the training data
- **Validation Loss**: How well the model generalizes to held-out data
- **Learning Rate**: The current value of the learning rate schedule
### Expected Training Time
- **30,000 photos**: ~6-12 hours on a modern GPU
- **Batch size 8**: ~45 minutes per epoch
- **Early stopping**: Training stops early if the model stops improving
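These figures can be turned into a rough estimate for your own settings. The sketch below assumes the validation split is removed before batching (an assumption about the training script) and takes the 45 minutes per epoch quoted above; the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_photos, val_split, batch_size):
    """Optimizer steps per epoch once the validation photos are held out."""
    num_train = num_photos - round(num_photos * val_split)
    return math.ceil(num_train / batch_size)


def total_hours(minutes_per_epoch, epochs):
    """Total wall-clock training time in hours."""
    return minutes_per_epoch * epochs / 60


steps = steps_per_epoch(30_000, 0.2, 8)  # 24,000 training photos in batches of 8
default_run = total_hours(45, 5)         # the default 5 epochs
long_run = total_hours(45, 10)           # the 10-epoch "advanced" command
```

At 45 minutes per epoch, 5 epochs take 3.75 hours and 10 epochs take 7.5 hours, so the 6-12 hour range corresponds to roughly 8-16 epochs at that rate, or to fewer epochs on a slower GPU.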
### Model Checkpoints
Models are saved to `models/agricultural_blip/`:
- `best_model/`: Best performing model (lowest validation loss)
- `final_model/`: Model after all epochs
- `checkpoint_epoch_N/`: Intermediate checkpoints
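If a run is interrupted, you may want to locate the most recent intermediate checkpoint. A minimal sketch, assuming the `checkpoint_epoch_N` naming shown above; `latest_checkpoint` is a hypothetical helper, not something the training script provides.

```python
import re
from pathlib import Path


def latest_checkpoint(model_dir):
    """Return the checkpoint_epoch_N directory with the largest N, or None."""
    pattern = re.compile(r"checkpoint_epoch_(\d+)$")
    best = None
    for path in Path(model_dir).iterdir():
        match = pattern.match(path.name)
        if path.is_dir() and match:
            # Compare epochs numerically so epoch 10 sorts after epoch 2.
            epoch = int(match.group(1))
            if best is None or epoch > best[0]:
                best = (epoch, path)
    return best[1] if best else None
```

The function raises `FileNotFoundError` if `model_dir` does not exist, which usually means training has not written any output yet.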
## 🎯 Training Data Quality
### Keyword Quality Guidelines
For best results, ensure your 30,000 photos have:
1. **Consistent Keywords**: Use standardized terms
   - ✅ "farmer" not "farm worker" or "agricultural worker"
   - ✅ "tractor" not "farm equipment" or "machinery"
2. **Specific Agricultural Terms**:
   - ✅ "dairy farmer" vs "rancher" vs "chicken farmer"
   - ✅ "corn field" vs "wheat field" vs "soybean field"
3. **5-10 Keywords per Image**: Optimal range for training
4. **Balanced Dataset**: Include a variety of:
   - Crops (corn, wheat, soy, etc.)
   - Livestock (cattle, pigs, chickens)
   - Equipment (tractors, harvesters)
   - People (farmers, ranchers, workers)
   - Settings (fields, barns, farms)
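The first and third guidelines can be partly automated. The sketch below normalizes one image's keyword string and checks the 5-10 range; the `CANONICAL` map is a hypothetical example seeded with the "farmer" synonyms above, and you would extend it with the terms used in your own dataset.

```python
# Hypothetical synonym map: variant term -> preferred term.
CANONICAL = {
    "farm worker": "farmer",
    "agricultural worker": "farmer",
}


def normalize_keywords(raw):
    """Lower-case, trim, map synonyms, and drop duplicates while keeping order."""
    result = []
    for keyword in raw.split(","):
        keyword = " ".join(keyword.lower().split())  # collapse stray whitespace
        keyword = CANONICAL.get(keyword, keyword)
        if keyword and keyword not in result:
            result.append(keyword)
    return result


def count_in_range(keywords, low=5, high=10):
    """True when the image has between `low` and `high` keywords inclusive."""
    return low <= len(keywords) <= high
```

Running every row of `metadata.csv` through `normalize_keywords` before training keeps near-duplicate terms from splitting the signal for the same concept.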
### Data Analysis
Before training, analyze your dataset:
```bash
# The training script will show data analysis
python3 src/train_model.py --data-dir data/training --metadata-file data/training/metadata.csv
```
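What the built-in analysis reports is not documented here, so as an independent check you could count keyword frequencies yourself. A minimal sketch using the standard library; `keyword_counts` is a hypothetical helper.

```python
import csv
from collections import Counter


def keyword_counts(metadata_file):
    """Count how many images carry each keyword in the metadata CSV."""
    counts = Counter()
    with open(metadata_file, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            keywords = (row["keywords"] or "").split(",")
            # Use a set so a keyword repeated within one row counts once.
            counts.update({k.strip().lower() for k in keywords if k.strip()})
    return counts
```

`keyword_counts(path).most_common(20)` lists the dominant terms, while keywords that appear only a handful of times point to classes that may be under-represented in the dataset.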
## 🔧 Troubleshooting
### Common Issues
**1. GPU Out of Memory**
```bash
# Solution: Reduce batch size
python3 src/train_model.py --batch-size 4
```
**2. Training Too Slow**
```bash
# Solution: Increase batch size and workers (if GPU allows)
python3 src/train_model.py --batch-size 16 --num-workers 8
```
**3. Poor Model Performance**
- Check keyword quality and consistency
- Increase training epochs
- Verify image quality and variety
**4. Model Not Loading**
```bash
# Check if model path exists
ls -la models/agricultural_blip/best_model/
```
## 📈 Performance Expectations
### After Training on 30,000 Photos
- **Keyword Accuracy**: Roughly 80-90% of generated keywords are relevant to the image
- **Agricultural Distinctions**: Improved farmer vs rancher detection
- **Domain Specificity**: Better recognition of agricultural terms
- **Processing Speed**: Same as the pre-trained model (~3 seconds/image)
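The guide does not define how keyword accuracy is measured. One plausible reading is precision: the fraction of generated keywords that also appear in the human-assigned keywords for the same image. The sketch below implements that interpretation; it is an assumption, not the project's evaluation code.

```python
def keyword_precision(predicted, reference):
    """Fraction of predicted keywords that appear in the reference keywords."""
    predicted = {k.strip().lower() for k in predicted if k.strip()}
    reference = {k.strip().lower() for k in reference if k.strip()}
    if not predicted:
        return 0.0
    return len(predicted & reference) / len(predicted)


score = keyword_precision(
    ["farmer", "corn", "tractor", "barn", "sky"],
    ["farmer", "corn", "field", "tractor", "barn"],
)
```

In this example four of the five generated keywords match, so `score` is 0.8. Averaging the score over the validation images gives a single number to compare against the 80-90% expectation.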
### Validation Metrics
- **Training Loss**: Should decrease over epochs
- **Validation Loss**: Should decrease and stabilize
- **Early Stopping**: Prevents overfitting
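Early stopping is commonly implemented with a patience counter on the validation loss. The following is a generic sketch of that pattern; the `patience` and `min_delta` parameters are assumptions, and the training script may use different criteria.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
decisions = [stopper.step(loss) for loss in [1.0, 0.8, 0.7, 0.72, 0.71]]
```

With these losses the stopper fires on the fifth epoch, after two epochs in a row fail to beat the best loss of 0.7.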
## 🚀 Production Deployment
### Using Trained Model
```bash
# Replace pre-trained model with your custom model
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```
### Model Sharing
Your trained model can be shared by copying:
```
models/agricultural_blip/best_model/
├── config.json
├── pytorch_model.bin
├── preprocessor_config.json
├── tokenizer.json
├── tokenizer_config.json
└── training_state.pt
```
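Before sharing, it is worth confirming that the directory is complete and then bundling it into a single archive. A minimal sketch using the file list above; `missing_model_files` and `package_model` are hypothetical helpers.

```python
import shutil
from pathlib import Path

# File names taken from the directory listing in this guide.
EXPECTED_FILES = [
    "config.json",
    "pytorch_model.bin",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "training_state.pt",
]


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


def package_model(model_dir, archive_base):
    """Create <archive_base>.zip from model_dir and return the archive path."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"model directory is incomplete, missing: {missing}")
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(model_dir))
```

Choose an `archive_base` outside the model directory so the archive does not end up inside the folder it is compressing.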
## 📋 Training Checklist
- [ ] **Hardware**: GPU with 8GB+ VRAM available
- [ ] **Data**: 30,000 photos organized in data/training/
- [ ] **Metadata**: CSV file with filename and keywords columns
- [ ] **Dependencies**: Training packages installed
- [ ] **Storage**: 50GB+ free space
- [ ] **Time**: 6-12 hours available for training
- [ ] **Monitoring**: Training logs being tracked
## 🎯 Next Steps
1. **Prepare your 30,000 photo dataset**
2. **Create metadata.csv with keywords**
3. **Run training script**
4. **Evaluate trained model performance**
5. **Deploy for production use**
---
**Ready to train?** Start with sample data to test the pipeline, then scale up to your full 30,000-photo dataset!