# 🚜 Agricultural Photo Keyword Training Guide

## Overview

This guide explains how to train a custom agricultural keyword generation model on your dataset of 30,000 tagged photos.

## 📋 Prerequisites

### 1. Hardware Requirements
- **GPU**: NVIDIA GPU with 8GB+ VRAM (recommended)
- **RAM**: 16GB+ system RAM
- **Storage**: 50GB+ free space for model and data

### 2. Software Requirements
```bash
# Install additional training dependencies
# (the cu118 index serves PyTorch builds for CUDA 11.8; use the index that matches your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate
pip install scikit-learn tqdm
```
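
Before starting a long run, it can help to confirm that the packages above are importable and that the disk has the recommended free space. The following is a minimal sketch using only the standard library; the helper names `missing_packages` and `free_space_gb` are hypothetical and not part of this project.

```python
import importlib.util
import shutil

# Import names for the packages installed by the pip commands above
# (scikit-learn is imported as "sklearn").
REQUIRED_PACKAGES = ["torch", "torchvision", "torchaudio", "transformers",
                     "datasets", "accelerate", "sklearn", "tqdm"]
MIN_FREE_GB = 50  # storage guideline from the hardware list


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def free_space_gb(path="."):
    """Return the free disk space at `path` in GiB."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    print("Missing packages:", ", ".join(missing) or "none")
    free = free_space_gb(".")
    status = "OK" if free >= MIN_FREE_GB else "below the 50GB guideline"
    print(f"Free disk space: {free:.1f} GiB ({status})")
```

This does not check GPU memory; that would require importing `torch` itself.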

## 📁 Data Preparation

### 1. Organize Your 30,000 Photos
```
data/training/
├── photo_001.jpg
├── photo_002.jpg
├── ...
├── photo_30000.jpg
└── metadata.csv
```

### 2. Create Metadata CSV
Your `metadata.csv` should have this format:
```csv
filename,keywords
photo_001.jpg,"farmer, corn, field, agriculture, male, tractor"
photo_002.jpg,"dairy cow, barn, livestock, farming, rural"
photo_003.jpg,"chicken, poultry, farm, feeding, outdoor"
...
```

**Required columns:**
- `filename`: Image filename (must exist in `data/training/`)
- `keywords`: Comma-separated keywords for the image (quote the field, since it contains commas)
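
A quick check of the metadata file before training can catch missing images and empty keyword fields early. This is a minimal sketch using the standard `csv` module, not the validation that `src/train_model.py` performs; the function name `validate_metadata` is hypothetical.

```python
import csv
from pathlib import Path


def validate_metadata(data_dir, metadata_name="metadata.csv"):
    """Return a list of human-readable problems found in the metadata file."""
    data_dir = Path(data_dir)
    problems = []
    with open(data_dir / metadata_name, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing_cols = {"filename", "keywords"} - set(reader.fieldnames or [])
        if missing_cols:
            return [f"missing columns: {sorted(missing_cols)}"]
        # Row 1 is the header, so data rows start at 2.
        for row_no, row in enumerate(reader, start=2):
            filename = (row["filename"] or "").strip()
            keywords = [k.strip() for k in (row["keywords"] or "").split(",")
                        if k.strip()]
            if not (data_dir / filename).is_file():
                problems.append(f"row {row_no}: image not found: {filename}")
            if not keywords:
                problems.append(f"row {row_no}: no keywords for {filename}")
    return problems
```

Calling `validate_metadata("data/training")` returns an empty list when every row points at an existing image and has at least one keyword.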

## 🚀 Training Process

### Step 1: Prepare Sample Data (Testing)
```bash
# Create sample data for testing the pipeline
python3 src/train_model.py --create-sample --data-dir data/training
```

### Step 2: Train on Your 30,000 Photos
```bash
# Basic training command
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --epochs 5 \
    --batch-size 8 \
    --learning-rate 5e-5

# Advanced training with custom settings
python3 src/train_model.py \
    --data-dir data/training \
    --metadata-file data/training/metadata.csv \
    --output-dir models/custom_agricultural_model \
    --epochs 10 \
    --batch-size 16 \
    --learning-rate 3e-5 \
    --val-split 0.15 \
    --num-workers 8
```

### Step 3: Monitor Training
Training logs are saved to `models/agricultural_blip/training.log` (the location may differ if you set `--output-dir`):
```bash
# Monitor training progress
tail -f models/agricultural_blip/training.log
```

### Step 4: Use Trained Model
```bash
# Use your custom trained model for inference
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```

## ⚙️ Training Parameters

### Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--epochs` | 5 | Number of training epochs |
| `--batch-size` | 8 | Training batch size (reduce if you run out of GPU memory) |
| `--learning-rate` | 5e-5 | Learning rate for optimization |
| `--val-split` | 0.2 | Fraction of data held out for validation |
| `--num-workers` | 4 | Number of data loading workers |
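
To make `--val-split` concrete: with the default of 0.2, a 30,000-photo dataset divides into 24,000 training and 6,000 validation images. How `src/train_model.py` performs the split is not shown in this guide; the sketch below assumes one common approach, a seeded shuffle, and the helper `split_rows` is hypothetical.

```python
import random


def split_rows(rows, val_split=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, val) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_val = round(len(shuffled) * val_split)
    return shuffled[n_val:], shuffled[:n_val]


photo_ids = list(range(30_000))  # stand-ins for the rows of metadata.csv
train, val = split_rows(photo_ids, val_split=0.2)
print(len(train), len(val))  # 24000 6000
```

Fixing the seed keeps the validation set identical across runs, which makes validation losses comparable between experiments.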

### GPU Memory Optimization
If you run out of GPU memory:
```bash
# Reduce batch size
python3 src/train_model.py --batch-size 4

# Gradient accumulation (simulates a larger batch)
# is handled automatically in the training code
```
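
Gradient accumulation works because, for a loss averaged over the batch, the average of the gradients from equally sized micro-batches equals the gradient of the full batch. The following is a toy illustration in plain Python with a one-parameter model, not the project's training code; all names and data are made up for the example.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 0.5

# Gradient from one full batch of 8 samples.
full = grad_mse(w, xs, ys)

# Same gradient from two micro-batches of 4: each micro-batch gradient is
# scaled by 1/steps and summed before the (single) weight update.
micro = 4
steps = len(xs) // micro
accumulated = 0.0
for i in range(steps):
    lo, hi = i * micro, (i + 1) * micro
    accumulated += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps

print(abs(full - accumulated) < 1e-9)  # True
```

Only one micro-batch is held in GPU memory at a time, which is why the technique helps when the desired batch size does not fit.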

## 📊 Training Monitoring

### Training Metrics
The training script tracks:
- **Training Loss**: How well the model fits the training data
- **Validation Loss**: How well the model generalizes to held-out data
- **Learning Rate**: Current value of the learning rate schedule

### Expected Training Time
- **30,000 photos**: ~6-12 hours on a modern GPU
- **Batch size 8**: ~45 minutes per epoch
- **Early stopping**: Training stops early if the model stops improving
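
The figures above can be turned into a rough per-step budget. This is back-of-the-envelope arithmetic using the default `--val-split` and `--batch-size`; it assumes the 45-minute figure covers only the training pass, which this guide does not state.

```python
import math

num_photos = 30_000
val_split = 0.2          # default --val-split
batch_size = 8           # default --batch-size
minutes_per_epoch = 45   # estimate quoted above for batch size 8

train_photos = round(num_photos * (1 - val_split))
steps_per_epoch = math.ceil(train_photos / batch_size)
seconds_per_step = minutes_per_epoch * 60 / steps_per_epoch

print(train_photos)               # 24000
print(steps_per_epoch)            # 3000
print(f"{seconds_per_step:.2f}")  # 0.90

# Total time for the two epoch counts used in the commands in this guide.
for epochs in (5, 10):
    print(epochs, "epochs:", epochs * minutes_per_epoch / 60, "hours")
```

At that rate, 5 epochs take about 3.75 hours and 10 epochs about 7.5 hours, so the 6-12 hour estimate presumably allows for more epochs, validation passes, or slower hardware.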

### Model Checkpoints
Models are saved to `models/agricultural_blip/`:
- `best_model/`: Best performing model (lowest validation loss)
- `final_model/`: Model after all epochs
- `checkpoint_epoch_N/`: Intermediate checkpoints

## 🎯 Training Data Quality

### Keyword Quality Guidelines
For best results, ensure your 30,000 photos have:

1. **Consistent Keywords**: Use standardized terms
   - ✅ "farmer" not "farm worker" or "agricultural worker"
   - ✅ "tractor" not "farm equipment" or "machinery"

2. **Specific Agricultural Terms**:
   - ✅ "dairy farmer" vs "rancher" vs "chicken farmer"
   - ✅ "corn field" vs "wheat field" vs "soybean field"

3. **5-10 Keywords per Image**: Optimal range for training

4. **Balanced Dataset**: Include a variety of:
   - Crops (corn, wheat, soy, etc.)
   - Livestock (cattle, pigs, chickens)
   - Equipment (tractors, harvesters)
   - People (farmers, ranchers, workers)
   - Settings (fields, barns, farms)
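
The consistency and keyword-count guidelines can be partly automated. The sketch below is an illustration only: `SYNONYMS`, `normalize_keywords`, and `keyword_count_ok` are hypothetical names, and the synonym map covers just the "farmer" example from the list above, since mapping equipment terms to "tractor" needs a manual look at each photo.

```python
# Hypothetical synonym map built from the "farmer" example above;
# extend it with your own standardized vocabulary.
SYNONYMS = {
    "farm worker": "farmer",
    "agricultural worker": "farmer",
}


def normalize_keywords(raw, synonyms=SYNONYMS):
    """Lowercase, trim, map synonyms, and drop duplicates (order preserved)."""
    seen, result = set(), []
    for kw in raw.split(","):
        kw = " ".join(kw.lower().split())  # collapse inner whitespace
        kw = synonyms.get(kw, kw)
        if kw and kw not in seen:
            seen.add(kw)
            result.append(kw)
    return result


def keyword_count_ok(keywords, low=5, high=10):
    """Check the 5-10 keywords-per-image guideline."""
    return low <= len(keywords) <= high


kws = normalize_keywords("Farm Worker, corn,  field, farmer, tractor, agriculture")
print(kws)                    # ['farmer', 'corn', 'field', 'tractor', 'agriculture']
print(keyword_count_ok(kws))  # True
```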

### Data Analysis
Before training, analyze your dataset:
```bash
# The training script will show data analysis
python3 src/train_model.py --data-dir data/training --metadata-file data/training/metadata.csv
```
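
For a look at the keyword distribution without launching the training script, a few lines of standard-library Python are enough. This is a sketch, not the analysis that `src/train_model.py` prints; the function `keyword_stats` is hypothetical, and the sample rows are the ones from the metadata example in this guide.

```python
import csv
from collections import Counter


def keyword_stats(lines):
    """Return (keyword frequencies, keywords-per-image counts) from CSV lines."""
    freq, per_image = Counter(), []
    for row in csv.DictReader(lines):
        kws = [k.strip().lower()
               for k in (row.get("keywords") or "").split(",") if k.strip()]
        freq.update(kws)
        per_image.append(len(kws))
    return freq, per_image


sample = [
    "filename,keywords",
    'photo_001.jpg,"farmer, corn, field, agriculture, male, tractor"',
    'photo_002.jpg,"dairy cow, barn, livestock, farming, rural"',
    'photo_003.jpg,"chicken, poultry, farm, feeding, outdoor"',
]
freq, per_image = keyword_stats(sample)
print(per_image)         # [6, 5, 5]
print(freq["farmer"])    # 1
print(len(freq))         # 16
```

For the real dataset, pass an open file instead of the list, for example `open("data/training/metadata.csv", newline="", encoding="utf-8")`. Keywords that occur only a handful of times across 30,000 photos are candidates for merging into a standardized term.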

## 🔧 Troubleshooting

### Common Issues

**1. GPU Out of Memory**
```bash
# Solution: Reduce batch size
python3 src/train_model.py --batch-size 4
```

**2. Training Too Slow**
```bash
# Solution: Increase batch size and workers (if GPU allows)
python3 src/train_model.py --batch-size 16 --num-workers 8
```

**3. Poor Model Performance**
- Check keyword quality and consistency
- Increase the number of training epochs
- Verify image quality and variety

**4. Model Not Loading**
```bash
# Check if model path exists
ls -la models/agricultural_blip/best_model/
```
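
Beyond checking that the directory exists, it can be useful to see which files are missing from it. The sketch below compares the directory against the file names shown in the Model Sharing listing of this guide; whether each file is strictly required for loading is an assumption, and recent versions of `transformers` may write `model.safetensors` instead of `pytorch_model.bin`. The helper `missing_model_files` is hypothetical.

```python
from pathlib import Path

# File names copied from the Model Sharing listing in this guide.
EXPECTED_FILES = [
    "config.json",
    "pytorch_model.bin",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from `model_dir`."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return list(expected)  # nothing is there at all
    return [name for name in expected if not (model_dir / name).is_file()]


print(missing_model_files("models/agricultural_blip/best_model"))
```

An empty list means every listed file is present.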

## 📈 Performance Expectations

### After Training on 30,000 Photos
- **Keyword Accuracy**: 80-90% of generated keywords relevant
- **Agricultural Distinctions**: Improved farmer vs rancher distinction
- **Domain Specificity**: Better recognition of agricultural terms
- **Processing Speed**: Same as pre-trained model (~3 seconds/image)
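
This guide does not define how keyword accuracy is measured. One plausible reading, assumed here, is precision: the share of generated keywords that also appear in the human tags for the same image. The function `keyword_precision` below is a hypothetical sketch of that metric.

```python
def keyword_precision(predicted, reference):
    """Fraction of predicted keywords that also appear in the reference tags."""
    pred = {k.strip().lower() for k in predicted if k.strip()}
    ref = {k.strip().lower() for k in reference if k.strip()}
    return len(pred & ref) / len(pred) if pred else 0.0


predicted = ["farmer", "corn", "field", "tractor", "sky"]
reference = ["farmer", "corn", "field", "agriculture", "male", "tractor"]
print(keyword_precision(predicted, reference))  # 0.8
```

Averaging this value over the validation images gives a single number to compare against the 80-90% expectation. Exact string matching undercounts near-synonyms, so the result is a lower bound on relevance.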

### Validation Metrics
- **Training Loss**: Should decrease over epochs
- **Validation Loss**: Should decrease and stabilize
- **Early Stopping**: Prevents overfitting
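
Early stopping is usually implemented with a patience counter: training ends once the validation loss has failed to improve for a set number of consecutive epochs. The exact rule in `src/train_model.py` is not documented here; the sketch below assumes that common scheme, and `early_stop_epoch`, `patience`, and `min_delta` are hypothetical names.

```python
def early_stop_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


losses = [2.10, 1.60, 1.35, 1.30, 1.32, 1.31, 1.33]
print(early_stop_epoch(losses, patience=2))  # 6
```

In this example the best loss (1.30) occurs at epoch 4, which is the checkpoint a "lowest validation loss" rule would keep as `best_model/`.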

## 🚀 Production Deployment

### Using Trained Model
```bash
# Replace pre-trained model with your custom model
python3 src/main.py \
    --input data/raw \
    --output outputs \
    --model-path models/agricultural_blip/best_model
```

### Model Sharing
To share your trained model, copy the following directory:
```
models/agricultural_blip/best_model/
├── config.json
├── pytorch_model.bin
├── preprocessor_config.json
├── tokenizer.json
├── tokenizer_config.json
└── training_state.pt
```

## 📋 Training Checklist

- [ ] **Hardware**: GPU with 8GB+ VRAM available
- [ ] **Data**: 30,000 photos organized in `data/training/`
- [ ] **Metadata**: CSV file with `filename` and `keywords` columns
- [ ] **Dependencies**: Training packages installed
- [ ] **Storage**: 50GB+ free space
- [ ] **Time**: 6-12 hours available for training
- [ ] **Monitoring**: Training logs being tracked

## 🎯 Next Steps

1. **Prepare your 30,000 photo dataset**
2. **Create `metadata.csv` with keywords**
3. **Run the training script**
4. **Evaluate trained model performance**
5. **Deploy for production use**

---

**Ready to train?** Start with sample data to test the pipeline, then scale up to your full 30,000-photo dataset!