# DS Task Recycling Project - Memory Module Detection

This project is a complete implementation of a Flask API that processes motherboard images and detects memory modules using YOLOv8. The API returns annotated images with bounding boxes drawn around each detected memory module.

## 🚀 Quick Start

### 1. Install Dependencies
```bash
pip install -r requirements.txt
```

### 2. Train the Model
```bash
python3 train.py --epochs 100 --batch 16
```

### 3. Start the API
```bash
python3 main.py
```

### 4. Test the API
```bash
# Option 1: Use the Web Interface (Recommended for QA)
# Open browser and go to: http://localhost:5000

# Option 2: Use command line
# Test with hardcoded image
curl http://localhost:5000/detect/hardcoded

# Upload an image
curl -X POST -F "image=@your_image.png" http://localhost:5000/detect

# Option 3: Run automated tests
python3 test_api.py
```

## 📋 Project Overview

- **Algorithm Used:** YOLOv8 Nano (ultralytics)
- **Input Types:**
  - Image upload via Flask API
  - Base64 encoded images
  - Hardcoded test image
- **Dataset:** 40 images (20 with memory modules, 20 without)
- **Output:** Annotated images with bounding boxes and confidence scores

## 🏗️ Project Structure

```
ds_task_recycling_project/
├── main.py                 # Flask API application
├── train.py               # YOLOv8 training script
├── inference_utils.py     # Detection and visualization utilities
├── prepare_dataset.py     # Dataset preparation script
├── test_api.py            # API testing script
├── setup.py               # Automated setup script
├── requirements.txt       # Python dependencies
├── dataset.yaml          # YOLO dataset configuration
├── templates/             # Frontend templates
│   └── index.html        # QA testing web interface
├── static/               # Frontend assets
│   ├── style.css         # Styling for web interface
│   └── script.js         # JavaScript for web interface
├── training/             # Dataset directory
│   ├── memory/          # Images with memory modules + labels
│   ├── no_memory/       # Images without memory modules
│   ├── train/           # Training split (80%)
│   └── val/             # Validation split (20%)
└── runs/                # Training outputs (created after training)
    └── detect/
        └── memory_module_detection/
            └── weights/
                ├── best.pt    # Best model weights
                └── last.pt    # Last epoch weights
```

## 🤖 Algorithm Choice & Technical Decisions

### 1. **Algorithm Choice: YOLOv8 Nano**

**Why YOLOv8?**
- **Strong performance:** A recent, widely adopted release in the YOLO family
- **Real-time inference:** Fast detection suitable for API deployment
- **Pre-trained weights:** Transfer learning from COCO dataset
- **Easy integration:** Excellent Python API via ultralytics
- **Small model size:** Nano version balances accuracy and speed

**Advantages:**
- Single-stage detector (faster than R-CNN family)
- Detects at multiple scales, which helps with smaller objects such as memory modules
- Built-in data augmentation and training optimizations
- Active community and regular updates

### 2. **Hardware Considerations**

**CPU vs GPU Impact:**

**Training:**
- **GPU Recommended:** Training on 40 images takes ~5-10 minutes on GPU vs 30-60 minutes on CPU
- **Memory Requirements:** 4GB+ GPU memory recommended
- **Fallback:** CPU training works but is significantly slower

**Inference:**
- **CPU Sufficient:** Real-time inference possible on modern CPUs
- **GPU Advantage:** Batch processing and video streams benefit from GPU
- **Edge Deployment:** Model can run on CPU-only edge devices

**Implementation:**
```python
# Auto-detection in train.py: use the GPU when CUDA is available
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
```

### 3. **Video Input Approach**

**For video processing, the approach would be:**

1. **Frame Extraction:** Extract frames at regular intervals
2. **Batch Processing:** Process multiple frames simultaneously on GPU
3. **Temporal Consistency:** Apply tracking algorithms (DeepSORT, ByteTrack)
4. **Optimization:** Skip frames with no changes, use optical flow
5. **Output:** Annotated video with consistent object IDs

**Implementation Strategy:**
```python
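# Sketch for video processing. `detector`, `DeepSORT`, and
# `draw_tracked_objects` are placeholders, not part of this project yet.
import cv2

def process_video(video_path):
    cap = cv2.VideoCapture(video_path)
    tracker = DeepSORT()
    try:
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:  # end of stream or read failure
                break
            detections = detector.detect_from_array(frame)
            tracked_objects = tracker.update(detections)
            yield draw_tracked_objects(frame, tracked_objects)
    finally:
        cap.release()  # always free the capture handle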
```
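
Step 1 above (extracting frames at regular intervals) can be sketched without any video library. The helper below is hypothetical and not part of the project; it assumes the frame count and frame rate are already known (for example from OpenCV's `cv2.CAP_PROP_FRAME_COUNT` and `cv2.CAP_PROP_FPS`) and returns the indices of the frames to run detection on.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> List[int]:
    """Return indices of frames spaced roughly `every_seconds` apart."""
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    # Number of frames between two samples; never less than one frame.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))
```

For a 100-frame clip at 25 fps sampled once per second, this selects frames 0, 25, 50, and 75. Sampling fewer frames lowers cost but gives a tracker less temporal information to work with.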

## 🔧 Installation & Setup

### Prerequisites
- Python 3.8+
- pip or conda

### Step-by-Step Installation

1. **Clone/Download the project**
```bash
cd ds_task_recycling_project
```

2. **Install dependencies**
```bash
pip install -r requirements.txt
```

3. **Prepare dataset (if not already done)**
```bash
python3 prepare_dataset.py
```

4. **Train the model**
```bash
# Basic training (recommended)
python3 train.py

# Custom training parameters
python3 train.py --epochs 150 --batch 8 --device cuda
```

5. **Start the Flask API**
```bash
python3 main.py
```

The API will be available at `http://localhost:5000`

## 🌐 Web Interface for QA Testing

We've included a comprehensive web interface for easy QA testing:

### Features:
- **Drag & Drop Image Upload** - Easy image selection
- **Real-time API Status** - Shows if API and model are loaded
- **Multiple Test Options:**
  - Test hardcoded image
  - Upload custom images
  - Run comprehensive API tests
- **Interactive Results** - View annotated images with detection details
- **Confidence Threshold Control** - Adjust detection sensitivity
- **Responsive Design** - Works on desktop and mobile

### Access:
1. Start the API: `python3 main.py`
2. Open browser: `http://localhost:5000`
3. Use the interface to test detection functionality

### QA Testing Workflow:
1. **Check API Status** - Verify green "API Online" indicator
2. **Test Hardcoded Image** - Click "Test Hardcoded Image" button
3. **Upload Custom Images** - Drag/drop or select motherboard images
4. **Adjust Confidence** - Use slider to test different thresholds
5. **Run All Tests** - Comprehensive API endpoint testing
6. **Review Results** - Check detection accuracy and annotations

## 📡 API Documentation

### Base URL
```
http://localhost:5000
```

### Endpoints

#### 1. **GET /** - API Information
```bash
curl http://localhost:5000/
```

**Response:**
```json
{
  "message": "Memory Module Detection API",
  "version": "1.0.0",
  "endpoints": {...},
  "model_loaded": true,
  "supported_formats": ["png", "jpg", "jpeg", "gif", "bmp"]
}
```
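
The `supported_formats` list suggests an extension check on upload. A minimal sketch of such a check, assuming the API compares the lowercase file extension against that list (`is_supported_image` is a hypothetical name, and `main.py` may do this differently):

```python
# Extensions copied from the "supported_formats" field in the response above.
SUPPORTED_FORMATS = {"png", "jpg", "jpeg", "gif", "bmp"}


def is_supported_image(filename: str) -> bool:
    """Return True if the filename ends in one of the supported extensions."""
    if "." not in filename:
        return False
    # rsplit keeps only the text after the last dot, so "a.b.png" -> "png".
    extension = filename.rsplit(".", 1)[1].lower()
    return extension in SUPPORTED_FORMATS
```

An extension check only filters obvious mistakes; it does not prove the file contents are a valid image.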

#### 2. **GET /health** - Health Check
```bash
curl http://localhost:5000/health
```

#### 3. **POST /detect** - Upload Image Detection
```bash
curl -X POST -F "image=@motherboard.png" -F "confidence=0.5" http://localhost:5000/detect
```

**Response:**
```json
{
  "success": true,
  "detections": [
    {
      "bbox": [100, 150, 200, 250],
      "confidence": 0.85,
      "class": 0,
      "class_name": "memory_module"
    }
  ],
  "num_detections": 1,
  "annotated_image": "base64_encoded_image...",
  "confidence_threshold": 0.5
}
```
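
Clients usually want the annotated image as bytes they can save or display. The sketch below assumes the `annotated_image` field is a plain base64 string of the encoded image (no `data:` URI prefix), which this README does not confirm; `decode_annotated_image` is a hypothetical helper.

```python
import base64


def decode_annotated_image(result: dict) -> bytes:
    """Decode the base64 "annotated_image" field of a /detect response."""
    if not result.get("success"):
        raise ValueError("detection request did not succeed")
    # b64decode turns the ASCII payload back into the raw image bytes.
    return base64.b64decode(result["annotated_image"])
```

The returned bytes can be written to disk, for example with `pathlib.Path("annotated.png").write_bytes(...)`.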

#### 4. **GET /detect/hardcoded** - Test with Hardcoded Image
```bash
curl "http://localhost:5000/detect/hardcoded?confidence=0.5"
```

#### 5. **POST /detect/base64** - Base64 Image Detection
```bash
curl -X POST -H "Content-Type: application/json" \
  -d '{"image": "base64_string", "confidence": 0.5}' \
  http://localhost:5000/detect/base64
```
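
The request body for this endpoint can be built with the standard library. A sketch, assuming the endpoint expects the raw file bytes base64-encoded as an ASCII string under the `image` key, as the curl example suggests (`build_base64_payload` is a hypothetical helper):

```python
import base64
import json


def build_base64_payload(image_bytes: bytes, confidence: float = 0.5) -> str:
    """Return a JSON body suitable for POST /detect/base64."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # JSON cannot carry raw bytes, so encode them as base64 text first.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded, "confidence": confidence})
```

The resulting string can be sent with `requests.post(url, data=body, headers={"Content-Type": "application/json"})`.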

## 🧪 Testing & Usage Examples

### 1. **Test with Python requests**
```python
import requests

# Test hardcoded image
response = requests.get('http://localhost:5000/detect/hardcoded')
result = response.json()
print(f"Found {result['num_detections']} memory modules")

# Upload image
with open('test_image.png', 'rb') as f:
    files = {'image': f}
    response = requests.post('http://localhost:5000/detect', files=files)
    result = response.json()
```

### 2. **Test with curl**
```bash
# Basic detection
curl -X POST -F "image=@training/memory/out1.png" http://localhost:5000/detect

# With custom confidence
curl -X POST -F "image=@training/memory/out1.png" -F "confidence=0.3" http://localhost:5000/detect
```

### 3. **Command Line Inference**
```bash
# Test single image
python3 inference_utils.py --image training/memory/out1.png --conf 0.5

# Validate trained model
python3 train.py --validate --model runs/detect/memory_module_detection/weights/best.pt
```

## 📊 Training Details

### Dataset Statistics
- **Total Images:** 40 (20 with memory, 20 without)
- **Training Split:** 32 images (80%)
- **Validation Split:** 8 images (20%)
- **Classes:** 1 (memory_module)
- **Annotation Format:** YOLO (normalized coordinates)
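
In the YOLO annotation format, each label line reads `class x_center y_center width height`, with the four box values divided by the image dimensions so they fall between 0 and 1. A sketch of the conversion from pixel corner coordinates (the function names are illustrative and not taken from `prepare_dataset.py`):

```python
from typing import Tuple


def to_yolo_bbox(x_min: float, y_min: float, x_max: float, y_max: float,
                 image_width: int, image_height: int) -> Tuple[float, float, float, float]:
    """Convert pixel corners to normalized (x_center, y_center, width, height)."""
    x_center = (x_min + x_max) / 2.0 / image_width
    y_center = (y_min + y_max) / 2.0 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return x_center, y_center, width, height


def format_label_line(class_id: int, bbox: Tuple[float, float, float, float]) -> str:
    """Format one line of a YOLO label file."""
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in bbox)
```

Because the values are normalized, the same label file stays valid if the image is resized.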

### Training Configuration
```python
# Default training parameters
epochs = 100
batch_size = 16
image_size = 640
# Thresholds for filtering detections (confidence) and for NMS (IoU)
confidence_threshold = 0.5
iou_threshold = 0.45
```

### Expected Training Time
- **GPU (RTX 3060+):** 5-10 minutes
- **CPU (Modern):** 30-60 minutes
- **Memory Usage:** 2-4GB RAM

### Model Performance
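After training, you should see roughly the following on the validation split (with only 8 validation images, expect these numbers to vary between runs):
- **mAP50:** >0.8 (mean average precision at an IoU threshold of 0.5)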
- **Precision:** >0.85
- **Recall:** >0.80
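
These metrics rest on intersection over union (IoU): for mAP50, a prediction counts as a true positive when its IoU with a ground-truth box is at least 0.5. The sketch below computes IoU for boxes given as `[x_min, y_min, x_max, y_max]` (an assumed layout) and derives precision and recall from raw counts. It illustrates the definitions only; ultralytics computes the reported values itself.

```python
from typing import Sequence, Tuple


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Two 10x10 boxes offset by half their width overlap in 50 of 150 total pixels, giving an IoU of 1/3, which would not count as a match at the 0.5 threshold.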

## 🐛 Troubleshooting

### Common Issues

#### 1. **Model Not Found Error**
```
Error: Model not found at runs/detect/memory_module_detection/weights/best.pt
```
**Solution:** Train the model first
```bash
python3 train.py
```

#### 2. **CUDA Out of Memory**
```
RuntimeError: CUDA out of memory
```
**Solutions:**
- Reduce batch size: `python3 train.py --batch 8`
- Use CPU: `python3 train.py --device cpu`
- Close other GPU applications

#### 3. **Import Error: ultralytics**
```
ModuleNotFoundError: No module named 'ultralytics'
```
**Solution:**
```bash
pip install ultralytics
```

#### 4. **Flask Port Already in Use**
```
OSError: [Errno 48] Address already in use
```
**Solution:**
```bash
# Kill process using port 5000
lsof -ti:5000 | xargs kill -9

# Or use a different port: change the port in main.py, then restart
python3 main.py
```
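
If editing `main.py` each time is inconvenient, one option is to check the port before starting and read an override from the environment. A sketch using only the standard library; the `PORT` variable name and both helpers are assumptions, not something `main.py` is known to support:

```python
import os
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))  # fails with OSError if the port is taken
        except OSError:
            return False
    return True


def choose_port(default: int = 5000) -> int:
    """Read the port from the (hypothetical) PORT environment variable."""
    return int(os.environ.get("PORT", default))
```

The chosen value could then be passed to Flask with `app.run(port=choose_port())`.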

#### 5. **Low Detection Accuracy**
**Solutions:**
- Increase training epochs: `python3 train.py --epochs 200`
- Lower confidence threshold: `confidence=0.3`
- Check image quality and lighting
- Verify annotations are correct
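
Lowering the threshold trades precision for recall: more boxes survive the filter, including weaker ones. A sketch of that filtering step, using the detection dictionaries shown in the `/detect` response (the helper is illustrative; in practice the threshold is applied inside the model call):

```python
from typing import Dict, List


def filter_by_confidence(detections: List[Dict], threshold: float) -> List[Dict]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["confidence"] >= threshold]
```

Comparing the output at 0.5 and 0.3 on a few known images shows whether missed modules are being detected with low confidence or not detected at all.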

### Performance Optimization

#### For Better Accuracy:
1. **More Training Data:** Add more annotated images
2. **Data Augmentation:** Already included in YOLOv8
3. **Hyperparameter Tuning:** Adjust learning rate, batch size
4. **Model Size:** Use YOLOv8s or YOLOv8m for better accuracy

#### For Faster Inference:
1. **Model Export and Quantization:** Export to ONNX or TensorRT, optionally with reduced precision (FP16/INT8)
2. **Batch Processing:** Process multiple images together
3. **Image Resizing:** Use smaller input size (320x320)
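
A smaller input size reduces the work per image, at some cost in detail. YOLO-style preprocessing typically scales the image to fit the target square while preserving aspect ratio, then pads the remainder (letterboxing). A sketch of that arithmetic, meant as an illustration of the idea rather than the exact ultralytics implementation:

```python
from typing import Tuple


def letterbox_dims(width: int, height: int, target: int = 320) -> Tuple[int, int, int, int]:
    """Return (new_width, new_height, pad_x, pad_y) for a square target size."""
    # Use the smaller ratio so the whole image fits inside the target square.
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    # Total padding needed on each axis to reach the target size.
    return new_width, new_height, target - new_width, target - new_height
```

With ultralytics, the input size is set through the `imgsz` argument at prediction time.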

## 📁 File Descriptions

- **`main.py`** - Flask API with all endpoints
- **`train.py`** - YOLOv8 training script with validation
- **`inference_utils.py`** - Detection utilities and visualization
- **`prepare_dataset.py`** - Dataset preparation and splitting
- **`test_api.py`** - API testing script
- **`setup.py`** - Automated setup script
- **`requirements.txt`** - Python dependencies
- **`dataset.yaml`** - YOLO dataset configuration

## 🔮 Future Enhancements

1. **Video Processing:** Add video upload and processing endpoints
2. **Model Ensemble:** Combine multiple models for better accuracy
3. **Real-time Streaming:** WebSocket support for live camera feeds
4. **Database Integration:** Store detection results and statistics
5. **Web Interface:** Extend the existing QA frontend (see "Web Interface for QA Testing")
6. **Docker Deployment:** Containerized deployment
7. **Model Versioning:** Support multiple model versions
8. **Batch Processing:** Process multiple images simultaneously

## 📄 License

This project is for educational and training purposes.

## 🤝 Contributing

This is a toy project for training purposes. Feel free to experiment and improve!