
# DS Task Recycling Project

This project is a Flask API that detects memory modules in images of motherboards. It runs a YOLOv8 object detector on each input image and draws a bounding box around every memory module it finds.

## Project Overview

- **Input Types:**
  - Image upload via the Flask API
  - A hardcoded test image (`memory_out19.png`) for testing purposes

- **Dataset:**
  - 20 images of motherboards with memory modules
  - 20 images of motherboards without memory modules

- **Output:**
  - An annotated image with one bounding box per detected memory module
  - For example, two boxes are drawn when two modules are detected, and one box when only one is detected

- **Annotation Tool:**
  - [makesense.ai](https://www.makesense.ai/) was used for manual annotation

## Implementation Details

### Algorithm Choice & Rationale

1. **Which algorithm was chosen?**
   - YOLOv8 (specifically YOLOv8n - the nano version) was selected for this task

2. **Why this algorithm?**
   - Fast inference, suitable for real-time applications
   - Good balance between accuracy and compute cost; the nano variant is the smallest YOLOv8 model
   - Pretrained weights can be fine-tuned (transfer learning), which helps with a dataset of only 40 images
   - Strong accuracy on standard object detection benchmarks
   - Easy integration with Python/Flask applications
   - Active community and thorough documentation

### Hardware Considerations

3. **CPU/GPU Impact:**
   - The current implementation runs on CPU so that it works on machines without a GPU
   - Training settings were chosen to keep CPU training practical:
     - Small batch size (8)
     - Lightweight augmentation
     - Early stopping with patience=15
   - YOLOv8 can also run on a GPU if the project needs to scale
   - CPU performance is sufficient for a demo project

### Video Processing Approach

4. **Handling Video Input:**
   - While not currently implemented, video processing would involve:
     - Frame extraction
     - Batch processing of frames
     - Real-time detection using YOLO's video processing capabilities
     - Optional frame skipping for performance optimization
   - The current architecture can be extended for video by:
     - Adding a video upload endpoint
     - Implementing frame-by-frame processing
     - Returning an annotated video or a real-time stream
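
The frame skipping and batching steps listed above could be sketched as follows. This is a minimal illustration, not part of the project: the helper names `sample_frames` and `batched` and the default stride of 5 are assumptions, and the frames could come from any iterable (for example, frames read with OpenCV's `cv2.VideoCapture`).

```python
from typing import Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Iterable[Frame], stride: int = 5) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for every `stride`-th frame, starting at frame 0.

    `stride` is a hypothetical tuning knob: larger values skip more frames
    and reduce the number of detector calls.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, frame


def batched(items: Iterable, batch_size: int) -> Iterator[List]:
    """Group items into lists of at most `batch_size` for batch inference."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch
```

Each batch would then be passed to the detector, and the frame indices make it possible to map detections back to positions in the video.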

## API Implementation

### Endpoints

1. **Image Upload (`/detect`):**
   ```http
   POST /detect
   Content-Type: multipart/form-data
   ```
   - Accepts image uploads
   - Returns annotated image with detection boxes

2. **Test Detection (`/detect/test`):**
   ```http
   GET /detect/test
   ```
   - Uses a hardcoded test image (`memory_out19.png`)
   - Returns annotated image with detection boxes
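
A client for the `/detect` endpoint might look like the sketch below, which uses only the Python standard library. The form field name `file` is an assumption (this README does not state which field the API expects), and the helper names `encode_multipart` and `detect` are hypothetical.

```python
import os
import urllib.request
import uuid


def encode_multipart(field_name: str, filename: str, payload: bytes,
                     content_type: str = "image/png"):
    """Build a multipart/form-data body that carries a single file part.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def detect(image_path: str, url: str = "http://localhost:5000/detect",
           field_name: str = "file") -> bytes:
    """POST an image to /detect and return the response body (annotated PNG).

    `field_name` is an assumed form field name; adjust it to match the API.
    """
    with open(image_path, "rb") as fh:
        body, content_type = encode_multipart(
            field_name, os.path.basename(image_path), fh.read()
        )
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the Flask application running, `detect("board.png")` (a hypothetical local file) would return the annotated image bytes, which can be written to disk with `open("annotated.png", "wb")`.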

### Processing Workflow

1. Image Reception:
   - Via file upload or hardcoded test image
2. Detection:
   - YOLOv8 processes the image
   - Confidence threshold: 0.25
   - IoU threshold: 0.45
3. Annotation:
   - Bounding boxes drawn around detected modules
4. Response:
   - Annotated image returned in PNG format
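
To make the two thresholds concrete, the sketch below shows how a confidence threshold and an IoU threshold are conventionally applied in non-maximum suppression. It is a plain-Python illustration only: the detection library performs this step internally, and the names `iou` and `filter_detections` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets: List[Detection], conf: float = 0.25,
                      iou_thr: float = 0.45) -> List[Detection]:
    """Drop low-confidence boxes, then suppress overlapping duplicates."""
    # Step 1: keep boxes at or above the confidence threshold, best first.
    candidates = sorted((d for d in dets if d[1] >= conf),
                        key=lambda d: d[1], reverse=True)
    # Step 2: greedy NMS. A box survives only if it does not overlap an
    # already kept, higher-scoring box by more than the IoU threshold.
    kept: List[Detection] = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

Raising the confidence threshold removes weak detections; lowering the IoU threshold merges near-duplicate boxes more aggressively, which matters when two memory modules sit side by side.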

## Model Training

The model was trained with the following parameters:
- 50 epochs
- Image size: 640x640
- Batch size: 8
- Early stopping patience: 15
- Augmentations:
  - Rotation (±5°)
  - Scale (gain 0.5)
  - Translation (fraction 0.1 of image size)
  - Horizontal flip (probability 0.5)
  - Mosaic (probability 1.0)
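
As a rough sketch, these parameters could map onto an Ultralytics training call as shown below. This assumes the `ultralytics` package and the pretrained `yolov8n.pt` weights; the actual contents of `train.py` are not shown in this README and may differ.

```python
# Training settings from the list above, expressed as Ultralytics
# `model.train()` keyword arguments (an assumed mapping).
TRAIN_ARGS = {
    "data": "dataset.yaml",  # dataset definition file
    "epochs": 50,
    "imgsz": 640,            # images are resized to 640x640
    "batch": 8,
    "patience": 15,          # early stopping patience, in epochs
    "device": "cpu",         # the project currently runs on CPU
    "degrees": 5.0,          # rotation of up to +/-5 degrees
    "scale": 0.5,
    "translate": 0.1,
    "fliplr": 0.5,           # horizontal flip
    "mosaic": 1.0,
}


def train(weights: str = "yolov8n.pt"):
    """Fine-tune YOLOv8n with the settings above (assumed entry point)."""
    # Imported lazily so the settings can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**TRAIN_ARGS)
```

Keeping the settings in one dictionary makes it easy to switch `device` for GPU training later without touching the rest of the script.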

## Dataset Preparation

```text
training/
├── memory/
│   └── (images with memory modules)      # source images, already provided
├── no_memory/
│   └── (images without memory modules)   # source images, already provided
├── train/
│   ├── images/
│   │   ├── memory_*.png
│   │   └── no_memory_*.png
│   └── labels/
│       ├── memory_*.txt
│       └── no_memory_*.txt
└── val/
    ├── images/
    │   ├── memory_*.png
    │   └── no_memory_*.png
    └── labels/
        ├── memory_*.txt
        └── no_memory_*.txt

dataset.yaml
```

The dataset is organized as follows:
- `training/memory/`: Source directory for images with memory modules
- `training/no_memory/`: Source directory for images without memory modules
- `training/train/`: Training dataset
  - `images/`: Contains images with and without memory modules, distinguished by the `memory_` and `no_memory_` filename prefixes
  - `labels/`: Contains one YOLO-format annotation file per image
- `training/val/`: Validation dataset
  - `images/`: Contains images with and without memory modules, distinguished by the `memory_` and `no_memory_` filename prefixes
  - `labels/`: Contains one YOLO-format annotation file per image

The `dataset.yaml` file contains:
```yaml
path: training  # dataset root dir
train: train/images  # train images
val: val/images    # validation images
nc: 1  # number of classes
names: ['memory_module']  # class names
```
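
For reference, each line in a YOLO-format label file describes one box as `class x_center y_center width height`, with coordinates normalized to the range 0 to 1. The sketch below converts a pixel-space box into such a line; the helper name `to_yolo_line` is hypothetical and not part of the project. Label files for images without memory modules would typically be empty.

```python
from typing import Tuple


def to_yolo_line(class_id: int, box: Tuple[float, float, float, float],
                 image_width: int, image_height: int) -> str:
    """Convert a pixel box (x1, y1, x2, y2) to a YOLO label line.

    YOLO labels store the box center and size, each divided by the image
    dimensions so that all four values fall between 0 and 1.
    """
    x1, y1, x2, y2 = box
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

With a single class (`nc: 1`), `class_id` is always `0`, which corresponds to `memory_module` in `dataset.yaml`.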

## Getting Started

1. Clone the repository:
   ```bash
   git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git
   ```
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Prepare the dataset:
   ```bash
   python prepare_dataset.py
   ```
4. Train the model (if not already trained):
   ```bash
   python train.py
   ```
5. Run the Flask application:
   ```bash
   python run.py
   ```
6. Access the web interface at `http://localhost:5000`

## Testing

The test suite exercises the detector in three ways:
- Batch detection over multiple images
- Threshold optimization
- Combinations of confidence and IoU thresholds

Run tests with:
```bash
pytest tests/
```
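
A threshold sweep of the kind listed above could be sketched as follows. This is an illustrative helper, not the project's test code: `sweep_thresholds` and the `count_boxes` callable (which would wrap the real detector) are hypothetical names, and scoring by "box count matches expected count" is an assumed criterion.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Tuple


def sweep_thresholds(
    count_boxes: Callable[[str, float, float], int],
    expected_counts: Dict[str, int],
    conf_values: Iterable[float],
    iou_values: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Score every (confidence, IoU) pair and return the results, best first.

    `count_boxes(image, conf, iou)` must return the number of boxes the
    detector produces for `image` at those thresholds. Each result is
    (accuracy, conf, iou), where accuracy is the fraction of images whose
    box count matches the expected count.
    """
    results = []
    for conf, iou_thr in product(conf_values, iou_values):
        hits = sum(
            count_boxes(image, conf, iou_thr) == expected
            for image, expected in expected_counts.items()
        )
        results.append((hits / len(expected_counts), conf, iou_thr))
    # sorted() is stable, so ties keep the order of the input grids.
    return sorted(results, key=lambda r: r[0], reverse=True)
```

The first entry of the returned list gives the best-scoring pair, which can be compared against the defaults of 0.25 and 0.45 used by the API.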

## Future Improvements

1. GPU support for faster processing
2. Video input support
3. Real-time streaming capabilities
4. More sophisticated augmentation techniques
5. Model quantization for improved CPU performance