DS Task Recycling Project

This project is a Flask API that processes images of motherboards to detect memory modules. It uses computer vision to identify and draw bounding boxes around memory modules present in the input images.

Project Overview

Input Types:
- Image upload via the Flask API
- A hardcoded test image (memory_out19.png) for testing purposes
Dataset:
- 20 pictures of motherboards with memory
- 20 pictures of motherboards without memory
Output:
- An annotated image with bounding boxes around each detected memory module
- For example, two boxes are drawn when two memory modules are detected, and one box when only one is detected
Annotation Tool:
- makesense.ai was used for manual annotation

Implementation Details

Algorithm Choice & Rationale

Which algorithm was chosen?
- YOLOv8 (specifically YOLOv8n - the nano version) was selected for this task
Why this algorithm?
- Fast inference speed suitable for real-time applications
- Good balance between accuracy and computational requirements
- Built-in support for transfer learning
- Excellent performance on object detection tasks
- Easy integration with Python/Flask applications
- Robust community support and documentation

Hardware Considerations

CPU/GPU Impact:
- The current implementation runs on CPU for broader accessibility
- Training parameters were chosen with CPU execution in mind:
  - Reduced batch size (8)
  - Lightweight augmentation
  - Early stopping with patience=15
- GPU acceleration is available through the YOLO library if the project needs to scale
- Current CPU performance is adequate for a demo project

Video Processing Approach

Handling Video Input:
- While not currently implemented, video processing would involve:
  - Frame extraction
  - Batch processing of frames
  - Real-time detection using YOLO's video processing capabilities
  - Optional frame skipping for performance optimization
- The current architecture can be extended for video by:
  - Adding a video upload endpoint
  - Implementing frame-by-frame processing
  - Returning an annotated video or a real-time stream
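
Since video support is not implemented, the following is only a minimal sketch of the frame-skipping and batching steps listed above. It assumes frames arrive as any iterable and that the detector is a callable taking a list of frames and returning one result per frame. The names `sample_frames`, `batched`, `process_video`, and `detect_batch` are hypothetical and do not exist in the project.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def sample_frames(frames: Iterable, stride: int = 3) -> Iterator[Tuple[int, object]]:
    """Yield (index, frame) for every `stride`-th frame, starting at frame 0."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, frame


def batched(items: Iterable, batch_size: int) -> Iterator[List]:
    """Group an iterable into lists of at most `batch_size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_video(
    frames: Iterable,
    detect_batch: Callable[[List], List],  # hypothetical detector wrapper
    stride: int = 3,
    batch_size: int = 8,
) -> Dict[int, object]:
    """Run the detector on sampled frames and map frame index -> detections."""
    results: Dict[int, object] = {}
    for batch in batched(sample_frames(frames, stride), batch_size):
        indices = [index for index, _ in batch]
        detections = detect_batch([frame for _, frame in batch])
        results.update(zip(indices, detections))
    return results
```

In a real extension, the frames would likely come from a video decoder and `detect_batch` would wrap the trained model. Skipped frames could reuse the boxes from the nearest processed frame when an annotated video is assembled.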

API Implementation

Endpoints

Image Upload (/detect):
```
POST /detect
Content-Type: multipart/form-data
```
- Accepts image uploads
- Returns annotated image with detection boxes
Test Detection (/detect/test):
```
GET /detect/test
```
- Uses a hardcoded test image (memory_out19.png)
- Returns annotated image with detection boxes
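
As an illustration of how a client might call the upload endpoint, here is a minimal sketch using only the Python standard library. The form field name `image` is an assumption (the README does not state which field the API expects), and the helper names `build_multipart` and `detect` are hypothetical.

```python
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "image/png"):
    """Encode one file as multipart/form-data; return (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def detect(image_path: str, out_path: str,
           url: str = "http://localhost:5000/detect",
           field: str = "image") -> None:  # field name is an assumption
    """POST an image to /detect and save the annotated PNG from the response."""
    with open(image_path, "rb") as fh:
        body, header = build_multipart(field, os.path.basename(image_path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        annotated = response.read()
    with open(out_path, "wb") as fh:
        fh.write(annotated)
```

With the server running, `detect("board.png", "annotated.png")` would upload a local image and write the returned PNG to disk. The test endpoint needs no body, so a plain GET to `/detect/test` is enough.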

Processing Workflow

Image Reception:
- Via file upload or hardcoded test image
Detection:
- YOLOv8 processes the image
- Confidence threshold: 0.25
- IoU threshold: 0.45
Annotation:
- Bounding boxes drawn around detected modules
Response:
- Annotated image returned in PNG format
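
To make the two thresholds concrete, the sketch below shows what they control: detections scoring below the confidence threshold are dropped, and among the rest, a box is suppressed when it overlaps a higher-scoring box by more than the IoU threshold (non-maximum suppression). The YOLO library performs this step internally; this pure-Python version is an illustration only, not the project's code, and `iou` and `filter_detections` are hypothetical names.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets: List[Detection], conf: float = 0.25,
                      iou_thr: float = 0.45) -> List[Detection]:
    """Apply the confidence cut, then greedy non-maximum suppression."""
    # Keep confident detections, highest score first.
    candidates = sorted((d for d in dets if d[1] >= conf),
                        key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for box, score in candidates:
        # Keep a box only if it does not overlap an already-kept box too much.
        if all(iou(box, kept_box) <= iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

Raising the confidence threshold trades missed modules for fewer false boxes. Lowering the IoU threshold merges near-duplicate boxes more aggressively, which matters when two memory modules sit side by side.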

Model Training

The model was trained with the following parameters:

- Epochs: 50
- Image size: 640x640
- Batch size: 8
- Early stopping patience: 15
- Augmentations:
  - Rotation (±5°)
  - Scale (0.5)
  - Translation (0.1)
  - Horizontal flip (0.5)
  - Mosaic (1.0)
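
As a rough sketch, these settings could be passed to the Ultralytics training API as shown below. The mapping from the parameter names above to Ultralytics argument names, the `dataset.yaml` location, and the `yolov8n.pt` starting weights are assumptions; the project's `train.py` may be organized differently.

```python
# Training settings from the list above, keyed by Ultralytics argument names
# (the name mapping is an assumption, not taken from the project's train.py).
TRAIN_ARGS = {
    "data": "dataset.yaml",  # assumed path to the dataset config
    "epochs": 50,
    "imgsz": 640,
    "batch": 8,
    "patience": 15,          # early stopping patience
    "device": "cpu",
    "degrees": 5.0,          # rotation of up to ±5 degrees
    "scale": 0.5,
    "translate": 0.1,
    "fliplr": 0.5,           # probability of a horizontal flip
    "mosaic": 1.0,
}


def train(weights: str = "yolov8n.pt", **overrides):
    """Fine-tune a pretrained YOLOv8n checkpoint with the settings above."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train(device=0)` would reuse the same settings on the first GPU, which is one way to try the GPU path without changing the defaults.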

Dataset Preparation

training/
├── memory/
│   └── (images with memory modules)  # already provided
├── no_memory/
│   └── (images without memory modules)  # already provided
├── train/
│   ├── images/
│   │   ├── memory_*.png
│   │   └── no_memory_*.png
│   └── labels/
│       ├── memory_*.txt
│       └── no_memory_*.txt
└── val/
    ├── images/
    │   ├── memory_*.png
    │   └── no_memory_*.png
    └── labels/
        ├── memory_*.txt
        └── no_memory_*.txt

dataset.yaml

The dataset is organized as follows:

- training/memory/: Source directory for images with memory modules
- training/no_memory/: Source directory for images without memory modules
- training/train/: Training dataset
  - images/: Contains both memory and no-memory images with appropriate prefixes
  - labels/: Contains YOLO format annotation files
- training/val/: Validation dataset
  - images/: Contains both memory and no-memory images with appropriate prefixes
  - labels/: Contains YOLO format annotation files
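
For reference, a YOLO format label file holds one line per object: the class index followed by the box center, width, and height, all normalized to the image size. The sketch below converts between pixel boxes and label lines; the helper names are hypothetical and not part of the project. Label files for images without memory modules are presumably empty, since an image with no objects has no lines to write.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def to_yolo_line(box: Box, image_size: Tuple[int, int], class_id: int = 0) -> str:
    """Convert a pixel box to a 'class cx cy w h' line with normalized values."""
    x1, y1, x2, y2 = box
    width, height = image_size
    cx = (x1 + x2) / 2 / width
    cy = (y1 + y2) / 2 / height
    w = (x2 - x1) / width
    h = (y2 - y1) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def from_yolo_line(line: str, image_size: Tuple[int, int]) -> Tuple[int, Box]:
    """Convert a label line back to (class_id, pixel box)."""
    class_id, cx, cy, w, h = line.split()
    width, height = image_size
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    box = (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )
    return int(class_id), box
```

Class index 0 corresponds to the single `memory_module` class in this dataset.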

The dataset.yaml file contains:

path: training  # dataset root dir
train: train/images  # train images
val: val/images    # validation images
nc: 1  # number of classes
names: ['memory_module']  # class names

Getting Started

Clone the repository:

git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git

Install dependencies:
```
pip install -r requirements.txt
```
Prepare the dataset:
```
python prepare_dataset.py
```
Train the model (if not already trained):
```
python train.py
```
Run the Flask application:
```
python run.py
```
Access the web interface at http://localhost:5000

Testing

The project includes tests for the detector that cover:

- Batch detection
- Threshold optimization
- Various confidence/IoU threshold combinations

Run tests with:

pytest tests/
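
The test code itself is not shown here. As a hedged illustration of what threshold optimization can look like, the sketch below scores each confidence/IoU combination and keeps the best one. The scoring function, the threshold grids, and all names are hypothetical rather than taken from `tests/`.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def count_accuracy(scores_per_image: List[List[float]],
                   expected_counts: List[int], conf: float) -> float:
    """Fraction of images where the number of boxes kept at `conf` is correct."""
    hits = sum(
        sum(score >= conf for score in scores) == expected
        for scores, expected in zip(scores_per_image, expected_counts)
    )
    return hits / len(expected_counts)


def sweep_thresholds(
    evaluate: Callable[[float, float], float],
    confs: Sequence[float] = (0.15, 0.25, 0.35, 0.5),
    ious: Sequence[float] = (0.3, 0.45, 0.6),
) -> Tuple[Tuple[float, float], float]:
    """Evaluate every (conf, iou) pair; return the best pair and its score."""
    scored = {(c, i): evaluate(c, i) for c, i in product(confs, ious)}
    best = max(scored, key=scored.get)  # first pair wins on ties
    return best, scored[best]
```

A real evaluation function would run the detector on the validation images for each pair. Here `count_accuracy` ignores the IoU value and only checks box counts, which keeps the example small.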

Future Improvements

- GPU support for faster processing
- Video input support
- Real-time streaming capabilities
- More sophisticated augmentation techniques
- Model quantization for improved CPU performance
