# DS Task Recycling Project
This project is a Flask API that processes images of motherboards to detect memory modules. It uses computer vision to identify and draw bounding boxes around memory modules present in the input images.
## Project Overview
- **Input Types:**
  - Image upload via the Flask API
  - A hardcoded test image (`memory_out19.png`) for testing purposes
- **Dataset:**
  - 20 pictures of motherboards with memory
  - 20 pictures of motherboards without memory
- **Output:**
  - An annotated image with bounding boxes around each detected memory module
  - For example, if there are two memory modules, two boxes are drawn; if only one is detected, then one box is drawn
- **Annotation Tool:**
  - [makesense.ai](https://www.makesense.ai/) was used for manual annotation
## Implementation Details
### Algorithm Choice & Rationale
1. **Which algorithm was chosen?**
   - YOLOv8 (specifically YOLOv8n, the nano version) was selected for this task
2. **Why this algorithm?**
   - Fast inference speed suitable for real-time applications
   - Good balance between accuracy and computational requirements
   - Built-in support for transfer learning
   - Excellent performance on object detection tasks
   - Easy integration with Python/Flask applications
   - Robust community support and documentation
### Hardware Considerations
3. **CPU/GPU Impact:**
   - The current implementation runs on CPU for broader accessibility
   - Model parameters were optimized for CPU performance:
     - Reduced batch size (8)
     - Lightweight augmentation
     - Early stopping with patience=15
   - GPU support is available through YOLO if needed for scaling
   - Current performance is suitable for the demo nature of the project
### Video Processing Approach
4. **Handling Video Input:**
   - While not currently implemented, video processing would involve:
     - Frame extraction
     - Batch processing of frames
     - Real-time detection using YOLO's video processing capabilities
     - Optional frame skipping for performance optimization
   - The current architecture can be extended for video by:
     - Adding a video upload endpoint
     - Implementing frame-by-frame processing
     - Returning annotated video or real-time stream
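The frame skipping and batching steps listed above could be sketched as follows. This is a minimal illustration using only the standard library, not part of the current code base; the helper names `sample_frames` and `batched` are hypothetical, and the integers stand in for decoded frames (a real pipeline might read them with `cv2.VideoCapture`).

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Iterable[Frame], step: int = 1) -> Iterator[Frame]:
    """Yield every `step`-th frame (step=1 keeps all frames)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield frame


def batched(frames: Iterable[Frame], batch_size: int) -> Iterator[List[Frame]]:
    """Group frames into lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(frames)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Integers stand in for decoded video frames in this sketch.
fake_frames = range(10)
batches = list(batched(sample_frames(fake_frames, step=3), batch_size=2))
print(batches)  # [[0, 3], [6, 9]]
```

Each batch would then be passed to the detector, and the annotated frames written back out or streamed.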
## API Implementation
### Endpoints
1. **Image Upload (`/detect`):**
   ```http
   POST /detect
   Content-Type: multipart/form-data
   ```
   - Accepts image uploads
   - Returns annotated image with detection boxes
2. **Test Detection (`/detect/test`):**
   ```http
   GET /detect/test
   ```
   - Uses a hardcoded test image (`memory_out19.png`)
   - Returns annotated image with detection boxes
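A client for the upload endpoint might look like the sketch below, which uses only the standard library. The form field name `file` and the base URL are assumptions; check the Flask route for the field name it actually reads.

```python
import os
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, payload: bytes,
                    content_type: str = "image/png") -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a single-file upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def detect(image_path: str, base_url: str = "http://localhost:5000") -> bytes:
    """POST an image to /detect and return the annotated image bytes."""
    with open(image_path, "rb") as handle:
        # "file" is an assumed field name, not confirmed by the project.
        body, content_type = build_multipart(
            "file", os.path.basename(image_path), handle.read()
        )
    request = urllib.request.Request(
        f"{base_url}/detect",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the server running, `detect("board.png")` would return the annotated image bytes, which can be written to a `.png` file.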
### Processing Workflow
1. Image Reception:
   - Via file upload or hardcoded test image
2. Detection:
   - YOLOv8 processes the image
   - Confidence threshold: 0.25
   - IoU threshold: 0.45
3. Annotation:
   - Bounding boxes drawn around detected modules
4. Response:
   - Annotated image returned in PNG format
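To show what the two thresholds in the detection step do, here is a simplified pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. YOLOv8 performs this step internally; the function names and the sample boxes below are made up for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections: List[Detection],
                      conf_threshold: float = 0.25,
                      iou_threshold: float = 0.45) -> List[Detection]:
    """Drop low-confidence boxes, then greedily suppress overlapping ones."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


raw = [
    ((10, 10, 110, 60), 0.90),     # module A
    ((12, 12, 112, 62), 0.60),     # near-duplicate of A, suppressed
    ((200, 10, 300, 60), 0.80),    # module B
    ((400, 400, 420, 420), 0.10),  # below the confidence threshold
]
print(len(filter_detections(raw)))  # 2
```

Raising the confidence threshold trades missed modules for fewer false positives, while lowering the IoU threshold suppresses overlapping boxes more aggressively.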
## Model Training
The model was trained with the following parameters:
- 50 epochs
- Image size: 640x640
- Batch size: 8
- Early stopping patience: 15
- Augmentations:
  - Rotation (±5°)
  - Scale (0.5)
  - Translation (0.1)
  - Horizontal flip (0.5)
  - Mosaic (1.0)
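The parameters above map onto Ultralytics training arguments roughly as shown below. This is a sketch rather than the project's actual `train.py`: the `data` path, the `device` value, and the `yolov8n.pt` starting weights are assumptions, and the argument names should be checked against the installed Ultralytics version.

```python
# Values come from the parameter list above; names follow the Ultralytics
# training settings (assumed to match the installed version).
TRAIN_ARGS = {
    "data": "dataset.yaml",  # assumed location of the dataset config
    "epochs": 50,
    "imgsz": 640,
    "batch": 8,
    "patience": 15,          # early stopping patience
    "device": "cpu",         # assumption: training on CPU like inference
    "degrees": 5.0,          # rotation of up to +/-5 degrees
    "scale": 0.5,
    "translate": 0.1,
    "fliplr": 0.5,           # horizontal flip probability
    "mosaic": 1.0,
}


def train():
    """Fine-tune the nano model with the settings above."""
    # Imported lazily so the settings can be inspected without Ultralytics.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # assumed pretrained starting weights
    return model.train(**TRAIN_ARGS)
```

Calling `train()` starts a run; keeping the settings in one dictionary makes it easy to compare them against the list above.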
## Dataset Preparation
```bash
training/
├── memory/
│   └── (images with memory modules)     # you already have this
├── no_memory/
│   └── (images without memory modules)  # you already have this as well
├── train/
│   ├── images/
│   │   ├── memory_*.png
│   │   └── no_memory_*.png
│   └── labels/
│       ├── memory_*.txt
│       └── no_memory_*.txt
└── val/
    ├── images/
    │   ├── memory_*.png
    │   └── no_memory_*.png
    └── labels/
        ├── memory_*.txt
        └── no_memory_*.txt
dataset.yaml
```
The dataset is organized as follows:
- `training/memory/`: Source directory for images with memory modules
- `training/no_memory/`: Source directory for images without memory modules
- `training/train/`: Training dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files
- `training/val/`: Validation dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files
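For reference, each line of a YOLO format label file holds a class ID followed by the box center and size, all normalized by the image dimensions. The helper below is a hypothetical illustration of converting a pixel-space box into such a line; it is not taken from `prepare_dataset.py`.

```python
from typing import Tuple


def to_yolo_line(class_id: int, box: Tuple[float, float, float, float],
                 image_width: int, image_height: int) -> str:
    """Convert a pixel box (x1, y1, x2, y2) into one YOLO label line."""
    x1, y1, x2, y2 = box
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Class 0 is the single 'memory_module' class.
print(to_yolo_line(0, (160, 120, 480, 360), 640, 480))
# 0 0.500000 0.500000 0.500000 0.500000
```

Images without memory modules are treated as background in YOLO training, so their label files are expected to contain no lines.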
The `dataset.yaml` file contains:
```yaml
path: training  # dataset root dir
train: train/images  # train images
val: val/images  # validation images
nc: 1  # number of classes
names: ['memory_module']  # class names
```
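Before training, it can help to confirm that every image in a split has a matching label file. The check below is a small sketch based on the directory layout described above; `missing_labels` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List


def missing_labels(split_dir: Path) -> List[str]:
    """Return names of PNG images in split_dir/images with no .txt in split_dir/labels."""
    images = sorted((split_dir / "images").glob("*.png"))
    labels = split_dir / "labels"
    return [img.name for img in images if not (labels / f"{img.stem}.txt").exists()]
```

For example, `missing_labels(Path("training") / "train")` and `missing_labels(Path("training") / "val")` should both return an empty list once `prepare_dataset.py` has run.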
## Getting Started
1. Clone the repository:
   ```bash
   git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git
   ```
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Prepare the dataset:
   ```bash
   python prepare_dataset.py
   ```
4. Train the model (if not already trained):
   ```bash
   python train.py
   ```
5. Run the Flask application:
   ```bash
   python run.py
   ```
6. Access the web interface at `http://localhost:5000`
## Testing
The project includes comprehensive tests for the detector:
- Batch detection testing
- Threshold optimization
- Various confidence/IoU threshold combinations
Run tests with:
```bash
pytest tests/
```
## Future Improvements
1. GPU support for faster processing
2. Video input support
3. Real-time streaming capabilities
4. More sophisticated augmentation techniques
5. Model quantization for improved CPU performance