Aherobo Ovie Victor 7908b94d40 update
2025-07-21 19:20:44 +01:00

# DS Task Recycling Project
This project is a Flask API that processes images of motherboards to detect memory modules. It uses computer vision to identify and draw bounding boxes around memory modules present in the input images.
## Project Overview
- **Input Types:**
  - Image upload via the Flask API
  - A hardcoded test image (`memory_out19.png`) for testing purposes
- **Dataset:**
  - 20 pictures of motherboards with memory
  - 20 pictures of motherboards without memory
- **Output:**
  - An annotated image with bounding boxes around each detected memory module
  - For example, if there are two memory modules, two boxes are drawn; if only one is detected, then one box is drawn
- **Annotation Tool:**
  - [makesense.ai](https://www.makesense.ai/) was used for manual annotation
## Implementation Details
### Algorithm Choice & Rationale
1. **Which algorithm was chosen?**
   - YOLOv8 (specifically YOLOv8n, the nano version) was selected for this task
2. **Why this algorithm?**
   - Fast inference speed suitable for real-time applications
   - Good balance between accuracy and computational requirements
   - Built-in support for transfer learning
   - Excellent performance on object detection tasks
   - Easy integration with Python/Flask applications
   - Robust community support and documentation
### Hardware Considerations
3. **CPU/GPU Impact:**
   - The current implementation runs on CPU for broader accessibility
   - Training settings were chosen with CPU performance in mind:
     - Reduced batch size (8)
     - Lightweight augmentation
     - Early stopping with patience=15
   - GPU support is available through YOLO if needed for scaling
   - Current performance is suitable for the demo nature of the project
### Video Processing Approach
4. **Handling Video Input:**
   - While not currently implemented, video processing would involve:
     - Frame extraction
     - Batch processing of frames
     - Real-time detection using YOLO's video processing capabilities
     - Optional frame skipping for performance optimization
   - The current architecture can be extended for video by:
     - Adding a video upload endpoint
     - Implementing frame-by-frame processing
     - Returning annotated video or real-time stream
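
The frame skipping and batching steps listed above can be sketched without any video library. This is a minimal illustration, not part of the current codebase: `sample_frames` and `batched` are hypothetical helper names, and in a real pipeline the frames would come from a decoder such as OpenCV's `cv2.VideoCapture` rather than a plain iterable.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Iterable[Frame], stride: int = 1) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for every `stride`-th frame, starting with frame 0."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, frame


def batched(items: Iterable, batch_size: int) -> Iterator[List]:
    """Group an iterable into lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in "frames": the integers 0..9 instead of decoded images.
kept = list(sample_frames(range(10), stride=3))
for batch in batched(kept, batch_size=2):
    print([index for index, _ in batch])
```

Each batch could then be passed to the detector in one call, and the kept indices make it possible to map detections back to their position in the original video.
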
## API Implementation
### Endpoints
1. **Image Upload (`/detect`):**
   ```http
   POST /detect
   Content-Type: multipart/form-data
   ```
   - Accepts image uploads
   - Returns annotated image with detection boxes
2. **Test Detection (`/detect/test`):**
   ```http
   GET /detect/test
   ```
   - Uses a hardcoded test image (`memory_out19.png`)
   - Returns annotated image with detection boxes
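
As a rough illustration of calling the upload endpoint, the sketch below builds a `multipart/form-data` request using only the standard library. The form field name `file`, the helper name `build_detect_request`, and the default base URL are assumptions made for this example; check the Flask route for the field name it actually reads.

```python
import urllib.request
import uuid


def build_detect_request(
    image_bytes: bytes,
    filename: str = "board.png",
    base_url: str = "http://localhost:5000",
    field: str = "file",  # assumed form field name, not confirmed by this README
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST request for /detect."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: image/png\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/detect",
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the server running, `urllib.request.urlopen(build_detect_request(data)).read()` would return the response body, which according to the workflow below is the annotated PNG.
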
### Processing Workflow
1. Image Reception:
   - Via file upload or hardcoded test image
2. Detection:
   - YOLOv8 processes the image
   - Confidence threshold: 0.25
   - IoU threshold: 0.45
3. Annotation:
   - Bounding boxes drawn around detected modules
4. Response:
   - Annotated image returned in PNG format
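
To make the two thresholds concrete, here is a small pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It illustrates the general technique only and is not the code YOLOv8 runs internally; the function names and the sample boxes are made up for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    conf_threshold: float = 0.25,
    iou_threshold: float = 0.45,
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Step 1: drop low-confidence boxes, then sort the rest by score.
    candidates = sorted(
        (i for i, score in enumerate(scores) if score >= conf_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    # Step 2: keep a box only if it does not overlap a better box too much.
    kept: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 100, 100), (5, 5, 105, 105), (200, 200, 300, 300), (400, 400, 450, 450)]
scores = [0.9, 0.8, 0.7, 0.1]
print(filter_detections(boxes, scores))  # [0, 2]
```

In this example the second box is suppressed because it overlaps the first with an IoU of about 0.82, and the last box is dropped because its score is below 0.25.
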
## Model Training
The model was trained with the following parameters:
- 50 epochs
- Image size: 640x640
- Batch size: 8
- Early stopping patience: 15
- Augmentations:
  - Rotation (±5°)
  - Scale (gain 0.5)
  - Translation (fraction 0.1)
  - Horizontal flip (probability 0.5)
  - Mosaic (probability 1.0)
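
For reference, the parameters above could map onto an Ultralytics training call roughly as follows. This is a sketch under assumptions: the repository's `train.py` may be organized differently, the mapping of the listed values to the `degrees`, `scale`, `translate`, `fliplr`, and `mosaic` arguments is inferred from the list, and `yolov8n.pt` is assumed as the starting checkpoint.

```python
# Training settings taken from the list above; the argument names follow the
# Ultralytics `train` API, but the mapping itself is an assumption.
TRAIN_ARGS = {
    "data": "dataset.yaml",
    "epochs": 50,
    "imgsz": 640,
    "batch": 8,
    "patience": 15,   # early stopping patience
    "device": "cpu",
    "degrees": 5.0,   # rotation range in degrees (+/-)
    "scale": 0.5,
    "translate": 0.1,
    "fliplr": 0.5,    # horizontal flip probability
    "mosaic": 1.0,
}


def train(weights: str = "yolov8n.pt"):
    """Fine-tune a pretrained YOLOv8n checkpoint with the settings above."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**TRAIN_ARGS)
```

Keeping the settings in one dictionary makes it easy to switch `device` when a GPU becomes available without touching the rest of the script.
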
## Dataset Preparation
```text
training/
├── memory/
│   └── (images with memory modules)     # You have this
├── no_memory/
│   └── (images without memory modules)  # You have this as well
├── train/
│   ├── images/
│   │   ├── memory_*.png
│   │   └── no_memory_*.png
│   └── labels/
│       ├── memory_*.txt
│       └── no_memory_*.txt
└── val/
    ├── images/
    │   ├── memory_*.png
    │   └── no_memory_*.png
    └── labels/
        ├── memory_*.txt
        └── no_memory_*.txt
dataset.yaml
```
The dataset is organized as follows:
- `training/memory/`: Source directory for images with memory modules
- `training/no_memory/`: Source directory for images without memory modules
- `training/train/`: Training dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files
- `training/val/`: Validation dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files

The `dataset.yaml` file contains:
```yaml
path: training # dataset root dir
train: train/images # train images
val: val/images # validation images
nc: 1 # number of classes
names: ['memory_module'] # class names
```
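
The files under `labels/` use the YOLO text format: one line per object, `class_id x_center y_center width height`, with all four box values normalized to the image size. The sketch below shows the conversion from pixel corner coordinates. The helper names are hypothetical, and treating a `no_memory_*.txt` file as an empty label file is an assumption about this dataset.

```python
from typing import Iterable, Tuple

PixelBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def to_yolo_line(class_id: int, box: PixelBox, image_width: int, image_height: int) -> str:
    """Convert one pixel-space box to a normalized YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def labels_text(boxes: Iterable[PixelBox], image_width: int, image_height: int) -> str:
    """Build the label file content; class 0 is `memory_module` in dataset.yaml."""
    return "\n".join(to_yolo_line(0, box, image_width, image_height) for box in boxes)


print(to_yolo_line(0, (160, 120, 480, 360), 640, 480))
# 0 0.500000 0.500000 0.500000 0.500000
```

An image with no memory modules yields an empty string from `labels_text`, which corresponds to an empty label file.
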
## Getting Started
1. Clone the repository:
   ```bash
   git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git
   ```
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Prepare the dataset:
   ```bash
   python prepare_dataset.py
   ```
4. Train the model (if not already trained):
   ```bash
   python train.py
   ```
5. Run the Flask application:
   ```bash
   python run.py
   ```
6. Access the web interface at `http://localhost:5000`
## Testing
The project includes comprehensive tests for the detector:
- Batch detection testing
- Threshold optimization
- Various confidence/IoU threshold combinations

Run tests with:
```bash
pytest tests/
```
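
A sweep over confidence/IoU threshold combinations can be expressed as a small grid search. The sketch below is illustrative only and does not reproduce the contents of `tests/`: `sweep_thresholds` is a hypothetical helper, and `evaluate` stands for whatever scoring function the project uses (for example, the fraction of validation images with the correct number of boxes).

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def sweep_thresholds(
    evaluate: Callable[[float, float], float],
    conf_values: Iterable[float],
    iou_values: Iterable[float],
) -> Optional[Tuple[float, float, float]]:
    """Return (best_score, conf, iou), or None if the grid is empty."""
    best: Optional[Tuple[float, float, float]] = None
    for conf, iou in product(conf_values, iou_values):
        score = evaluate(conf, iou)
        # Keep the first combination that reaches the highest score.
        if best is None or score > best[0]:
            best = (score, conf, iou)
    return best


# Dummy scoring function that peaks at conf=0.25, iou=0.45, for demonstration only.
def dummy_score(conf: float, iou: float) -> float:
    return -abs(conf - 0.25) - abs(iou - 0.45)


print(sweep_thresholds(dummy_score, [0.1, 0.25, 0.5], [0.3, 0.45, 0.6]))
```

With a real detector, `evaluate` would run inference on the validation images at the given thresholds and return a single quality score.
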
## Future Improvements
1. GPU support for faster processing
2. Video input support
3. Real-time streaming capabilities
4. More sophisticated augmentation techniques
5. Model quantization for improved CPU performance