# DS Task Recycling Project

This project is a Flask API that processes images of motherboards to detect memory modules. It uses computer vision to identify and draw bounding boxes around memory modules present in the input images.

## Project Overview

- **Input Types:**
  - Image upload via the Flask API
  - A hardcoded test image (memory_out19.png) for testing purposes
- **Dataset:**
  - 20 pictures of motherboards with memory
  - 20 pictures of motherboards without memory
- **Output:**
  - An annotated image with bounding boxes around each detected memory module
  - For example, if there are two memory modules, two boxes are drawn; if only one is detected, then one box is drawn
- **Annotation Tool:**
  - [makesense.ai](https://www.makesense.ai/) was used for manual annotation

## Implementation Details

### Algorithm Choice & Rationale

1. **Which algorithm was chosen?**
   - YOLOv8 (specifically YOLOv8n - the nano version) was selected for this task
2. **Why this algorithm?**
   - Fast inference speed suitable for real-time applications
   - Good balance between accuracy and computational requirements
   - Built-in support for transfer learning
   - Excellent performance on object detection tasks
   - Easy integration with Python/Flask applications
   - Robust community support and documentation

### Hardware Considerations

3. **CPU/GPU Impact:**
   - The current implementation runs on CPU for broader accessibility
   - Model parameters were optimized for CPU performance:
     - Reduced batch size (8)
     - Lightweight augmentation
     - Early stopping with patience=15
   - GPU support is available through YOLO if needed for scaling
   - Current performance is suitable for the demo nature of the project

### Video Processing Approach

4. **Handling Video Input:**
   - While not currently implemented, video processing would involve:
     - Frame extraction
     - Batch processing of frames
     - Real-time detection using YOLO's video processing capabilities
     - Optional frame skipping for performance optimization
   - The current architecture can be extended for video by:
     - Adding a video upload endpoint
     - Implementing frame-by-frame processing
     - Returning annotated video or real-time stream

## API Implementation

### Endpoints

1. **Image Upload (`/detect`):**

   ```http
   POST /detect
   Content-Type: multipart/form-data
   ```

   - Accepts image uploads
   - Returns annotated image with detection boxes

2. **Test Detection (`/detect/test`):**

   ```http
   GET /detect/test
   ```

   - Uses a hardcoded test image (memory_out19.png)
   - Returns annotated image with detection boxes

### Processing Workflow

1. Image Reception:
   - Via file upload or hardcoded test image
2. Detection:
   - YOLOv8 processes the image
   - Confidence threshold: 0.25
   - IoU threshold: 0.45
3. Annotation:
   - Bounding boxes drawn around detected modules
4. Response:
   - Annotated image returned in PNG format

## Model Training

The model was trained with the following parameters:

- 50 epochs
- Image size: 640x640
- Batch size: 8
- Early stopping patience: 15
- Augmentations:
  - Rotation (±5°)
  - Scale (0.5)
  - Translation (0.1)
  - Horizontal flip (0.5)
  - Mosaic (1.0)

## Dataset Preparation

```bash
training/
├── memory/
│   └── (images with memory modules) #You have this
├── no_memory/
│   └── (images without memory modules) #You have this as well
├── train/
│   ├── images/
│   │   ├── memory_*.png
│   │   └── no_memory_*.png
│   └── labels/
│       ├── memory_*.txt
│       └── no_memory_*.txt
└── val/
    ├── images/
    │   ├── memory_*.png
    │   └── no_memory_*.png
    └── labels/
        ├── memory_*.txt
        └── no_memory_*.txt
dataset.yaml
```

The dataset is organized as follows:

- `training/memory/`: Source directory for images with memory modules
- `training/no_memory/`: Source directory for images without memory modules
- `training/train/`: Training dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files
- `training/val/`: Validation dataset
  - `images/`: Contains both memory and no-memory images with appropriate prefixes
  - `labels/`: Contains YOLO format annotation files

The `dataset.yaml` file contains:

```yaml
path: training  # dataset root dir
train: train/images  # train images
val: val/images  # validation images
nc: 1  # number of classes
names: ['memory_module']  # class names
```

## Getting Started

1. Clone the repository:

   ```bash
   git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Prepare the dataset:

   ```bash
   python prepare_dataset.py
   ```

4. Train the model (if not already trained):

   ```bash
   python train.py
   ```

5. Run the Flask application:

   ```bash
   python run.py
   ```

6. Access the web interface at `http://localhost:5000`

## Testing

The project includes comprehensive tests for the detector:

- Batch detection testing
- Threshold optimization
- Various confidence/IoU threshold combinations

Run tests with:

```bash
pytest tests/
```

## Future Improvements

1. GPU support for faster processing
2. Video input support
3. Real-time streaming capabilities
4. More sophisticated augmentation techniques
5. Model quantization for improved CPU performance
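
As a starting point for the video input support listed above, the optional frame skipping described under Video Processing Approach could look roughly like the sketch below. It is not part of the current codebase: the function name `detect_with_frame_skipping`, the `stride` parameter, and the `detect` callable are all hypothetical. Here `detect` stands in for whatever wraps the trained YOLOv8 model and returns a list of boxes for one frame.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2) in pixels


def detect_with_frame_skipping(
    frames: Iterable[Frame],
    detect: Callable[[Frame], List[Box]],
    stride: int = 3,
) -> Iterator[Tuple[Frame, List[Box]]]:
    """Yield (frame, boxes) pairs, calling `detect` only on every `stride`-th frame.

    Skipped frames reuse the boxes from the most recently processed frame,
    which trades some accuracy on fast-moving scenes for fewer model calls.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    last_boxes: List[Box] = []
    for index, frame in enumerate(frames):
        if index % stride == 0:
            # Run the (expensive) detector on frames 0, stride, 2*stride, ...
            last_boxes = detect(frame)
        yield frame, last_boxes


if __name__ == "__main__":
    # Stand-in detector for demonstration; a real one would call the model.
    calls = []

    def fake_detect(frame):
        calls.append(frame)
        return [(0.0, 0.0, 10.0, 10.0)]

    results = list(detect_with_frame_skipping(range(7), fake_detect, stride=3))
    print(len(results), "frames annotated with", len(calls), "detector calls")
```

Passing the detector in as a callable keeps the skipping logic independent of how frames are decoded or how the model is loaded, so the same helper could sit behind a video upload endpoint or a streaming response.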