# DS Task Recycling Project - Memory Module Detection

This project is a complete implementation of a Flask API that processes motherboard images and detects memory modules using YOLOv8. The API returns annotated images with bounding boxes drawn around each detected memory module.

## 🚀 Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Train the Model

```bash
python3 train.py --epochs 100 --batch 16
```

### 3. Start the API

```bash
python3 main.py
```

### 4. Test the API

```bash
# Option 1: Use the Web Interface (Recommended for QA)
# Open browser and go to: http://localhost:5000

# Option 2: Use command line
# Test with hardcoded image
curl http://localhost:5000/detect/hardcoded

# Upload an image
curl -X POST -F "image=@your_image.png" http://localhost:5000/detect

# Option 3: Run automated tests
python3 test_api.py
```

## 📋 Project Overview

- **Algorithm Used:** YOLOv8 Nano (ultralytics)
- **Input Types:**
  - Image upload via Flask API
  - Base64 encoded images
  - Hardcoded test image
- **Dataset:** 40 images (20 with memory modules, 20 without)
- **Output:** Annotated images with bounding boxes and confidence scores

## 🏗️ Project Structure

```
ds_task_recycling_project/
├── main.py                 # Flask API application
├── train.py                # YOLOv8 training script
├── inference_utils.py      # Detection and visualization utilities
├── prepare_dataset.py      # Dataset preparation script
├── test_api.py             # API testing script
├── setup.py                # Automated setup script
├── requirements.txt        # Python dependencies
├── dataset.yaml            # YOLO dataset configuration
├── templates/              # Frontend templates
│   └── index.html          # QA testing web interface
├── static/                 # Frontend assets
│   ├── style.css           # Styling for web interface
│   └── script.js           # JavaScript for web interface
├── training/               # Dataset directory
│   ├── memory/             # Images with memory modules + labels
│   ├── no_memory/          # Images without memory modules
│   ├── train/              # Training split (80%)
│   └── val/                # Validation split (20%)
└── runs/                   # Training outputs (created after training)
    └── detect/
        └── memory_module_detection/
            └── weights/
                ├── best.pt # Best model weights
                └── last.pt # Last epoch weights
```

## 🤖 Algorithm Choice & Technical Decisions

### 1. **Algorithm Choice: YOLOv8 Nano**

**Why YOLOv8?**

- **Strong performance:** A recent, widely used version of the YOLO family
- **Real-time inference:** Fast detection suitable for API deployment
- **Pre-trained weights:** Transfer learning from the COCO dataset
- **Easy integration:** Python API via ultralytics
- **Small model size:** The Nano version balances accuracy and speed

**Advantages:**

- Single-stage detector (faster than the R-CNN family)
- Good small object detection (important for memory modules)
- Built-in data augmentation and training optimizations
- Active community and regular updates

### 2. **Hardware Considerations**

**CPU vs GPU Impact:**

**Training:**

- **GPU Recommended:** Training on 40 images takes ~5-10 minutes on GPU vs 30-60 minutes on CPU
- **Memory Requirements:** 4GB+ GPU memory recommended
- **Fallback:** CPU training works but is significantly slower

**Inference:**

- **CPU Sufficient:** Real-time inference is possible on modern CPUs
- **GPU Advantage:** Batch processing and video streams benefit from a GPU
- **Edge Deployment:** The model can run on CPU-only edge devices

**Implementation:**

```python
# Auto-detection in train.py
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
```

### 3. **Video Input Approach**

**For video processing, the approach would be:**

1. **Frame Extraction:** Extract frames at regular intervals
2. **Batch Processing:** Process multiple frames simultaneously on GPU
3. **Temporal Consistency:** Apply tracking algorithms (DeepSORT, ByteTrack)
4. **Optimization:** Skip frames with no changes, use optical flow
5. **Output:** Annotated video with consistent object IDs

**Implementation Strategy:**

```python
# Pseudo-code for video processing (detector, DeepSORT, and
# draw_tracked_objects are placeholders, not implemented here)
import cv2

def process_video(video_path):
    cap = cv2.VideoCapture(video_path)
    tracker = DeepSORT()
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # end of video or read error
            break
        detections = detector.detect_from_array(frame)
        tracked_objects = tracker.update(detections)
        annotated_frame = draw_tracked_objects(frame, tracked_objects)
        yield annotated_frame
    cap.release()
```

## 🔧 Installation & Setup

### Prerequisites

- Python 3.8+
- pip or conda

### Step-by-Step Installation

1. **Clone/Download the project**
   ```bash
   cd ds_task_recycling_project
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Prepare dataset (if not already done)**
   ```bash
   python3 prepare_dataset.py
   ```

4. **Train the model**
   ```bash
   # Basic training (recommended)
   python3 train.py

   # Custom training parameters
   python3 train.py --epochs 150 --batch 8 --device cuda
   ```

5. **Start the Flask API**
   ```bash
   python3 main.py
   ```

The API will be available at `http://localhost:5000`

## 🌐 Web Interface for QA Testing

The project includes a web interface for QA testing:

### Features:

- **Drag & Drop Image Upload** - Easy image selection
- **Real-time API Status** - Shows whether the API is online and the model is loaded
- **Multiple Test Options:**
  - Test hardcoded image
  - Upload custom images
  - Run comprehensive API tests
- **Interactive Results** - View annotated images with detection details
- **Confidence Threshold Control** - Adjust detection sensitivity
- **Responsive Design** - Works on desktop and mobile

### Access:

1. Start the API: `python3 main.py`
2. Open browser: `http://localhost:5000`
3. Use the interface to test detection functionality

### QA Testing Workflow:

1. **Check API Status** - Verify green "API Online" indicator
2. **Test Hardcoded Image** - Click "Test Hardcoded Image" button
3. **Upload Custom Images** - Drag/drop or select motherboard images
4. **Adjust Confidence** - Use slider to test different thresholds
5. **Run All Tests** - Comprehensive API endpoint testing
6. **Review Results** - Check detection accuracy and annotations

## 📡 API Documentation

### Base URL

```
http://localhost:5000
```

### Endpoints

#### 1. **GET /** - API Information

```bash
curl http://localhost:5000/
```

**Response:**

```json
{
  "message": "Memory Module Detection API",
  "version": "1.0.0",
  "endpoints": {...},
  "model_loaded": true,
  "supported_formats": ["png", "jpg", "jpeg", "gif", "bmp"]
}
```

#### 2. **GET /health** - Health Check

```bash
curl http://localhost:5000/health
```

#### 3. **POST /detect** - Upload Image Detection

```bash
curl -X POST -F "image=@motherboard.png" -F "confidence=0.5" http://localhost:5000/detect
```

**Response:**

```json
{
  "success": true,
  "detections": [
    {
      "bbox": [100, 150, 200, 250],
      "confidence": 0.85,
      "class": 0,
      "class_name": "memory_module"
    }
  ],
  "num_detections": 1,
  "annotated_image": "base64_encoded_image...",
  "confidence_threshold": 0.5
}
```

#### 4. **GET /detect/hardcoded** - Test with Hardcoded Image

```bash
curl "http://localhost:5000/detect/hardcoded?confidence=0.5"
```

#### 5. **POST /detect/base64** - Base64 Image Detection

```bash
curl -X POST -H "Content-Type: application/json" \
  -d '{"image": "base64_string", "confidence": 0.5}' \
  http://localhost:5000/detect/base64
```

## 🧪 Testing & Usage Examples

### 1. **Test with Python requests**

```python
import requests

# Test hardcoded image
response = requests.get('http://localhost:5000/detect/hardcoded')
result = response.json()
print(f"Found {result['num_detections']} memory modules")

# Upload image
with open('test_image.png', 'rb') as f:
    files = {'image': f}
    response = requests.post('http://localhost:5000/detect', files=files)
    result = response.json()
```

### 2. **Test with curl**

```bash
# Basic detection
curl -X POST -F "image=@training/memory/out1.png" http://localhost:5000/detect

# With custom confidence
curl -X POST -F "image=@training/memory/out1.png" -F "confidence=0.3" http://localhost:5000/detect
```

### 3. **Command Line Inference**

```bash
# Test single image
python3 inference_utils.py --image training/memory/out1.png --conf 0.5

# Validate trained model
python3 train.py --validate --model runs/detect/memory_module_detection/weights/best.pt
```

## 📊 Training Details

### Dataset Statistics

- **Total Images:** 40 (20 with memory, 20 without)
- **Training Split:** 32 images (80%)
- **Validation Split:** 8 images (20%)
- **Classes:** 1 (memory_module)
- **Annotation Format:** YOLO (normalized coordinates)

### Training Configuration

```python
# Default training parameters
epochs = 100
batch_size = 16
image_size = 640
confidence_threshold = 0.5
iou_threshold = 0.45
```

### Expected Training Time

- **GPU (RTX 3060+):** 5-10 minutes
- **CPU (Modern):** 30-60 minutes
- **Memory Usage:** 2-4GB RAM

### Model Performance

After training, you should see:

- **mAP50:** >0.8 (mean average precision at an IoU threshold of 0.5)
- **Precision:** >0.85
- **Recall:** >0.80

## 🐛 Troubleshooting

### Common Issues

#### 1. **Model Not Found Error**

```
Error: Model not found at runs/detect/memory_module_detection/weights/best.pt
```

**Solution:** Train the model first

```bash
python3 train.py
```

#### 2. **CUDA Out of Memory**

```
RuntimeError: CUDA out of memory
```

**Solutions:**

- Reduce batch size: `python3 train.py --batch 8`
- Use CPU: `python3 train.py --device cpu`
- Close other GPU applications

#### 3. **Import Error: ultralytics**

```
ModuleNotFoundError: No module named 'ultralytics'
```

**Solution:**

```bash
pip install ultralytics
```

#### 4. **Flask Port Already in Use**

```
OSError: [Errno 48] Address already in use
```

**Solution:**

```bash
# Kill process using port 5000
lsof -ti:5000 | xargs kill -9

# Or use different port
python3 main.py  # Edit main.py to change port
```

#### 5. **Low Detection Accuracy**

**Solutions:**

- Increase training epochs: `python3 train.py --epochs 200`
- Lower confidence threshold: `confidence=0.3`
- Check image quality and lighting
- Verify annotations are correct

### Performance Optimization

#### For Better Accuracy:

1. **More Training Data:** Add more annotated images
2. **Data Augmentation:** Already included in YOLOv8
3. **Hyperparameter Tuning:** Adjust learning rate, batch size
4. **Model Size:** Use YOLOv8s or YOLOv8m for better accuracy

#### For Faster Inference:

1. **Model Export:** Export to ONNX or TensorRT, optionally with reduced precision
2. **Batch Processing:** Process multiple images together
3. **Image Resizing:** Use smaller input size (320x320)

## 📁 File Descriptions

- **`main.py`** - Flask API with all endpoints
- **`train.py`** - YOLOv8 training script with validation
- **`inference_utils.py`** - Detection utilities and visualization
- **`prepare_dataset.py`** - Dataset preparation and splitting
- **`requirements.txt`** - Python dependencies
- **`dataset.yaml`** - YOLO dataset configuration

## 🔮 Future Enhancements

1. **Video Processing:** Add video upload and processing endpoints
2. **Model Ensemble:** Combine multiple models for better accuracy
3. **Real-time Streaming:** WebSocket support for live camera feeds
4. **Database Integration:** Store detection results and statistics
5. **Web Interface:** HTML frontend for easier testing
6. **Docker Deployment:** Containerized deployment
7. **Model Versioning:** Support multiple model versions
8. **Batch Processing:** Process multiple images simultaneously

## 📄 License

This project is for educational and training purposes.

## 🤝 Contributing

This is a toy project for training purposes. Feel free to experiment and improve!
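
As a starting point for experimenting, here is a minimal sketch of a client for the `/detect/base64` endpoint documented above, using only the Python standard library. The helper names (`build_payload`, `decode_annotated_image`, `detect`) are hypothetical and not part of the project. The sketch assumes that the `image` and `annotated_image` fields carry plain base64 strings, which the example responses suggest but do not confirm.

```python
import base64
import json
import urllib.request

# Endpoint taken from the API documentation; adjust if the port is changed.
API_URL = "http://localhost:5000/detect/base64"


def build_payload(image_bytes: bytes, confidence: float = 0.5) -> dict:
    """Build the JSON body for POST /detect/base64.

    Assumption: the "image" field is plain base64 with no data-URI prefix.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "confidence": confidence,
    }


def decode_annotated_image(result: dict) -> bytes:
    """Decode the "annotated_image" field of a response into raw bytes.

    Assumption: the field is base64; a data-URI prefix is stripped if present.
    """
    encoded = result["annotated_image"]
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    return base64.b64decode(encoded)


def detect(image_path: str, confidence: float = 0.5, url: str = API_URL) -> dict:
    """Send an image file to the API and return the parsed JSON response."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), confidence)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires the API to be running):
#   result = detect("training/memory/out1.png", confidence=0.5)
#   print(f"Found {result['num_detections']} memory modules")
#   with open("annotated_output.bin", "wb") as out:
#       out.write(decode_annotated_image(result))
```

The payload construction and response decoding are kept separate from the network call so they can be checked without a running server. The image format of the decoded bytes depends on how `main.py` encodes the annotated image, so the output file extension in the usage comment is left generic.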