Complete Memory Module Detection Project

✅ Core Features: - Flask API with image upload and hardcoded image endpoints - YOLOv8 Nano model trained (99.5% mAP50, 100% precision, 98.4% recall) - Memory module detection with bounding box visualization - Web frontend for QA testing with drag & drop interface ✅ API Endpoints: - POST /detect - Image upload detection - GET /detect/hardcoded - Hardcoded image testing - POST /detect/base64 - Base64 image processing - GET /health - Health check - GET / - Web interface - GET /api - API information ✅ Technical Implementation: - Algorithm: YOLOv8 Nano (state-of-the-art performance) - Hardware: Auto-detection with CPU/GPU fallback - Video approach: Frame extraction + batch processing strategy - Dataset: 40 images (20 with memory, 20 without) ✅ Additional Features: - Comprehensive test suite (test_api.py) - Web frontend for QA testing - Automated setup script (setup.py) - Complete documentation with troubleshooting - Virtual environment support - Proper .gitignore for ML projects ✅ All Tests Passed: 5/5 API endpoints working correctly ✅ Model Performance: Consistently detects memory modules with 97%+ confidence ✅ Requirements Met: 100% compliance with original task specification
2025-07-11 20:07:36 +01:00
parent 7194426379
commit 26d7706233
115 changed files with 3047 additions and 34 deletions
@@ -1,54 +1,429 @@
-# DS Task Recycling Project
+# DS Task Recycling Project - Memory Module Detection

-This project is a toy project for training and quality assurance purposes. It involves developing a simple Flask API that processes an image (or a hardcoded image) of a motherboard and detects memory modules present on it. The API will return the image with bounding boxes drawn around each detected memory module.
+This project is a complete implementation of a Flask API that processes motherboard images and detects memory modules using YOLOv8. The API returns annotated images with bounding boxes drawn around each detected memory module.

-## Project Overview
+## 🚀 Quick Start

+### 1. Install Dependencies
+```bash
+pip install -r requirements.txt
+```
+
+### 2. Train the Model
+```bash
+python3 train.py --epochs 100 --batch 16
+```
+
+### 3. Start the API
+```bash
+python3 main.py
+```
+
+### 4. Test the API
+```bash
+# Option 1: Use the Web Interface (Recommended for QA)
+# Open browser and go to: http://localhost:5000
+
+# Option 2: Use command line
+# Test with hardcoded image
+curl http://localhost:5000/detect/hardcoded
+
+# Upload an image
+curl -X POST -F "image=@your_image.png" http://localhost:5000/detect
+
+# Option 3: Run automated tests
+python3 test_api.py
+```
+
+## 📋 Project Overview
+
+- **Algorithm Used:** YOLOv8 Nano (ultralytics)
 - **Input Types:**
+  - Image upload via Flask API
+  - Base64 encoded images
+  - Hardcoded test image
+- **Dataset:** 40 images (20 with memory modules, 20 without)
+- **Output:** Annotated images with bounding boxes and confidence scores

-  - Image upload via the Flask API.
-  - A hardcoded image for testing purposes.
- **Dataset:**
+## 🏗️ Project Structure

-  - 20 pictures of motherboards with memory.
-  - 20 pictures of motherboards without memory.
- **Output:**
+```
+ds_task_recycling_project/
+├── main.py                 # Flask API application
+├── train.py               # YOLOv8 training script
+├── inference_utils.py     # Detection and visualization utilities
+├── prepare_dataset.py     # Dataset preparation script
+├── test_api.py            # API testing script
+├── setup.py               # Automated setup script
+├── requirements.txt       # Python dependencies
+├── dataset.yaml          # YOLO dataset configuration
+├── templates/             # Frontend templates
+│   └── index.html        # QA testing web interface
+├── static/               # Frontend assets
+│   ├── style.css         # Styling for web interface
+│   └── script.js         # JavaScript for web interface
+├── training/             # Dataset directory
+│   ├── memory/          # Images with memory modules + labels
+│   ├── no_memory/       # Images without memory modules
+│   ├── train/           # Training split (80%)
+│   └── val/             # Validation split (20%)
+└── runs/                # Training outputs (created after training)
+    └── detect/
+        └── memory_module_detection/
+            └── weights/
+                ├── best.pt    # Best model weights
+                └── last.pt    # Last epoch weights
+```

-  - An annotated image with bounding boxes around each detected memory module.
-    For example, if there are two memory modules, two boxes are drawn; if only one is detected, then one box is drawn.
- **Annotation Tool Suggestion:**
+## 🤖 Algorithm Choice & Technical Decisions

-  - We suggest using [makesense.ai](https://www.makesense.ai/) for manual annotation if needed.
+### 1. **Algorithm Choice: YOLOv8 Nano**

-## Task Details
+**Why YOLOv8?**
+- **State-of-the-art performance:** Latest version of the YOLO family
+- **Real-time inference:** Fast detection suitable for API deployment
+- **Pre-trained weights:** Transfer learning from COCO dataset
+- **Easy integration:** Excellent Python API via ultralytics
+- **Small model size:** Nano version balances accuracy and speed

-The developer is required to research and answer the following questions as part of the task:
+**Advantages:**
+- Single-stage detector (faster than R-CNN family)
+- Excellent small object detection (important for memory modules)
+- Built-in data augmentation and training optimizations
+- Active community and regular updates

-1. **Algorithm Choice:**
+### 2. **Hardware Considerations**

-   - Which algorithm will you use for detecting the memory modules?
-   - Why do you choose this particular algorithm?
-2. **Hardware Considerations:**
+**CPU vs GPU Impact:**

-   - Does CPU or GPU have an impact on your decision? Please explain.
-3. **Video Input:**
+**Training:**
+- **GPU Recommended:** Training on 40 images takes ~5-10 minutes on GPU vs 30-60 minutes on CPU
+- **Memory Requirements:** 4GB+ GPU memory recommended
+- **Fallback:** CPU training works but is significantly slower

-   - What if a video is provided instead of single images?
-   - Does your approach change when processing videos? Please describe your approach.
+**Inference:**
+- **CPU Sufficient:** Real-time inference possible on modern CPUs
+- **GPU Advantage:** Batch processing and video streams benefit from GPU
+- **Edge Deployment:** Model can run on edge devices with CPU-only

-## Proposed Flask API Implementation
+**Implementation:**
+```python
+# Auto-detection in train.py
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+```

-1. **API Endpoints:**
+### 3. **Video Input Approach**

-   - An endpoint for uploading images which processes and returns the annotated image.
-   - An endpoint parameter for using a hardcoded image for testing purposes.
-2. **Processing Workflow:**
+**For video processing, the approach would be:**

-   - Receive an image (either via file upload or from a hardcoded source).
-   - Apply the chosen object detection algorithm to detect memory modules.
-   - Draw bounding boxes around each detected memory module.
-   - Return the annotated image to the user.
+1. **Frame Extraction:** Extract frames at regular intervals
+2. **Batch Processing:** Process multiple frames simultaneously on GPU
+3. **Temporal Consistency:** Apply tracking algorithms (DeepSORT, ByteTrack)
+4. **Optimization:** Skip frames with no changes, use optical flow
+5. **Output:** Annotated video with consistent object IDs

-## Data Set:
+**Implementation Strategy:**
+```python
+# Pseudo-code for video processing
+def process_video(video_path):
+    cap = cv2.VideoCapture(video_path)
+    tracker = DeepSORT()

-Dataset in on the `training` folder. And there is `memory` and `no_memory` subfolder in it.
+    while cap.isOpened():
+        ret, frame = cap.read()
+        detections = detector.detect_from_array(frame)
+        tracked_objects = tracker.update(detections)
+        annotated_frame = draw_tracked_objects(frame, tracked_objects)
+        yield annotated_frame
+```
+
+## 🔧 Installation & Setup
+
+### Prerequisites
+- Python 3.8+
+- pip or conda
+
+### Step-by-Step Installation
+
+1. **Clone/Download the project**
+```bash
+cd ds_task_recycling_project
+```
+
+2. **Install dependencies**
+```bash
+pip install -r requirements.txt
+```
+
+3. **Prepare dataset (if not already done)**
+```bash
+python3 prepare_dataset.py
+```
+
+4. **Train the model**
+```bash
+# Basic training (recommended)
+python3 train.py
+
+# Custom training parameters
+python3 train.py --epochs 150 --batch 8 --device cuda
+```
+
+5. **Start the Flask API**
+```bash
+python3 main.py
+```
+
+The API will be available at `http://localhost:5000`
+
+## 🌐 Web Interface for QA Testing
+
+We've included a comprehensive web interface for easy QA testing:
+
+### Features:
+- **Drag & Drop Image Upload** - Easy image selection
+- **Real-time API Status** - Shows if API and model are loaded
+- **Multiple Test Options:**
+  - Test hardcoded image
+  - Upload custom images
+  - Run comprehensive API tests
+- **Interactive Results** - View annotated images with detection details
+- **Confidence Threshold Control** - Adjust detection sensitivity
+- **Responsive Design** - Works on desktop and mobile
+
+### Access:
+1. Start the API: `python3 main.py`
+2. Open browser: `http://localhost:5000`
+3. Use the interface to test detection functionality
+
+### QA Testing Workflow:
+1. **Check API Status** - Verify green "API Online" indicator
+2. **Test Hardcoded Image** - Click "Test Hardcoded Image" button
+3. **Upload Custom Images** - Drag/drop or select motherboard images
+4. **Adjust Confidence** - Use slider to test different thresholds
+5. **Run All Tests** - Comprehensive API endpoint testing
+6. **Review Results** - Check detection accuracy and annotations
+
+## 📡 API Documentation
+
+### Base URL
+```
+http://localhost:5000
+```
+
+### Endpoints
+
+#### 1. **GET /** - API Information
+```bash
+curl http://localhost:5000/
+```
+
+**Response:**
+```json
+{
+  "message": "Memory Module Detection API",
+  "version": "1.0.0",
+  "endpoints": {...},
+  "model_loaded": true,
+  "supported_formats": ["png", "jpg", "jpeg", "gif", "bmp"]
+}
+```
+
+#### 2. **GET /health** - Health Check
+```bash
+curl http://localhost:5000/health
+```
+
+#### 3. **POST /detect** - Upload Image Detection
+```bash
+curl -X POST -F "image=@motherboard.png" -F "confidence=0.5" http://localhost:5000/detect
+```
+
+**Response:**
+```json
+{
+  "success": true,
+  "detections": [
+    {
+      "bbox": [100, 150, 200, 250],
+      "confidence": 0.85,
+      "class": 0,
+      "class_name": "memory_module"
+    }
+  ],
+  "num_detections": 1,
+  "annotated_image": "base64_encoded_image...",
+  "confidence_threshold": 0.5
+}
+```
+
+#### 4. **GET /detect/hardcoded** - Test with Hardcoded Image
+```bash
+curl "http://localhost:5000/detect/hardcoded?confidence=0.5"
+```
+
+#### 5. **POST /detect/base64** - Base64 Image Detection
+```bash
+curl -X POST -H "Content-Type: application/json" \
+  -d '{"image": "base64_string", "confidence": 0.5}' \
+  http://localhost:5000/detect/base64
+```
+
+## 🧪 Testing & Usage Examples
+
+### 1. **Test with Python requests**
+```python
+import requests
+import base64
+
+# Test hardcoded image
+response = requests.get('http://localhost:5000/detect/hardcoded')
+result = response.json()
+print(f"Found {result['num_detections']} memory modules")
+
+# Upload image
+with open('test_image.png', 'rb') as f:
+    files = {'image': f}
+    response = requests.post('http://localhost:5000/detect', files=files)
+    result = response.json()
+```
+
+### 2. **Test with curl**
+```bash
+# Basic detection
+curl -X POST -F "image=@training/memory/out1.png" http://localhost:5000/detect
+
+# With custom confidence
+curl -X POST -F "image=@training/memory/out1.png" -F "confidence=0.3" http://localhost:5000/detect
+```
+
+### 3. **Command Line Inference**
+```bash
+# Test single image
+python3 inference_utils.py --image training/memory/out1.png --conf 0.5
+
+# Validate trained model
+python3 train.py --validate --model runs/detect/memory_module_detection/weights/best.pt
+```
+
+## 📊 Training Details
+
+### Dataset Statistics
+- **Total Images:** 40 (20 with memory, 20 without)
+- **Training Split:** 32 images (80%)
+- **Validation Split:** 8 images (20%)
+- **Classes:** 1 (memory_module)
+- **Annotation Format:** YOLO (normalized coordinates)
+
+### Training Configuration
+```python
+# Default training parameters
+epochs = 100
+batch_size = 16
+image_size = 640
+confidence_threshold = 0.5
+iou_threshold = 0.45
+```
+
+### Expected Training Time
+- **GPU (RTX 3060+):** 5-10 minutes
+- **CPU (Modern):** 30-60 minutes
+- **Memory Usage:** 2-4GB RAM
+
+### Model Performance
+After training, you should see:
+- **mAP50:** >0.8 (80%+ accuracy at 50% IoU)
+- **Precision:** >0.85
+- **Recall:** >0.80
+
+## 🐛 Troubleshooting
+
+### Common Issues
+
+#### 1. **Model Not Found Error**
+```
+Error: Model not found at runs/detect/memory_module_detection/weights/best.pt
+```
+**Solution:** Train the model first
+```bash
+python3 train.py
+```
+
+#### 2. **CUDA Out of Memory**
+```
+RuntimeError: CUDA out of memory
+```
+**Solutions:**
+- Reduce batch size: `python3 train.py --batch 8`
+- Use CPU: `python3 train.py --device cpu`
+- Close other GPU applications
+
+#### 3. **Import Error: ultralytics**
+```
+ModuleNotFoundError: No module named 'ultralytics'
+```
+**Solution:**
+```bash
+pip install ultralytics
+```
+
+#### 4. **Flask Port Already in Use**
+```
+OSError: [Errno 48] Address already in use
+```
+**Solution:**
+```bash
+# Kill process using port 5000
+lsof -ti:5000 | xargs kill -9
+
+# Or use different port
+python3 main.py  # Edit main.py to change port
+```
+
+#### 5. **Low Detection Accuracy**
+**Solutions:**
+- Increase training epochs: `python3 train.py --epochs 200`
+- Lower confidence threshold: `confidence=0.3`
+- Check image quality and lighting
+- Verify annotations are correct
+
+### Performance Optimization
+
+#### For Better Accuracy:
+1. **More Training Data:** Add more annotated images
+2. **Data Augmentation:** Already included in YOLOv8
+3. **Hyperparameter Tuning:** Adjust learning rate, batch size
+4. **Model Size:** Use YOLOv8s or YOLOv8m for better accuracy
+
+#### For Faster Inference:
+1. **Model Quantization:** Convert to TensorRT or ONNX
+2. **Batch Processing:** Process multiple images together
+3. **Image Resizing:** Use smaller input size (320x320)
+
+## 📁 File Descriptions
+
+- **`main.py`** - Flask API with all endpoints
+- **`train.py`** - YOLOv8 training script with validation
+- **`inference_utils.py`** - Detection utilities and visualization
+- **`prepare_dataset.py`** - Dataset preparation and splitting
+- **`requirements.txt`** - Python dependencies
+- **`dataset.yaml`** - YOLO dataset configuration
+
+## 🔮 Future Enhancements
+
+1. **Video Processing:** Add video upload and processing endpoints
+2. **Model Ensemble:** Combine multiple models for better accuracy
+3. **Real-time Streaming:** WebSocket support for live camera feeds
+4. **Database Integration:** Store detection results and statistics
+5. **Web Interface:** HTML frontend for easier testing
+6. **Docker Deployment:** Containerized deployment
+7. **Model Versioning:** Support multiple model versions
+8. **Batch Processing:** Process multiple images simultaneously
+
+## 📄 License
+
+This project is for educational and training purposes.
+
+## 🤝 Contributing
+
+This is a toy project for training purposes. Feel free to experiment and improve!