2025-07-11 20:07:36 +01:00
# DS Task Recycling Project - Memory Module Detection
2025-03-19 02:24:35 +06:00
This project is a complete implementation of a Flask API that processes motherboard images and detects memory modules using YOLOv8. The API returns annotated images with bounding boxes drawn around each detected memory module.
## 🚀 Quick Start
### 1. Install Dependencies
``` bash
pip install -r requirements.txt
```
### 2. Train the Model
``` bash
python3 train.py --epochs 100 --batch 16
```
### 3. Start the API
``` bash
python3 main.py
```
### 4. Test the API
``` bash
# Option 1: Use the Web Interface (Recommended for QA)
# Open browser and go to: http://localhost:5000
# Option 2: Use command line
# Test with hardcoded image
curl http://localhost:5000/detect/hardcoded
# Upload an image
curl -X POST -F "image=@your_image.png" http://localhost:5000/detect
# Option 3: Run automated tests
python3 test_api.py
```
## 📋 Project Overview
- **Algorithm Used:** YOLOv8 Nano (ultralytics)
- **Input Types:**
  - Image upload via Flask API
  - Base64 encoded images
  - Hardcoded test image
- **Dataset:** 40 images (20 with memory modules, 20 without)
- **Output:** Annotated images with bounding boxes and confidence scores
## 🏗️ Project Structure
```
ds_task_recycling_project/
├── main.py # Flask API application
├── train.py # YOLOv8 training script
├── inference_utils.py # Detection and visualization utilities
├── prepare_dataset.py # Dataset preparation script
├── test_api.py # API testing script
├── setup.py # Automated setup script
├── requirements.txt # Python dependencies
├── dataset.yaml # YOLO dataset configuration
├── templates/ # Frontend templates
│ └── index.html # QA testing web interface
├── static/ # Frontend assets
│ ├── style.css # Styling for web interface
│ └── script.js # JavaScript for web interface
├── training/ # Dataset directory
│ ├── memory/ # Images with memory modules + labels
│ ├── no_memory/ # Images without memory modules
│ ├── train/ # Training split (80%)
│ └── val/ # Validation split (20%)
└── runs/ # Training outputs (created after training)
    └── detect/
        └── memory_module_detection/
            └── weights/
                ├── best.pt # Best model weights
                └── last.pt # Last epoch weights
```
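The 80/20 split shown for `training/train/` and `training/val/` can be sketched as below. This is a minimal illustration, not the contents of `prepare_dataset.py`: the function name `split_dataset`, the fixed seed, and the file names are assumptions, and the real script may split per class rather than over the whole pool.

```python
import random

def split_dataset(filenames, val_fraction=0.2, seed=42):
    """Shuffle file names reproducibly and return (train, val) lists."""
    files = sorted(filenames)           # stable starting order
    random.Random(seed).shuffle(files)  # same shuffle on every run
    n_val = round(len(files) * val_fraction)
    return files[n_val:], files[:n_val]

# Hypothetical file names, for illustration only
images = [f"img_{i:02d}.png" for i in range(40)]
train_files, val_files = split_dataset(images)
print(len(train_files), len(val_files))  # 32 8
```

A fixed seed keeps the split identical between runs, which makes validation metrics comparable across training experiments.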
## 🤖 Algorithm Choice & Technical Decisions
### 1. **Algorithm Choice: YOLOv8 Nano**
**Why YOLOv8?**
- **Strong accuracy/speed trade-off:** A recent, well-supported release in the YOLO family
- **Real-time inference:** Fast detection suitable for API deployment
- **Pre-trained weights:** Transfer learning from COCO dataset
- **Easy integration:** Excellent Python API via ultralytics
- **Small model size:** Nano version balances accuracy and speed
**Advantages:**
- Single-stage detector (faster than R-CNN family)
- Excellent small object detection (important for memory modules)
- Built-in data augmentation and training optimizations
- Active community and regular updates
### 2. **Hardware Considerations**
**CPU vs GPU Impact:**
**Training:**
- **GPU Recommended:** Training on 40 images takes ~5-10 minutes on GPU vs 30-60 minutes on CPU
- **Memory Requirements:** 4GB+ GPU memory recommended
- **Fallback:** CPU training works but is significantly slower
**Inference:**
- **CPU Sufficient:** Real-time inference possible on modern CPUs
- **GPU Advantage:** Batch processing and video streams benefit from GPU
- **Edge Deployment:** The model can run on CPU-only edge devices
**Implementation:**
``` python
# Auto-detection in train.py
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
```
### 3. **Video Input Approach**
**For video processing, the approach would be:**
1. **Frame Extraction:** Extract frames at regular intervals
2. **Batch Processing:** Process multiple frames simultaneously on GPU
3. **Temporal Consistency:** Apply tracking algorithms (DeepSORT, ByteTrack)
4. **Optimization:** Skip frames with no changes, use optical flow
5. **Output:** Annotated video with consistent object IDs
**Implementation Strategy:**
``` python
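# Sketch of video processing. `detector`, `tracker`, and `draw_tracked_objects`
# are placeholders for the project's detector, a tracker such as DeepSORT or
# ByteTrack, and a drawing helper; none of them is implemented here.
import cv2

def process_video(video_path, detector, tracker, draw_tracked_objects):
    cap = cv2.VideoCapture(video_path)
    try:
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:  # end of stream or read failure
                break
            detections = detector.detect_from_array(frame)
            tracked_objects = tracker.update(detections)
            yield draw_tracked_objects(frame, tracked_objects)
    finally:
        cap.release()  # free the video handle even if iteration stops early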
```
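The frame extraction step (processing frames at regular intervals rather than every frame) comes down to choosing which frame indices to read. The sketch below shows that arithmetic only; `sampled_frame_indices` and its parameters are hypothetical names and are not part of this project.

```python
def sampled_frame_indices(total_frames, fps, every_seconds=1.0):
    """Return the indices of the frames to process, one per interval."""
    # Never use a step below 1, otherwise range() would fail or stall
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))

# A 10-second clip at 30 fps, sampled twice per second
indices = sampled_frame_indices(total_frames=300, fps=30, every_seconds=0.5)
print(len(indices), indices[:4])  # 20 [0, 15, 30, 45]
```

Sampling two frames per second cuts this clip from 300 detector calls to 20, which is usually the cheapest optimization to try before adding tracking or optical flow.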
## 🔧 Installation & Setup
### Prerequisites
- Python 3.8+
- pip or conda
### Step-by-Step Installation
1. **Clone/Download the project**
``` bash
cd ds_task_recycling_project
```
2. **Install dependencies**
``` bash
pip install -r requirements.txt
```
3. **Prepare dataset (if not already done)**
``` bash
python3 prepare_dataset.py
```
4. **Train the model**
``` bash
# Basic training (recommended)
python3 train.py
# Custom training parameters
python3 train.py --epochs 150 --batch 8 --device cuda
```
5. **Start the Flask API**
``` bash
python3 main.py
```
The API will be available at `http://localhost:5000`
## 🌐 Web Interface for QA Testing
We've included a comprehensive web interface for easy QA testing:
### Features:
- **Drag & Drop Image Upload** - Easy image selection
- **Real-time API Status** - Shows if API and model are loaded
- **Multiple Test Options:**
  - Test hardcoded image
  - Upload custom images
  - Run comprehensive API tests
- **Interactive Results** - View annotated images with detection details
- **Confidence Threshold Control** - Adjust detection sensitivity
- **Responsive Design** - Works on desktop and mobile
### Access:
1. Start the API: `python3 main.py`
2. Open browser: `http://localhost:5000`
3. Use the interface to test detection functionality
### QA Testing Workflow:
1. **Check API Status** - Verify green "API Online" indicator
2. **Test Hardcoded Image** - Click "Test Hardcoded Image" button
3. **Upload Custom Images** - Drag/drop or select motherboard images
4. **Adjust Confidence** - Use slider to test different thresholds
5. **Run All Tests** - Comprehensive API endpoint testing
6. **Review Results** - Check detection accuracy and annotations
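To see what adjusting the confidence slider does, it helps to picture the threshold as a simple filter over the detector's candidate boxes. The sketch below is an illustration under assumptions: the sample detections are made up, and whether the server compares with `>=` or `>` is not stated in this README.

```python
def filter_by_confidence(detections, threshold):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]

# Made-up detections, for illustration only
sample = [
    {"bbox": [100, 150, 200, 250], "confidence": 0.85},
    {"bbox": [300, 120, 380, 210], "confidence": 0.42},
    {"bbox": [50, 60, 90, 110], "confidence": 0.31},
]

for threshold in (0.3, 0.5, 0.9):
    kept = filter_by_confidence(sample, threshold)
    print(threshold, len(kept))  # 0.3 -> 3, 0.5 -> 1, 0.9 -> 0
```

Lower thresholds surface more boxes, including weaker and possibly false ones; higher thresholds trade missed modules for fewer false positives.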
## 📡 API Documentation
### Base URL
```
http://localhost:5000
```
### Endpoints
#### 1. **GET /** - API Information
``` bash
curl http://localhost:5000/
```
**Response:**
``` json
{
  "message": "Memory Module Detection API",
  "version": "1.0.0",
  "endpoints": { ... },
  "model_loaded": true,
  "supported_formats": ["png", "jpg", "jpeg", "gif", "bmp"]
}
```
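A client can use the `supported_formats` list to reject unsupported files before uploading. The following is a minimal sketch of such a check; `is_supported_image` is a hypothetical helper, and it inspects only the file extension, not the file contents.

```python
from pathlib import Path

# Mirrors the "supported_formats" field in the response above
SUPPORTED_FORMATS = {"png", "jpg", "jpeg", "gif", "bmp"}

def is_supported_image(filename):
    """Return True if the file extension is in the supported list."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_FORMATS

print(is_supported_image("motherboard.PNG"))   # True
print(is_supported_image("motherboard.tiff"))  # False
```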
#### 2. **GET /health** - Health Check
``` bash
curl http://localhost:5000/health
```
#### 3. **POST /detect** - Upload Image Detection
``` bash
curl -X POST -F "image=@motherboard.png" -F "confidence=0.5" http://localhost:5000/detect
```
**Response:**
``` json
{
  "success": true,
  "detections": [
    {
      "bbox": [100, 150, 200, 250],
      "confidence": 0.85,
      "class": 0,
      "class_name": "memory_module"
    }
  ],
  "num_detections": 1,
  "annotated_image": "base64_encoded_image...",
  "confidence_threshold": 0.5
}
```
```
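A response like the one above can be consumed as sketched below. The helper names are hypothetical, and the code assumes `annotated_image` holds plain base64 (no `data:image/...` prefix), which this README does not confirm. The demo uses a stand-in dictionary rather than a live API call.

```python
import base64

def summarize_detections(result):
    """Return (number of detections, highest confidence or None)."""
    detections = result.get("detections", [])
    best = max((d["confidence"] for d in detections), default=None)
    return len(detections), best

def save_annotated_image(result, out_path):
    """Decode the base64 `annotated_image` field and write it to disk."""
    data = base64.b64decode(result["annotated_image"])
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)  # bytes written

# Stand-in for response.json(); the image payload is elided here
example = {
    "success": True,
    "detections": [{"bbox": [100, 150, 200, 250], "confidence": 0.85}],
    "num_detections": 1,
}
print(summarize_detections(example))  # (1, 0.85)
```

With a real response, `save_annotated_image(result, "annotated.png")` would write the annotated picture next to the script.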
#### 4. **GET /detect/hardcoded** - Test with Hardcoded Image
``` bash
curl "http://localhost:5000/detect/hardcoded?confidence=0.5"
```
#### 5. **POST /detect/base64** - Base64 Image Detection
``` bash
curl -X POST -H "Content-Type: application/json" \
-d '{"image": "base64_string", "confidence": 0.5}' \
http://localhost:5000/detect/base64
```
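The `base64_string` placeholder in the request above has to be produced from the image bytes. A minimal sketch of building that JSON body follows; `build_base64_payload` is a hypothetical helper, and it assumes the endpoint expects plain base64 without a data-URI prefix.

```python
import base64
import json

def build_base64_payload(image_bytes, confidence=0.5):
    """Build the JSON body for POST /detect/base64 from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded, "confidence": confidence})

# Placeholder bytes; in practice read them with open(path, "rb").read()
payload = build_base64_payload(b"\x89PNG placeholder bytes", confidence=0.5)
print(json.loads(payload)["confidence"])  # 0.5
```

The resulting string can be sent as the request body with the `Content-Type: application/json` header, exactly as in the curl example.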
## 🧪 Testing & Usage Examples
### 1. **Test with Python requests**
``` python
import requests

# Test the hardcoded image
response = requests.get('http://localhost:5000/detect/hardcoded')
result = response.json()
print(f"Found {result['num_detections']} memory modules")

# Upload an image
with open('test_image.png', 'rb') as f:
    files = {'image': f}
    response = requests.post('http://localhost:5000/detect', files=files)
result = response.json()
```
### 2. **Test with curl**
``` bash
# Basic detection
curl -X POST -F "image=@training/memory/out1.png" http://localhost:5000/detect
# With custom confidence
curl -X POST -F "image=@training/memory/out1.png" -F "confidence=0.3" http://localhost:5000/detect
```
### 3. **Command Line Inference**
``` bash
# Test single image
python3 inference_utils.py --image training/memory/out1.png --conf 0.5
# Validate trained model
python3 train.py --validate --model runs/detect/memory_module_detection/weights/best.pt
```
## 📊 Training Details
### Dataset Statistics
- **Total Images:** 40 (20 with memory, 20 without)
- **Training Split:** 32 images (80%)
- **Validation Split:** 8 images (20%)
- **Classes:** 1 (memory_module)
- **Annotation Format:** YOLO (normalized coordinates)
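In the YOLO annotation format, each label line is `class x_center y_center width height`, with the four coordinates normalized to the range 0-1 relative to the image size. The sketch below converts one such line to pixel corner coordinates; the function name and the sample values are illustrative and not taken from this dataset.

```python
def yolo_to_pixel_box(label_line, img_width, img_height):
    """Convert a normalized YOLO label line to (class_id, x1, y1, x2, y2)."""
    parts = label_line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # Center/size form to corner form, then scale to pixels
    x1 = (xc - w / 2) * img_width
    y1 = (yc - h / 2) * img_height
    x2 = (xc + w / 2) * img_width
    y2 = (yc + h / 2) * img_height
    return class_id, round(x1), round(y1), round(x2), round(y2)

# A box centered in a 640x480 image, a quarter wide and half as tall
print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, 240, 120, 400, 360)
```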
### Training Configuration
``` python
# Default training parameters
epochs = 100
batch_size = 16
image_size = 640
confidence_threshold = 0.5
iou_threshold = 0.45
```
### Expected Training Time
- **GPU (RTX 3060+):** 5-10 minutes
- **CPU (Modern):** 30-60 minutes
- **Memory Usage:** 2-4GB RAM
### Model Performance
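After training, you should see roughly the following on the validation split (with only 8 validation images, these metrics can shift noticeably between runs):
- **mAP50:** >0.8 (mean average precision at an IoU threshold of 0.5)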
- **Precision:** >0.85
- **Recall:** >0.80
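The metrics above rest on intersection over union (IoU): a predicted box counts as a match when its IoU with a ground-truth box reaches the threshold (0.5 for mAP50). The sketch below shows the calculation for boxes in `(x1, y1, x2, y2)` form, plus precision and recall from match counts. The box values are made up, and this is not the evaluation code ultralytics runs internally.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 when boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

predicted = (100, 150, 200, 250)
ground_truth = (110, 160, 210, 260)
print(round(iou(predicted, ground_truth), 3))  # 0.681, a match at IoU 0.5
```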
## 🐛 Troubleshooting
### Common Issues
#### 1. **Model Not Found Error**
```
Error: Model not found at runs/detect/memory_module_detection/weights/best.pt
```
**Solution:** Train the model first
``` bash
python3 train.py
```
#### 2. **CUDA Out of Memory**
```
RuntimeError: CUDA out of memory
```
**Solutions:**
- Reduce batch size: `python3 train.py --batch 8`
- Use CPU: `python3 train.py --device cpu`
- Close other GPU applications
#### 3. **Import Error: ultralytics**
```
ModuleNotFoundError: No module named 'ultralytics'
```
**Solution:**
``` bash
pip install ultralytics
```
#### 4. **Flask Port Already in Use**
```
OSError: [Errno 48] Address already in use
```
**Solution:**
``` bash
# Kill process using port 5000
lsof -ti:5000 | xargs kill -9
# Or use different port
python3 main.py # Edit main.py to change port
```
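Before killing anything, it can be useful to confirm that the port really is occupied. The sketch below uses only the standard library; `is_port_free` is a hypothetical helper, and it only checks whether something accepts TCP connections on the local address.

```python
import socket

def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success, an error code otherwise
        return sock.connect_ex((host, port)) != 0

print(is_port_free(5000))
```

If this prints `False` while the API is not running, another process holds port 5000 and one of the two solutions above applies.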
#### 5. **Low Detection Accuracy**
**Solutions:**
- Increase training epochs: `python3 train.py --epochs 200`
- Lower confidence threshold: `confidence=0.3`
- Check image quality and lighting
- Verify annotations are correct
### Performance Optimization
#### For Better Accuracy:
1. **More Training Data:** Add more annotated images
2. **Data Augmentation:** Already included in YOLOv8
3. **Hyperparameter Tuning:** Adjust learning rate, batch size
4. **Model Size:** Use YOLOv8s or YOLOv8m for better accuracy
#### For Faster Inference:
1. **Model Export and Quantization:** Export to ONNX or TensorRT, optionally at reduced precision (FP16 or INT8)
2. **Batch Processing:** Process multiple images together
3. **Image Resizing:** Use a smaller input size (320x320)
## 📁 File Descriptions
- **`main.py`** - Flask API with all endpoints
- **`train.py`** - YOLOv8 training script with validation
- **`inference_utils.py`** - Detection utilities and visualization
- **`prepare_dataset.py`** - Dataset preparation and splitting
- **`requirements.txt`** - Python dependencies
- **`dataset.yaml`** - YOLO dataset configuration
## 🔮 Future Enhancements
1. **Video Processing:** Add video upload and processing endpoints
2. **Model Ensemble:** Combine multiple models for better accuracy
3. **Real-time Streaming:** WebSocket support for live camera feeds
4. **Database Integration:** Store detection results and statistics
5. **Web Interface:** HTML frontend for easier testing
6. **Docker Deployment:** Containerized deployment
7. **Model Versioning:** Support multiple model versions
8. **Batch Processing:** Process multiple images simultaneously
## 📄 License
This project is for educational and training purposes.
## 🤝 Contributing
This is a toy project for training purposes. Feel free to experiment and improve!