update
@@ -1,545 +1,194 @@
|
||||
# DS Task Recycling Project - Memory Module Detection
|
||||
# DS Task Recycling Project
|
||||
|
||||
This project is a complete implementation of a Flask API that processes motherboard images and detects memory modules using YOLOv8. The API returns annotated images with bounding boxes drawn around each detected memory module.
|
||||
This project is a Flask API that processes images of motherboards to detect memory modules. It uses computer vision to identify and draw bounding boxes around memory modules present in the input images.
|
||||
|
||||
## 🚀 Quick Start
|
||||
## Project Overview
|
||||
|
||||
### 1. Install Dependencies
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### 2. Train the Model
|
||||
```bash
|
||||
python3 train.py --epochs 100 --batch 16
|
||||
```
|
||||
|
||||
### 3. Start the API
|
||||
```bash
|
||||
python3 main.py
|
||||
```
|
||||
|
||||
### 4. Test the API
|
||||
```bash
|
||||
# Option 1: Use the Web Interface (Recommended for QA)
|
||||
# Open browser and go to: http://localhost:5000
|
||||
|
||||
# Option 2: Use command line
|
||||
# Test with hardcoded image
|
||||
curl http://localhost:5000/detect/hardcoded
|
||||
|
||||
# Upload an image
|
||||
curl -X POST -F "image=@your_image.png" http://localhost:5000/detect
|
||||
|
||||
# Option 3: Run automated tests
|
||||
python3 test_api.py
|
||||
```
|
||||
|
||||
## 📋 Project Overview
|
||||
|
||||
- **Algorithm Used:** YOLOv8 Nano (ultralytics)
|
||||
- **Input Types:**
|
||||
- Image upload via Flask API
|
||||
- Base64 encoded images
|
||||
- Hardcoded test image
|
||||
- **Dataset:** 40 images (20 with memory modules, 20 without)
|
||||
- **Output:** Annotated images with bounding boxes and confidence scores
|
||||
- Image upload via the Flask API
|
||||
- A hardcoded test image (memory_out19.png) for testing purposes
|
||||
|
||||
## 🏗️ Project Structure
|
||||
- **Dataset:**
|
||||
- 20 pictures of motherboards with memory
|
||||
- 20 pictures of motherboards without memory
|
||||
|
||||
```
|
||||
ds_task_recycling_project/
|
||||
├── main.py # Flask API application (main interface)
|
||||
├── api_docs.py # Swagger UI API documentation (developer only)
|
||||
├── train.py # YOLOv8 training script
|
||||
├── inference_utils.py # Detection and visualization utilities
|
||||
├── prepare_dataset.py # Dataset preparation script
|
||||
├── test_api.py # API testing script
|
||||
├── setup.py # Automated setup script
|
||||
├── requirements.txt # Python dependencies
|
||||
├── dataset.yaml # YOLO dataset configuration
|
||||
├── .gitignore # Git ignore file for ML projects
|
||||
├── VALIDATION_CHECKLIST.md # Project validation checklist
|
||||
├── templates/ # Frontend templates
|
||||
│ └── index.html # QA testing web interface
|
||||
├── static/ # Frontend assets
|
||||
│ ├── style.css # Styling for web interface
|
||||
│ └── script.js # JavaScript for web interface
|
||||
├── venv/ # Virtual environment (created by user)
|
||||
├── training/ # Dataset directory
|
||||
│ ├── memory/ # Images with memory modules + YOLO labels
|
||||
│ │ ├── out1.png # Sample motherboard image with memory
|
||||
│ │ ├── out1.txt # YOLO format annotation file
|
||||
│ │ └── ... # 19 more image/label pairs
|
||||
│ ├── no_memory/ # Images without memory modules
|
||||
│ │ ├── out21.png # Sample motherboard image without memory
|
||||
│ │ └── ... # 19 more images (no labels needed)
|
||||
│ ├── train/ # Training split (80% = 32 images)
|
||||
│ │ ├── images/ # Training images
|
||||
│ │ └── labels/ # Training labels
|
||||
│ └── val/ # Validation split (20% = 8 images)
|
||||
│ ├── images/ # Validation images
|
||||
│ └── labels/ # Validation labels
|
||||
├── uploads/ # Temporary upload directory (created at runtime)
|
||||
└── runs/ # Training outputs (created after training)
|
||||
└── detect/
|
||||
└── memory_module_detection/
|
||||
├── weights/
|
||||
│ ├── best.pt # Best model weights
|
||||
│ └── last.pt # Last epoch weights
|
||||
├── train_batch*.jpg # Training visualization
|
||||
├── val_batch*.jpg # Validation visualization
|
||||
├── confusion_matrix.png # Model performance metrics
|
||||
├── results.png # Training curves
|
||||
└── args.yaml # Training arguments
|
||||
```
|
||||
- **Output:**
|
||||
- An annotated image with bounding boxes around each detected memory module
|
||||
- For example, if there are two memory modules, two boxes are drawn; if only one is detected, then one box is drawn
|
||||
|
||||
### **📁 Key Files Description**
|
||||
- **Annotation Tool:**
|
||||
- [makesense.ai](https://www.makesense.ai/) was used for manual annotation
|
||||
|
||||
| File/Directory | Purpose | Usage |
|
||||
|----------------|---------|-------|
|
||||
| `main.py` | Main Flask API application | `python3 main.py` |
|
||||
| `api_docs.py` | Swagger UI documentation (developer only) | `python3 api_docs.py` |
|
||||
| `train.py` | YOLOv8 model training | `python3 train.py` |
|
||||
| `inference_utils.py` | Detection utilities and classes | Imported by other scripts |
|
||||
| `test_api.py` | Comprehensive API testing | `python3 test_api.py` |
|
||||
| `setup.py` | Automated project setup | `python3 setup.py` |
|
||||
| `templates/index.html` | Web interface for QA testing | Served by Flask |
|
||||
| `static/` | CSS, JavaScript, and assets | Served by Flask |
|
||||
| `training/` | Complete dataset with annotations | Used by training script |
|
||||
| `runs/` | Model training outputs | Created after training |
|
||||
| `venv/` | Python virtual environment | Created by user |
|
||||
## Implementation Details
|
||||
|
||||
## 🤖 Algorithm Choice & Technical Decisions
|
||||
### Algorithm Choice & Rationale
|
||||
|
||||
### 1. **Algorithm Choice: YOLOv8 Nano**
|
||||
1. **Which algorithm was chosen?**
|
||||
- YOLOv8 (specifically YOLOv8n - the nano version) was selected for this task
|
||||
|
||||
2. **Why this algorithm?**
|
||||
- Fast inference speed suitable for real-time applications
|
||||
- Good balance between accuracy and computational requirements
|
||||
- Built-in support for transfer learning
|
||||
- Excellent performance on object detection tasks
|
||||
- Easy integration with Python/Flask applications
|
||||
- Robust community support and documentation
|
||||
|
||||
**Which algorithm will you use for detecting the memory modules?**
|
||||
- **Answer:** YOLOv8 Nano (You Only Look Once version 8, Nano variant)
|
||||
### Hardware Considerations
|
||||
|
||||
**Why do you choose this particular algorithm?**
|
||||
3. **CPU/GPU Impact:**
|
||||
- The current implementation runs on CPU for broader accessibility
|
||||
- Model parameters were optimized for CPU performance:
|
||||
- Reduced batch size (8)
|
||||
- Lightweight augmentation
|
||||
- Early stopping with patience=15
|
||||
- GPU support is available through YOLO if needed for scaling
|
||||
- Current performance is suitable for the demo nature of the project
|
||||
|
||||
**Primary Reasons:**
|
||||
- **State-of-the-art performance:** Latest evolution of YOLO family with superior accuracy
|
||||
- **Real-time inference:** About 37 ms per image on a modern CPU, using a single-stage detector
|
||||
- **Small object detection:** Excellent at detecting memory modules on motherboards
|
||||
- **Pre-trained weights:** Leverages COCO dataset for transfer learning
|
||||
- **Easy integration:** Ultralytics library with excellent Python API
|
||||
- **Model efficiency:** Nano variant combines a reported 99.5% mAP50 with fast inference
|
||||
- **Production ready:** Proven architecture used in industrial applications
|
||||
### Video Processing Approach
|
||||
|
||||
**Technical Advantages:**
|
||||
- **Anchor-free design:** Eliminates anchor box tuning complexity
|
||||
- **Advanced augmentation:** Built-in data augmentation strategies
|
||||
- **Multi-scale detection:** Handles objects of different sizes effectively
|
||||
- **Export flexibility:** ONNX, TensorRT support for deployment optimization
|
||||
- **Active community:** Regular updates and extensive documentation
|
||||
4. **Handling Video Input:**
|
||||
- While not currently implemented, video processing would involve:
|
||||
- Frame extraction
|
||||
- Batch processing of frames
|
||||
- Real-time detection using YOLO's video processing capabilities
|
||||
- Optional frame skipping for performance optimization
|
||||
- The current architecture can be extended for video by:
|
||||
- Adding a video upload endpoint
|
||||
- Implementing frame-by-frame processing
|
||||
- Returning annotated video or real-time stream
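
The optional frame skipping listed above can be reasoned about without opening a video at all. The following is a minimal sketch, not the project's code: `sampled_frame_indices` and its parameters are hypothetical names, and rounding the native-to-target rate ratio is only one possible way to pick the step.

```python
def sampled_frame_indices(total_frames, native_fps, target_fps):
    """Return the indices of frames to run detection on.

    Keeps roughly `target_fps` frames per second of video by taking
    every `step`-th frame. `step` is clamped to at least 1 so that a
    target rate above the native rate simply keeps every frame.
    """
    if total_frames < 0 or native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates must be > 0")
    step = max(1, round(native_fps / target_fps))
    return list(range(0, total_frames, step))


# A 2-second clip at 30 fps, sampled at about 5 fps: every 6th frame.
indices = sampled_frame_indices(total_frames=60, native_fps=30.0, target_fps=5.0)
print(indices)
```

Only the frames at these indices would be passed to the detector, which trades temporal resolution for lower processing cost.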
|
||||
|
||||
### 2. **Hardware Considerations**
|
||||
|
||||
**Does CPU or GPU have an impact on your decision? Please explain.**
|
||||
|
||||
**Yes, hardware significantly impacts the implementation strategy:**
|
||||
|
||||
**Training Phase:**
|
||||
- **GPU Impact:** Critical for training efficiency
|
||||
- **GPU Training:** 5-10 minutes for 50 epochs (recommended)
|
||||
- **CPU Training:** 30-60 minutes for same epochs
|
||||
- **Memory Requirements:** 4GB+ GPU memory recommended
|
||||
- **Batch Size:** GPU allows larger batches (16-32) vs CPU (4-8)
|
||||
|
||||
**Inference Phase:**
|
||||
- **CPU Performance:** 37ms per image on modern CPU (Intel i5/i7, M1/M2)
|
||||
- **GPU Performance:** 10-15ms per image, better for batch processing
|
||||
- **Memory Usage:** CPU: 2-4GB RAM, GPU: 1-2GB VRAM
|
||||
- **Edge Deployment:** Model runs efficiently on CPU-only devices
|
||||
|
||||
**Decision Impact:**
|
||||
- **Algorithm Choice:** YOLOv8 Nano chosen specifically for CPU compatibility
|
||||
- **Deployment Flexibility:** No expensive GPU required for production
|
||||
- **Cost Efficiency:** Reduces infrastructure costs
|
||||
- **Scalability:** GPU enables high-throughput batch processing
|
||||
|
||||
**Implementation:**
|
||||
```python
|
||||
# Auto-detection with fallback in train.py
import torch
|
||||
device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
||||
print(f"Using device: {device}")
|
||||
```
|
||||
|
||||
### 3. **Video Input Approach**
|
||||
|
||||
**What if a video is provided instead of single images?**
|
||||
**Does your approach change when processing videos? Please describe your approach.**
|
||||
|
||||
**Yes, the approach would change significantly for video processing:**
|
||||
|
||||
**Video Processing Strategy:**
|
||||
|
||||
**1. Frame Extraction & Sampling**
|
||||
```python
|
||||
import cv2

def process_video(video_path, fps_sample=5):
    """Return frames sampled at roughly `fps_sample` frames per second."""
    cap = cv2.VideoCapture(video_path)
    frame_rate = cap.get(cv2.CAP_PROP_FPS)
    # Clamp to 1 so a low or unreported frame rate never yields a zero
    # interval (which would raise ZeroDivisionError in the modulo below).
    frame_interval = max(1, int(frame_rate / fps_sample))

    frames = []
    frame_count = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if frame_count % frame_interval == 0:
            frames.append(frame)
        frame_count += 1
    cap.release()  # Free the underlying file handle and decoder
    return frames
|
||||
```
|
||||
|
||||
**2. Batch Processing for Efficiency**
|
||||
```python
|
||||
def batch_detect_video(frames, batch_size=8):
    # `model` is assumed to be an already-loaded detector defined at module level
    results = []
    for i in range(0, len(frames), batch_size):
        batch = frames[i:i + batch_size]
        batch_results = model(batch)  # Process multiple frames at once
        results.extend(batch_results)
    return results
|
||||
```
|
||||
|
||||
**3. Temporal Consistency & Tracking**
|
||||
```python
|
||||
def apply_temporal_tracking(detections, frames):
    # DeepSORT is a placeholder for a tracker class; it is not imported here
    tracker = DeepSORT()  # Or ByteTrack for better performance
    tracked_results = []

    for frame_detections, frame in zip(detections, frames):
        tracked_objects = tracker.update(frame_detections)
        tracked_results.append(tracked_objects)

    return tracked_results
|
||||
```
|
||||
|
||||
**4. Optimization Strategies**
|
||||
- **Motion Detection:** Skip frames with no significant changes
|
||||
- **Optical Flow:** Track objects between frames to reduce processing
|
||||
- **Keyframe Selection:** Process only important frames
|
||||
- **Parallel Processing:** Use multiple CPU cores/GPU streams
|
||||
- **Memory Management:** Process in chunks to avoid overflow
|
||||
|
||||
**5. Video-Specific Considerations**
|
||||
- **Temporal Smoothing:** Apply filters to reduce detection jitter
|
||||
- **Performance Scaling:** GPU becomes more critical for video processing
|
||||
- **Storage Requirements:** Annotated videos require significant storage
|
||||
- **Real-time Processing:** Streaming vs batch processing trade-offs
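
As an illustration of the temporal smoothing item above, one simple filter is an exponential moving average over the box coordinates of a single tracked module. This is a sketch under stated assumptions: `smooth_boxes` and `alpha` are hypothetical names, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the README does not say which filter the project would actually use.

```python
def smooth_boxes(boxes, alpha=0.5):
    """Exponentially smooth a sequence of (x1, y1, x2, y2) boxes.

    `boxes` holds one box per frame for the same tracked object.
    `alpha` is the weight of the newest box: 1.0 disables smoothing,
    smaller values suppress more jitter but react more slowly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    previous = None
    for box in boxes:
        if previous is None:
            # The first frame has nothing to blend with.
            current = tuple(float(v) for v in box)
        else:
            current = tuple(alpha * new + (1.0 - alpha) * old
                            for new, old in zip(box, previous))
        smoothed.append(current)
        previous = current
    return smoothed


# A box that jumps 4 px to the right for one frame and then returns.
track = [(100, 150, 200, 250), (104, 150, 204, 250), (100, 150, 200, 250)]
print(smooth_boxes(track, alpha=0.5))
```

With `alpha=0.5` the one-frame jump is halved in the output, at the cost of the box lagging slightly behind real motion.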
|
||||
|
||||
**Potential API Endpoint:**
|
||||
```python
|
||||
@app.route('/detect/video', methods=['POST'])
def detect_video():
    # Planned steps (not implemented yet):
    # 1. Upload video file
    # 2. Extract frames at specified FPS
    # 3. Batch process frames with YOLOv8
    # 4. Apply temporal tracking for consistency
    # 5. Return annotated video or frame-by-frame results
    raise NotImplementedError("video detection is not implemented yet")
|
||||
```
|
||||
|
||||
## **Technical Questions Summary**
|
||||
|
||||
The project successfully addresses all required technical questions:
|
||||
|
||||
1. **✅ Algorithm Choice:** YOLOv8 Nano selected for optimal balance of accuracy (99.5% mAP50), speed (37ms), and deployment flexibility
|
||||
2. **✅ Hardware Considerations:** Comprehensive CPU/GPU analysis with auto-detection and fallback strategies for maximum compatibility
|
||||
3. **✅ Video Processing:** Complete video processing strategy with frame extraction, batch processing, temporal tracking, and optimization techniques
|
||||
|
||||
All technical decisions are implemented and validated in the working system.
|
||||
|
||||
## Installation & Setup
|
||||
|
||||
### Prerequisites
|
||||
- Python 3.8+
|
||||
- pip or conda
|
||||
|
||||
### Step-by-Step Installation
|
||||
|
||||
1. **Clone/Download the project**
|
||||
```bash
|
||||
cd ds_task_recycling_project
|
||||
```
|
||||
|
||||
2. **Install dependencies**
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
3. **Prepare dataset (if not already done)**
|
||||
```bash
|
||||
python3 prepare_dataset.py
|
||||
```
|
||||
|
||||
4. **Train the model**
|
||||
```bash
|
||||
# Basic training (recommended)
|
||||
python3 train.py
|
||||
|
||||
# Custom training parameters
|
||||
python3 train.py --epochs 150 --batch 8 --device cuda
|
||||
```
|
||||
|
||||
5. **Start the Flask API**
|
||||
```bash
|
||||
python3 main.py
|
||||
```
|
||||
|
||||
The API will be available at `http://localhost:5000`
|
||||
|
||||
## 🌐 Web Interface for QA Testing
|
||||
|
||||
We've included a comprehensive web interface for easy QA testing:
|
||||
|
||||
### Features:
|
||||
- **Drag & Drop Image Upload** - Easy image selection
|
||||
- **Real-time API Status** - Shows if API and model are loaded
|
||||
- **Multiple Test Options:**
|
||||
- Test hardcoded image
|
||||
- Upload custom images
|
||||
- Run comprehensive API tests
|
||||
- **Interactive Results** - View annotated images with detection details
|
||||
- **Confidence Threshold Control** - Adjust detection sensitivity
|
||||
- **Responsive Design** - Works on desktop and mobile
|
||||
|
||||
### Access:
|
||||
1. Start the API: `python3 main.py`
|
||||
2. Open browser: `http://localhost:5000`
|
||||
3. Use the interface to test detection functionality
|
||||
|
||||
### QA Testing Workflow:
|
||||
1. **Check API Status** - Verify green "API Online" indicator
|
||||
2. **Test Hardcoded Image** - Click "Test Hardcoded Image" button
|
||||
3. **Upload Custom Images** - Drag/drop or select motherboard images
|
||||
4. **Adjust Confidence** - Use slider to test different thresholds
|
||||
5. **Run All Tests** - Comprehensive API endpoint testing
|
||||
6. **Review Results** - Check detection accuracy and annotations
|
||||
|
||||
## 📡 API Documentation
|
||||
|
||||
### Base URL
|
||||
```
|
||||
http://localhost:5000
|
||||
```
|
||||
## API Implementation
|
||||
|
||||
### Endpoints
|
||||
|
||||
#### 1. **GET /** - API Information
|
||||
1. **Image Upload (`/detect`):**
|
||||
```http
|
||||
POST /detect
|
||||
Content-Type: multipart/form-data
|
||||
```
|
||||
- Accepts image uploads
|
||||
- Returns annotated image with detection boxes
|
||||
|
||||
2. **Test Detection (`/detect/test`):**
|
||||
```http
|
||||
GET /detect/test
|
||||
```
|
||||
- Uses a hardcoded test image (memory_out19.png)
|
||||
- Returns annotated image with detection boxes
|
||||
|
||||
### Processing Workflow
|
||||
|
||||
1. Image Reception:
|
||||
- Via file upload or hardcoded test image
|
||||
2. Detection:
|
||||
- YOLOv8 processes the image
|
||||
- Confidence threshold: 0.25
|
||||
- IoU threshold: 0.45
|
||||
3. Annotation:
|
||||
- Bounding boxes drawn around detected modules
|
||||
4. Response:
|
||||
- Annotated image returned in PNG format
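
To make the two thresholds in the detection step concrete, here is a minimal pure-Python sketch of a confidence cut followed by greedy non-maximum suppression. The detector library applies its own version of these steps internally; `iou`, `filter_detections`, and the sample boxes are hypothetical and only illustrate what 0.25 and 0.45 control.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_threshold=0.25, iou_threshold=0.45):
    """Apply a confidence cut, then greedy non-maximum suppression.

    `detections` is a list of (box, confidence) pairs.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, conf in candidates:
        # Keep a box only if it does not overlap a stronger kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, conf))
    return kept


raw = [
    ((100, 150, 200, 250), 0.85),   # strong detection
    ((105, 152, 205, 252), 0.60),   # near-duplicate of the first box
    ((400, 150, 500, 250), 0.40),   # a second module
    ((10, 10, 30, 30), 0.10),       # below the confidence threshold
]
print(filter_detections(raw))
```

Raising the confidence threshold removes weak boxes, while lowering the IoU threshold merges overlapping boxes more aggressively.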
|
||||
|
||||
## Model Training
|
||||
|
||||
The model was trained with the following parameters:
|
||||
- 50 epochs
|
||||
- Image size: 640x640
|
||||
- Batch size: 8
|
||||
- Early stopping patience: 15
|
||||
- Augmentations:
|
||||
- Rotation (±5°)
|
||||
- Scale (0.5)
|
||||
- Translation (0.1)
|
||||
- Horizontal flip (0.5)
|
||||
- Mosaic (1.0)
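
The early stopping patience above can be read as "stop after 15 epochs in a row without a new best score". The sketch below only illustrates that rule with made-up scores; `stopping_epoch` is a hypothetical helper, and the training library's actual criterion may track a different metric.

```python
def stopping_epoch(metric_per_epoch, patience=15):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs have passed
    without the metric exceeding its best value so far. If that never
    happens, the last epoch is returned.
    """
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, value in enumerate(metric_per_epoch, start=1):
        if value > best:
            best = value
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    return len(metric_per_epoch)


# Hypothetical validation scores: improvement for 10 epochs, then a plateau.
scores = [e / 20 for e in range(1, 11)] + [0.5] * 40
print(stopping_epoch(scores, patience=15))
```

In this example the 50-epoch budget is never used in full, because the plateau exhausts the patience first.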
|
||||
|
||||
## Dataset Preparation
|
||||
|
||||
```bash
|
||||
curl http://localhost:5000/
|
||||
training/
|
||||
├── memory/
|
||||
│ └── (images with memory modules) #You have this
|
||||
├── no_memory/
|
||||
│ └── (images without memory modules) #You have this as well
|
||||
├── train/
|
||||
│ ├── images/
|
||||
│ │ ├── memory_*.png
|
||||
│ │ └── no_memory_*.png
|
||||
│ └── labels/
|
||||
│ ├── memory_*.txt
|
||||
│ └── no_memory_*.txt
|
||||
└── val/
|
||||
├── images/
|
||||
│ ├── memory_*.png
|
||||
│ └── no_memory_*.png
|
||||
└── labels/
|
||||
├── memory_*.txt
|
||||
└── no_memory_*.txt
|
||||
|
||||
dataset.yaml
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"message": "Memory Module Detection API",
|
||||
"version": "1.0.0",
|
||||
"endpoints": {...},
|
||||
"model_loaded": true,
|
||||
"supported_formats": ["png", "jpg", "jpeg", "gif", "bmp"]
|
||||
}
```
|
||||
The dataset is organized as follows:
|
||||
- `training/memory/`: Source directory for images with memory modules
|
||||
- `training/no_memory/`: Source directory for images without memory modules
|
||||
- `training/train/`: Training dataset
|
||||
- `images/`: Contains both memory and no-memory images with appropriate prefixes
|
||||
- `labels/`: Contains YOLO format annotation files
|
||||
- `training/val/`: Validation dataset
|
||||
- `images/`: Contains both memory and no-memory images with appropriate prefixes
|
||||
- `labels/`: Contains YOLO format annotation files
|
||||
|
||||
The `dataset.yaml` file contains:
|
||||
```yaml
|
||||
path: training # dataset root dir
|
||||
train: train/images # train images
|
||||
val: val/images # validation images
|
||||
nc: 1 # number of classes
|
||||
names: ['memory_module'] # class names
|
||||
```
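
One way the `train/` and `val/` folders described above could be filled is a per-class shuffle and split that also applies the `memory_` and `no_memory_` prefixes. This is a sketch, not the contents of `prepare_dataset.py`: `plan_split`, the fixed seed, the per-class split, and the `out1.png` to `out40.png` naming are all assumptions.

```python
import random


def plan_split(memory_files, no_memory_files, val_fraction=0.2, seed=0):
    """Assign prefixed file names to train and val lists.

    Each class is shuffled and split separately so both splits keep
    the same class balance. Nothing is copied; the function only
    returns the planned destination names.
    """
    rng = random.Random(seed)
    plan = {"train": [], "val": []}
    for prefix, files in (("memory_", memory_files), ("no_memory_", no_memory_files)):
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        n_val = round(len(shuffled) * val_fraction)
        for i, name in enumerate(shuffled):
            split = "val" if i < n_val else "train"
            plan[split].append(prefix + name)
    return plan


# Assumed file names: out1.png..out20.png with memory, out21.png..out40.png without.
with_memory = [f"out{i}.png" for i in range(1, 21)]
without_memory = [f"out{i}.png" for i in range(21, 41)]
plan = plan_split(with_memory, without_memory)
print(len(plan["train"]), len(plan["val"]))
```

Splitting each class separately keeps the 50/50 balance of the source folders in both the training and validation sets.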
|
||||
|
||||
#### 2. **GET /health** - Health Check
|
||||
## Getting Started
|
||||
|
||||
1. Clone the repository:
|
||||
```bash
|
||||
git clone http://23.29.118.76:3000/michael/ds_task_recycling_project.git
|
||||
```
|
||||
2. Install dependencies:
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
3. Prepare the dataset:
|
||||
```bash
|
||||
python prepare_dataset.py
|
||||
```
|
||||
4. Train the model (if not already trained):
|
||||
```bash
|
||||
python train.py
|
||||
```
|
||||
5. Run the Flask application:
|
||||
```bash
|
||||
python run.py
|
||||
```
|
||||
6. Access the web interface at `http://localhost:5000`
|
||||
|
||||
## Testing
|
||||
|
||||
The project includes comprehensive tests for the detector:
|
||||
- Batch detection testing
|
||||
- Threshold optimization
|
||||
- Various confidence/IoU threshold combinations
|
||||
|
||||
Run tests with:
|
||||
```bash
|
||||
curl http://localhost:5000/health
|
||||
pytest tests/
|
||||
```
|
||||
|
||||
#### 3. **POST /detect** - Upload Image Detection
|
||||
```bash
|
||||
curl -X POST -F "image=@motherboard.png" -F "confidence=0.5" http://localhost:5000/detect
|
||||
```
|
||||
## Future Improvements
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"success": true,
|
||||
"detections": [
|
||||
{
|
||||
"bbox": [100, 150, 200, 250],
|
||||
"confidence": 0.85,
|
||||
"class": 0,
|
||||
"class_name": "memory_module"
|
||||
}
|
||||
],
|
||||
"num_detections": 1,
|
||||
"annotated_image": "base64_encoded_image...",
|
||||
"confidence_threshold": 0.5
|
||||
}
|
||||
```
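
A client might unpack a reply like the one above as follows. This is a hedged sketch: `summarize_response` is a hypothetical helper, the field names are taken from the example response, and `annotated_image` is assumed to be plain base64 without a data-URI prefix.

```python
import base64
import json


def summarize_response(payload, min_confidence=0.0):
    """Return (kept detections, decoded image bytes) from a /detect reply.

    `payload` is the decoded JSON body. Detections whose confidence is
    below `min_confidence` are dropped on the client side.
    """
    if not payload.get("success"):
        raise ValueError("detection request was not successful")
    kept = [d for d in payload["detections"] if d["confidence"] >= min_confidence]
    image_bytes = base64.b64decode(payload["annotated_image"])
    return kept, image_bytes


# Stand-in reply with the same fields as the example response; the image
# bytes are a placeholder, not a real PNG.
reply = json.loads(json.dumps({
    "success": True,
    "detections": [
        {"bbox": [100, 150, 200, 250], "confidence": 0.85,
         "class": 0, "class_name": "memory_module"},
    ],
    "num_detections": 1,
    "annotated_image": base64.b64encode(b"placeholder").decode("ascii"),
    "confidence_threshold": 0.5,
}))
detections, image = summarize_response(reply, min_confidence=0.6)
print(len(detections), image)
```

The decoded bytes could then be written to a `.png` file to inspect the annotated image.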
|
||||
|
||||
#### 4. **GET /detect/hardcoded** - Test with Hardcoded Image
|
||||
```bash
|
||||
curl "http://localhost:5000/detect/hardcoded?confidence=0.5"
|
||||
```
|
||||
|
||||
#### 5. **POST /detect/base64** - Base64 Image Detection
|
||||
```bash
|
||||
curl -X POST -H "Content-Type: application/json" \
|
||||
-d '{"image": "base64_string", "confidence": 0.5}' \
|
||||
http://localhost:5000/detect/base64
|
||||
```
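
The `base64_string` placeholder in the request above has to be produced from the raw image bytes. A minimal sketch of building that JSON body is shown below; `build_base64_request` is a hypothetical helper, and it assumes the endpoint expects plain base64 text in the `image` field.

```python
import base64
import json


def build_base64_request(image_bytes, confidence=0.5):
    """Return the JSON body for POST /detect/base64 as a string."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded, "confidence": confidence})


# In practice the bytes would come from open("motherboard.png", "rb").read();
# a short placeholder keeps the example self-contained.
body = build_base64_request(b"\x89PNG placeholder", confidence=0.5)
print(body)
```

The resulting string can be sent as the request body with the `Content-Type: application/json` header used in the curl example.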
|
||||
|
||||
## 🧪 Testing & Usage Examples
|
||||
|
||||
### 1. **Test with Python requests**
|
||||
```python
|
||||
import requests
|
||||
import base64
|
||||
|
||||
# Test hardcoded image
|
||||
response = requests.get('http://localhost:5000/detect/hardcoded')
|
||||
result = response.json()
|
||||
print(f"Found {result['num_detections']} memory modules")
|
||||
|
||||
# Upload image
|
||||
with open('test_image.png', 'rb') as f:
    files = {'image': f}
    response = requests.post('http://localhost:5000/detect', files=files)
result = response.json()
|
||||
```
|
||||
|
||||
### 2. **Test with curl**
|
||||
```bash
|
||||
# Basic detection
|
||||
curl -X POST -F "image=@training/memory/out1.png" http://localhost:5000/detect
|
||||
|
||||
# With custom confidence
|
||||
curl -X POST -F "image=@training/memory/out1.png" -F "confidence=0.3" http://localhost:5000/detect
|
||||
```
|
||||
|
||||
### 3. **Command Line Inference**
|
||||
```bash
|
||||
# Test single image
|
||||
python3 inference_utils.py --image training/memory/out1.png --conf 0.5
|
||||
|
||||
# Validate trained model
|
||||
python3 train.py --validate --model runs/detect/memory_module_detection/weights/best.pt
|
||||
```
|
||||
|
||||
## 📊 Training Details
|
||||
|
||||
### Dataset Statistics
|
||||
- **Total Images:** 40 (20 with memory, 20 without)
|
||||
- **Training Split:** 32 images (80%)
|
||||
- **Validation Split:** 8 images (20%)
|
||||
- **Classes:** 1 (memory_module)
|
||||
- **Annotation Format:** YOLO (normalized coordinates)
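
For readers unfamiliar with the YOLO annotation format, each label line is `class cx cy w h`, with the box centre and size expressed as fractions of the image dimensions. The sketch below converts one such line to pixel corners; `yolo_to_pixels` and the sample label are hypothetical.

```python
def yolo_to_pixels(line, image_width, image_height):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels.

    A label line is "class cx cy w h", with the box centre and size
    given as fractions of the image width and height.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    x1 = (cx - w / 2) * image_width
    y1 = (cy - h / 2) * image_height
    x2 = (cx + w / 2) * image_width
    y2 = (cy + h / 2) * image_height
    return class_id, x1, y1, x2, y2


# Hypothetical label for a 640x640 image: a box centred in the frame.
print(yolo_to_pixels("0 0.5 0.5 0.25 0.5", 640, 640))
```

Because the coordinates are normalized, the same label file stays valid if the image is resized.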
|
||||
|
||||
### Training Configuration
|
||||
```python
|
||||
# Default training parameters
|
||||
epochs = 100
|
||||
batch_size = 16
|
||||
image_size = 640
|
||||
confidence_threshold = 0.5
|
||||
iou_threshold = 0.45
|
||||
```
|
||||
|
||||
### Expected Training Time
|
||||
- **GPU (RTX 3060+):** 5-10 minutes
|
||||
- **CPU (Modern):** 30-60 minutes
|
||||
- **Memory Usage:** 2-4GB RAM
|
||||
|
||||
### Model Performance
|
||||
After training, you should see:
|
||||
- **mAP50:** >0.8 (mean average precision at an IoU threshold of 0.5)
|
||||
- **Precision:** >0.85
|
||||
- **Recall:** >0.80
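
As a reminder of what the precision and recall targets above measure, both can be computed from counts of correct, spurious, and missed boxes. The helper and the counts below are hypothetical and are not results from this project.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when there are no boxes at all.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 9 correct boxes, 1 spurious box, 1 missed module.
print(precision_recall(true_positives=9, false_positives=1, false_negatives=1))
```

Precision drops when the model draws boxes where no module exists, and recall drops when it misses modules that are present.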
|
||||
|
||||
## 🐛 Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
#### 1. **Model Not Found Error**
|
||||
```
|
||||
Error: Model not found at runs/detect/memory_module_detection/weights/best.pt
|
||||
```
|
||||
**Solution:** Train the model first
|
||||
```bash
|
||||
python3 train.py
|
||||
```
|
||||
|
||||
#### 2. **CUDA Out of Memory**
|
||||
```
|
||||
RuntimeError: CUDA out of memory
|
||||
```
|
||||
**Solutions:**
|
||||
- Reduce batch size: `python3 train.py --batch 8`
|
||||
- Use CPU: `python3 train.py --device cpu`
|
||||
- Close other GPU applications
|
||||
|
||||
#### 3. **Import Error: ultralytics**
|
||||
```
|
||||
ModuleNotFoundError: No module named 'ultralytics'
|
||||
```
|
||||
**Solution:**
|
||||
```bash
|
||||
pip install ultralytics
|
||||
```
|
||||
|
||||
#### 4. **Flask Port Already in Use**
|
||||
```
|
||||
OSError: [Errno 48] Address already in use
|
||||
```
|
||||
**Solution:**
|
||||
```bash
|
||||
# Kill process using port 5000
|
||||
lsof -ti:5000 | xargs kill -9
|
||||
|
||||
# Or use different port
|
||||
python3 main.py # Edit main.py to change port
|
||||
```
|
||||
|
||||
#### 5. **Low Detection Accuracy**
|
||||
**Solutions:**
|
||||
- Increase training epochs: `python3 train.py --epochs 200`
|
||||
- Lower confidence threshold: `confidence=0.3`
|
||||
- Check image quality and lighting
|
||||
- Verify annotations are correct
|
||||
|
||||
### Performance Optimization
|
||||
|
||||
#### For Better Accuracy:
|
||||
1. **More Training Data:** Add more annotated images
|
||||
2. **Data Augmentation:** Already included in YOLOv8
|
||||
3. **Hyperparameter Tuning:** Adjust learning rate, batch size
|
||||
4. **Model Size:** Use YOLOv8s or YOLOv8m for better accuracy
|
||||
|
||||
#### For Faster Inference:
|
||||
1. **Model Export and Quantization:** Export to ONNX or TensorRT, optionally with reduced precision (FP16/INT8)
|
||||
2. **Batch Processing:** Process multiple images together
|
||||
3. **Image Resizing:** Use smaller input size (320x320)
|
||||
|
||||
## 📁 File Descriptions
|
||||
|
||||
- **`main.py`** - Flask API with all endpoints
|
||||
- **`train.py`** - YOLOv8 training script with validation
|
||||
- **`inference_utils.py`** - Detection utilities and visualization
|
||||
- **`prepare_dataset.py`** - Dataset preparation and splitting
|
||||
- **`requirements.txt`** - Python dependencies
|
||||
- **`dataset.yaml`** - YOLO dataset configuration
|
||||
|
||||
## 🔮 Future Enhancements
|
||||
|
||||
1. **Video Processing:** Add video upload and processing endpoints
|
||||
2. **Model Ensemble:** Combine multiple models for better accuracy
|
||||
3. **Real-time Streaming:** WebSocket support for live camera feeds
|
||||
4. **Database Integration:** Store detection results and statistics
|
||||
5. **Web Interface:** HTML frontend for easier testing
|
||||
6. **Docker Deployment:** Containerized deployment
|
||||
7. **Model Versioning:** Support multiple model versions
|
||||
8. **Batch Processing:** Process multiple images simultaneously
|
||||
|
||||
## 📄 License
|
||||
|
||||
This project is for educational and training purposes.
|
||||
|
||||
## 🤝 Contributing
|
||||
|
||||
This is a toy project for training purposes. Feel free to experiment and improve!
|
||||
1. GPU support for faster processing
|
||||
2. Video input support
|
||||
3. Real-time streaming capabilities
|
||||
4. More sophisticated augmentation techniques
|
||||
5. Model quantization for improved CPU performance
|
||||
|
||||