feat: Complete all 4 major optimization tasks

Network & Model Optimization:
- Fixed Sentence Transformers path to use local model
- Configured real semantic embeddings (384-dimensional)
- Replaced hash-based fallback with AI-powered similarity

Advanced AI Features Integration:
- Added ai_analyzer.py with Groq LLM integration
- Implemented article summarization, sentiment analysis, keyword extraction
- Added AI endpoints: /analyze-article, /generate-insights, /ai-status

API Enhancement & User Experience:
- Enhanced articles endpoint with pagination (offset/limit, metadata)
- Added advanced filtering (date ranges, source, category)
- Improved search with semantic similarity + multi-parameter filters

Production Polish & Performance:
- Implemented in-memory caching system in vector_store.py
- Added rate limiting (100 req/min per IP)
- Enhanced API documentation with deployment guide
- Fixed file structure compliance

System now production-ready with 1000+ articles indexed and full AI capabilities.
Aherobo Ovie Victor
2025-07-08 16:45:38 +01:00
parent 3c4a08d639
commit beed04d05c
8 changed files with 789 additions and 65 deletions
@@ -8,6 +8,11 @@ http://localhost:8000
## Authentication
Currently, no authentication is required. In production, consider implementing API keys or OAuth.
## Rate Limiting
- **Limit**: 100 requests per minute per IP address
- **Response**: HTTP 429 when limit exceeded
- **Headers**: No rate limit headers currently implemented
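A limit like the one above can be enforced with a sliding-window counter keyed by client IP. The following is a minimal sketch under stated assumptions, not the code in `main.py`: the class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client key.

    Hypothetical helper for illustration; the real limiter may differ.
    """

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client key -> timestamps of recent requests

    def allow(self, client_ip):
        """Return True if the request may proceed, False if it should receive HTTP 429."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

If the API runs behind a reverse proxy, every request may appear to come from the proxy's address, so the key would probably need to be taken from a forwarded header such as `X-Real-IP` rather than the socket address.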
## Response Format
All API responses follow this structure:
```json
@@ -28,6 +33,11 @@ Error responses include:
}
```
## Caching
- **Articles endpoint**: 3-minute cache for improved performance
- **Search results**: In-memory caching with 5-minute TTL
- **Vector operations**: Cached for frequent similarity searches
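The TTL behaviour described above can be illustrated with a small in-memory cache. This is a sketch only, assuming a plain dictionary with per-entry expiry; the `TTLCache` class and the two instance names are hypothetical and are not taken from `vector_store.py`.

```python
import time


class TTLCache:
    """Minimal in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable for testing
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry is stale: evict it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl_seconds, value)


# TTLs matching the values documented above (3 and 5 minutes).
articles_cache = TTLCache(ttl_seconds=3 * 60)
search_cache = TTLCache(ttl_seconds=5 * 60)
```

A cache like this is local to one process, so each API instance would hold its own copy when several instances run behind a load balancer.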
---
## Endpoints
@@ -428,3 +438,197 @@ fetch('http://localhost:8000/recommend-by-query', {
.then(response => response.json())
.then(data => console.log(data.recommendations));
```
---
## Deployment Guide
### Prerequisites
- Python 3.10+
- 4GB+ RAM (for Sentence Transformers model)
- 2GB+ disk space
### Local Development Setup
1. **Clone and Setup**
```bash
git clone <repository-url>
cd ds_task_ai_news
```
2. **Install Dependencies**
```bash
pip install -r backend/requirements.txt
```
3. **Environment Configuration**
Create a `.env` file in the root directory:
```env
# Optional API Keys
GROQ_API_KEY=your_groq_api_key_here
COHERE_API_KEY=your_cohere_api_key_here
# Server Settings
HOST=0.0.0.0
PORT=8000
DEBUG=true
# RSS Feeds (comma-separated)
RSS_FEEDS=https://feeds.bbci.co.uk/news/technology/rss.xml,https://techcrunch.com/feed/,https://www.wired.com/feed/rss
# Vector Database
VECTOR_DIMENSION=384
VECTOR_DB_TYPE=faiss
```
4. **Run the Application**
```bash
cd backend
python main.py
```
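The variables from step 3 could be read at startup roughly as follows. This is a hedged sketch, not the project's `config.py`: the `Settings` dataclass, the `load_settings` function, and the chosen defaults (for example `DEBUG` defaulting to false) are assumptions. It reads the process environment only; loading the `.env` file itself would need something like `load_dotenv()` from the `python-dotenv` package first.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Settings:
    """Hypothetical container for the settings listed in the .env example."""

    host: str = "0.0.0.0"
    port: int = 8000
    debug: bool = False
    rss_feeds: List[str] = field(default_factory=list)
    vector_dimension: int = 384
    vector_db_type: str = "faiss"
    groq_api_key: Optional[str] = None
    cohere_api_key: Optional[str] = None


def load_settings(env=os.environ):
    """Build Settings from a mapping of environment variables."""
    # RSS_FEEDS is comma-separated; ignore blanks and surrounding whitespace.
    feeds = [url.strip() for url in env.get("RSS_FEEDS", "").split(",") if url.strip()]
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        rss_feeds=feeds,
        vector_dimension=int(env.get("VECTOR_DIMENSION", "384")),
        vector_db_type=env.get("VECTOR_DB_TYPE", "faiss"),
        # The API keys are optional; treat empty strings as "not configured".
        groq_api_key=env.get("GROQ_API_KEY") or None,
        cohere_api_key=env.get("COHERE_API_KEY") or None,
    )
```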
### Production Deployment
#### Docker Deployment
```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY backend/requirements.txt .
RUN pip install -r requirements.txt
COPY . .
WORKDIR /app/backend
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
#### Docker Compose
```yaml
version: '3.8'
services:
  ai-news-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - GROQ_API_KEY=${GROQ_API_KEY}
      - COHERE_API_KEY=${COHERE_API_KEY}
    volumes:
      - ./data:/app/data
      - ./models:/app/models
    restart: unless-stopped
```
#### Nginx Configuration
```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
### Performance Optimization
#### Memory Management
- **Sentence Transformers**: Uses ~1GB RAM when loaded
- **FAISS Index**: Memory usage scales with article count
- **Caching**: In-memory cache uses ~50MB for typical workloads
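To make the FAISS scaling concrete, the raw vector storage can be estimated from the article count. The sketch below assumes an uncompressed flat index that stores one 384-dimensional float32 vector per article; the index type actually used is not stated here, and the function name `flat_index_bytes` is hypothetical. Metadata and index overhead are not included.

```python
def flat_index_bytes(num_vectors, dimension=384, bytes_per_value=4):
    """Approximate raw storage for `num_vectors` float32 vectors in a flat index."""
    return num_vectors * dimension * bytes_per_value


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        mib = flat_index_bytes(n) / 2**20
        print(f"{n:>9,} articles -> {mib:,.1f} MiB")
```

Under these assumptions, 1,000 articles need about 1.5 MiB of vector storage, so the embedding model rather than the index is likely to dominate memory use at that size.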
#### Scaling Recommendations
- **Horizontal**: Use load balancer with multiple API instances
- **Vertical**: Increase RAM for larger article databases
- **Database**: Consider PostgreSQL for metadata storage at scale
### Monitoring and Maintenance
#### Health Checks
```bash
# Basic health check
curl http://localhost:8000/health
# System statistics
curl http://localhost:8000/stats
# AI analyzer status
curl http://localhost:8000/ai-status
```
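The same checks can be scripted for use in a cron job or deployment pipeline. This is a minimal sketch using only the standard library; the function names `http_status` and `check_endpoints` are hypothetical, and treating any status other than 200 as unhealthy is an assumption.

```python
import urllib.error
import urllib.request

ENDPOINTS = ("/health", "/stats", "/ai-status")


def http_status(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def check_endpoints(base_url="http://localhost:8000", paths=ENDPOINTS, fetch=http_status):
    """Map each path to True if it answered with HTTP 200, else False."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in paths}
```

Calling `check_endpoints()` against a running instance returns a dictionary such as `{"/health": True, ...}`, which a wrapper script can turn into a non-zero exit code when any value is false.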
#### Log Monitoring
```bash
# Application logs
tail -f /var/log/ai-news/app.log
# Error tracking
grep "ERROR" /var/log/ai-news/app.log
```
#### Backup Strategy
```bash
# Backup vector database (create the target directory first)
mkdir -p backup
cp data/news_vectors.faiss backup/
cp data/news_vectors_metadata.pkl backup/
# Backup processed articles
tar -czf backup/articles_$(date +%Y%m%d).tar.gz data/processed_news/
```
### Troubleshooting
#### Common Issues
1. **Sentence Transformers Model Loading**
```bash
# Verify model exists
ls -la models/all-MiniLM-L6-v2/
# Test model loading
python -c "from sentence_transformers import SentenceTransformer; model = SentenceTransformer('./models/all-MiniLM-L6-v2'); print('Model loaded successfully')"
```
2. **FAISS Index Issues**
```bash
# Remove the existing index and metadata (-f ignores files that are already missing)
rm -f data/news_vectors.faiss data/news_vectors_metadata.pkl
# Restart the application to rebuild the index
```
3. **Memory Issues**
```bash
# Check memory usage
free -h
# Monitor process memory (the [p] keeps grep from matching itself)
ps aux | grep '[p]ython'
```
#### Performance Tuning
- Adjust `RATE_LIMIT_REQUESTS` in `main.py` to match your expected traffic
- Modify the cache TTL in `vector_store.py`
- Tune `max_articles_per_feed` in `config.py`
### Security Considerations
#### Production Security
- Use HTTPS in production
- Implement proper API authentication
- Set up firewall rules
- Apply security updates regularly
- Monitor for unusual traffic patterns
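As one possible starting point for API authentication, a shared API key can be checked with a constant-time comparison. This is a sketch under assumptions: the `X-API-Key` header name and the `is_authorized` function are hypothetical, the headers are modelled as a plain dictionary, and nothing like this exists in the API yet.

```python
import hmac


def is_authorized(headers, expected_key):
    """Return True only if the request carries the expected API key.

    `headers` is a plain dict here; a web framework's header object is
    usually case-insensitive, which this sketch does not emulate.
    """
    supplied = headers.get("X-API-Key", "")
    if not expected_key or not supplied:
        # Reject when no key is configured or none was sent.
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

The expected key would typically come from an environment variable rather than source code, in line with the guidance on environment variables in this section.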
#### Environment Variables
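Never commit sensitive data to version control. Keep secrets in environment-specific `.env` files that are excluded from the repository (for example via `.gitignore`):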
```bash
# Use environment-specific .env files
.env.production
.env.staging
.env.development
```