feat: Implement complete RSS news fetching system with multi-source support
@@ -0,0 +1,430 @@
# DS Task AI News - API Documentation

## Base URL

```
http://localhost:8000
```

## Authentication

Currently, no authentication is required. In production, consider implementing API keys or OAuth.

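## Response Format

Most API responses share this general structure. As the endpoint examples in this document show, the payload is usually returned under an endpoint-specific key (for example `recommendations`, `articles`, or `results`) rather than `data`, and the health endpoints omit `success`:

```json
{
  "success": true,
  "message": "Optional message",
  "data": {},
  "count": 0
}
```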
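
Because the endpoint examples in this document return their payload under different keys, a client may want a single helper that checks `success` and returns whichever payload is present. The following is a minimal sketch under that assumption, not part of the API; `extract_items` and `PAYLOAD_KEYS` are hypothetical names.

```python
# Keys under which the documented examples return their payload.
PAYLOAD_KEYS = ("data", "recommendations", "articles", "trending_articles", "results")


def extract_items(payload: dict):
    """Return the payload of a successful response, or raise ValueError."""
    if payload.get("success") is not True:
        # Fall back to a generic text when the server sent no message.
        raise ValueError(payload.get("message") or "request was not successful")
    for key in PAYLOAD_KEYS:
        if key in payload:
            return payload[key]
    return None  # successful response without a recognised payload key


sample = {"success": True, "recommendations": [{"id": "def456"}], "count": 1}
items = extract_items(sample)
```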

## Error Handling

Error responses include:

```json
{
  "detail": "Error description",
  "status_code": 400
}
```
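
On the client side, an error body of this shape can be turned into an exception. This is a sketch only: `ApiError` and `raise_for_error` are hypothetical names, and the fallback to the HTTP status is an assumption made because the `/fetch-news` error example in this document carries only `detail`.

```python
class ApiError(Exception):
    """Raised when the API answers with an error body."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


def raise_for_error(http_status: int, body: dict) -> None:
    """Raise ApiError for 4xx/5xx responses; do nothing otherwise."""
    if http_status < 400:
        return
    detail = body.get("detail", "Unknown error")
    # Assumption: use the HTTP status when the body has no status_code.
    status_code = body.get("status_code", http_status)
    raise ApiError(status_code, detail)
```

With `requests`, this could be called as `raise_for_error(response.status_code, response.json())` before reading the payload.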

---

## Endpoints

### 1. Health Check

**GET** `/`

Check if the API is running.

**Response:**
```json
{
  "message": "DS Task AI News API is running!",
  "version": "1.0.0",
  "status": "healthy"
}
```

---

### 2. Detailed Health Check

**GET** `/health`

Get detailed system status and statistics.

**Response:**
```json
{
  "status": "healthy",
  "vector_store": {
    "total_articles": 150,
    "index_dimension": 384,
    "index_exists": true,
    "last_updated": "2025-07-07T16:00:00"
  },
  "settings": {
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
    "vector_db_type": "faiss",
    "rss_feeds_count": 3
  }
}
```
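
A deployment script might use this response to decide whether the service is ready to serve recommendations. The readiness rule below (healthy status, an existing index, and at least one article) is an assumption chosen for illustration, and `is_ready` is a hypothetical helper.

```python
def is_ready(health: dict) -> bool:
    """Return True when the /health payload looks ready to serve recommendations."""
    store = health.get("vector_store", {})
    return (
        health.get("status") == "healthy"
        and bool(store.get("index_exists"))
        and store.get("total_articles", 0) > 0
    )


sample_health = {
    "status": "healthy",
    "vector_store": {"total_articles": 150, "index_dimension": 384, "index_exists": True},
}
ready = is_ready(sample_health)
```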

---

### 3. Fetch News

**POST** `/fetch-news`

Fetch news from configured RSS feeds and add to vector store.

**Response:**
```json
{
  "success": true,
  "message": "News fetched and processed successfully",
  "articles_fetched": 45,
  "articles_stored": 45,
  "total_articles": 195
}
```

**Error Response:**
```json
{
  "detail": "Error fetching news: Connection timeout"
}
```

---

### 4. Get Recommendations by Article ID

**GET** `/recommend-news`

Get similar articles based on an existing article ID.

**Parameters:**
- `article_id` (required): ID of the reference article
- `top_k` (optional, default=5): Number of recommendations

**Example:**
```
GET /recommend-news?article_id=abc123&top_k=10
```

**Response:**
```json
{
  "success": true,
  "article_id": "abc123",
  "recommendations": [
    {
      "id": "def456",
      "title": "AI Breakthrough in Healthcare",
      "content": "Recent developments in artificial intelligence...",
      "url": "https://example.com/article",
      "source": "TechNews",
      "published_date": "2025-07-07T10:00:00",
      "similarity_score": 0.89
    }
  ],
  "count": 1
}
```
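
Since each recommendation carries a `similarity_score`, a client can drop weak matches before displaying results. The sketch below assumes higher scores mean closer matches; the threshold and the helper name `filter_by_score` are illustrative and not part of the API.

```python
def filter_by_score(response: dict, min_score: float) -> list:
    """Keep recommendations at or above min_score, best match first."""
    items = response.get("recommendations", [])
    kept = [a for a in items if a.get("similarity_score", 0.0) >= min_score]
    return sorted(kept, key=lambda a: a.get("similarity_score", 0.0), reverse=True)


sample_response = {
    "success": True,
    "article_id": "abc123",
    "recommendations": [
        {"id": "a", "similarity_score": 0.89},
        {"id": "b", "similarity_score": 0.41},
        {"id": "c", "similarity_score": 0.95},
    ],
    "count": 3,
}
strong_ids = [a["id"] for a in filter_by_score(sample_response, 0.5)]
```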

---

### 5. Get Recommendations by Query

**POST** `/recommend-by-query`

Get article recommendations based on a text query.

**Request Body:**
```json
{
  "query": "artificial intelligence healthcare",
  "top_k": 5
}
```

**Response:**
```json
{
  "success": true,
  "query": "artificial intelligence healthcare",
  "recommendations": [
    {
      "id": "xyz789",
      "title": "AI Transforms Medical Diagnosis",
      "content": "Machine learning algorithms are revolutionizing...",
      "url": "https://example.com/ai-medical",
      "source": "HealthTech",
      "published_date": "2025-07-07T14:30:00",
      "similarity_score": 0.92
    }
  ],
  "count": 1
}
```

---

### 6. Get Recommendations by Interests

**POST** `/recommend-by-interests`

Get recommendations based on user interests.

**Request Body:**
```json
{
  "interests": ["artificial intelligence", "machine learning", "healthcare"],
  "top_k": 10
}
```

**Response:**
```json
{
  "success": true,
  "interests": ["artificial intelligence", "machine learning", "healthcare"],
  "recommendations": [...],
  "count": 8
}
```
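
A client could normalise user input before building this request body. The sketch below trims whitespace and removes case-insensitive duplicates; it assumes the 1-100 range listed for `top_k` in the Query Object also applies here, and the default of 10 is simply taken from the example above. `build_interests_body` is a hypothetical helper.

```python
def build_interests_body(interests: list, top_k: int = 10) -> dict:
    """Build the JSON body for POST /recommend-by-interests."""
    # Assumption: top_k follows the 1-100 range given for the Query Object.
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    seen = set()
    cleaned = []
    for interest in interests:
        text = interest.strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    if not cleaned:
        raise ValueError("at least one non-empty interest is required")
    return {"interests": cleaned, "top_k": top_k}


body = build_interests_body([" AI ", "ai", "healthcare"], top_k=10)
```

The resulting dictionary can be passed as `json=body` to `requests.post`.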

---

### 7. Get Trending Articles

**GET** `/trending`

Get trending (most recent) articles.

**Parameters:**
- `top_k` (optional, default=10): Number of articles to return

**Example:**
```
GET /trending?top_k=20
```

**Response:**
```json
{
  "success": true,
  "trending_articles": [
    {
      "id": "trend1",
      "title": "Breaking: New AI Model Released",
      "content": "A groundbreaking AI model has been announced...",
      "url": "https://example.com/breaking-ai",
      "source": "AI Weekly",
      "published_date": "2025-07-07T16:00:00"
    }
  ],
  "count": 1
}
```
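
The description treats "trending" as "most recent". How the server orders articles is not documented here, so the following sketch only shows an equivalent client-side ordering, assuming `published_date` is an ISO 8601 string as in the examples. `sort_by_recency` is a hypothetical helper.

```python
from datetime import datetime


def sort_by_recency(articles: list, top_k: int = 10) -> list:
    """Return the top_k articles, newest published_date first."""
    ordered = sorted(
        articles,
        key=lambda a: datetime.fromisoformat(a["published_date"]),
        reverse=True,
    )
    return ordered[:top_k]


sample_articles = [
    {"id": "a", "published_date": "2025-07-07T10:00:00"},
    {"id": "b", "published_date": "2025-07-07T16:00:00"},
    {"id": "c", "published_date": "2025-07-07T14:30:00"},
]
newest_ids = [a["id"] for a in sort_by_recency(sample_articles, top_k=2)]
```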

---

### 8. Get All Articles

**GET** `/articles`

Get all articles with optional filtering.

**Parameters:**
- `source` (optional): Filter by news source
- `limit` (optional, default=50): Maximum articles to return

**Example:**
```
GET /articles?source=BBC%20News&limit=25
```

**Response:**
```json
{
  "success": true,
  "articles": [...],
  "count": 25,
  "source_filter": "BBC News"
}
```
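
Source names such as `BBC News` contain spaces and must be percent-encoded in the query string, as in the example above. A minimal sketch using only the standard library follows; `build_articles_url` and `BASE_URL` are illustrative names.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"


def build_articles_url(source: Optional[str] = None, limit: int = 50) -> str:
    """Build the GET /articles URL, omitting the source filter when unset."""
    params = {}
    if source is not None:
        params["source"] = source
    params["limit"] = limit
    # quote_via=quote encodes spaces as %20 instead of the default "+".
    return f"{BASE_URL}/articles?{urlencode(params, quote_via=quote)}"


url = build_articles_url("BBC News", 25)
```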

---

### 9. Advanced Search

**POST** `/search`

Advanced search with filters.

**Request Body:**
```json
{
  "query": "climate change technology",
  "source": "BBC News",
  "top_k": 15
}
```

**Response:**
```json
{
  "success": true,
  "query": "climate change technology",
  "filters": {
    "source": "BBC News"
  },
  "results": [...],
  "count": 12
}
```
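
The request above can be prepared without third-party packages. This sketch builds, but does not send, a `urllib.request.Request`; `build_search_request` is a hypothetical helper, the default `top_k` of 15 is taken from the example because the server default is not documented, and treating `source` as optional is an assumption.

```python
import json
from typing import Optional
from urllib.request import Request


def build_search_request(
    query: str,
    source: Optional[str] = None,
    top_k: int = 15,
    base_url: str = "http://localhost:8000",
) -> Request:
    """Prepare a POST /search request with a JSON body."""
    body = {"query": query, "top_k": top_k}
    if source is not None:
        # Assumption: the source filter may be left out entirely.
        body["source"] = source
    return Request(
        f"{base_url}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("climate change technology", source="BBC News")
```

Passing `request` to `urllib.request.urlopen` would send it to a running server.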

---

### 10. Get Statistics

**GET** `/stats`

Get system statistics and information.

**Response:**
```json
{
  "success": true,
  "statistics": {
    "total_articles": 200,
    "index_dimension": 384,
    "index_exists": true,
    "rss_feeds": [
      "https://feeds.bbci.co.uk/news/rss.xml",
      "https://rss.cnn.com/rss/edition.rss"
    ],
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
  }
}
```

---

### 11. Test RSS Feeds

**GET** `/test-rss`

Test RSS feed connectivity and parsing.

**Response:**
```json
{
  "results": [
    {
      "url": "https://feeds.bbci.co.uk/news/rss.xml",
      "title": "BBC News",
      "entries_count": 32,
      "success": true,
      "sample_article": {
        "title": "Tech Giants Announce AI Partnership",
        "published": "Mon, 07 Jul 2025 16:00:00 GMT",
        "link": "https://bbc.com/news/tech-partnership"
      }
    }
  ],
  "timestamp": "2025-07-07T16:15:00"
}
```
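
A monitoring script could scan this report for feeds that did not parse. Only a successful entry is documented, so the sketch assumes a failing feed appears in `results` with `success` set to `false`; `failed_feeds` is a hypothetical helper.

```python
def failed_feeds(report: dict) -> list:
    """Return the URLs of feeds whose test did not succeed."""
    return [r["url"] for r in report.get("results", []) if not r.get("success")]


sample_report = {
    "results": [
        {"url": "https://feeds.bbci.co.uk/news/rss.xml", "success": True},
        # Hypothetical failing entry, shaped like the documented one.
        {"url": "https://rss.cnn.com/rss/edition.rss", "success": False},
    ],
    "timestamp": "2025-07-07T16:15:00",
}
broken = failed_feeds(sample_report)
```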

---

## Interactive Documentation

FastAPI automatically generates interactive API documentation:

- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc

## Rate Limiting

Currently, no rate limiting is implemented. Consider adding rate limiting in production:

- Per IP: 100 requests/minute
- Per endpoint: Varies based on computational cost
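
The per-IP limit suggested above could be enforced with a sliding-window counter. The class below is an in-memory sketch for a single process, not something the project ships; `SlidingWindowLimiter` is a hypothetical name, and the caller supplies the current time so the logic stays easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 60.0)]
```

In a real service `now` would typically come from `time.monotonic()`, and a shared store would be needed once more than one worker process handles requests.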

## CORS

CORS is enabled for all origins in development. In production, configure specific allowed origins.

## Error Codes

- **200**: Success
- **400**: Bad Request (invalid parameters)
- **404**: Not Found (article ID not found)
- **500**: Internal Server Error (system error)

## Data Models

### Article Object
```json
{
  "id": "string",
  "title": "string",
  "content": "string",
  "url": "string",
  "source": "string",
  "published_date": "ISO 8601 datetime",
  "similarity_score": "float (0-1, only in recommendations)"
}
```

### Query Object
```json
{
  "query": "string",
  "top_k": "integer (1-100)"
}
```
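
On the client side, the two models could be mirrored as dataclasses. This is a sketch based only on the field lists above; the class names, the `from_dict` helper, and the default `top_k` of 5 (taken from the `/recommend-by-query` example) are assumptions, not the server's actual definitions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    id: str
    title: str
    content: str
    url: str
    source: str
    published_date: datetime
    similarity_score: Optional[float] = None  # only present in recommendations

    @classmethod
    def from_dict(cls, raw: dict) -> "Article":
        return cls(
            id=raw["id"],
            title=raw["title"],
            content=raw["content"],
            url=raw["url"],
            source=raw["source"],
            published_date=datetime.fromisoformat(raw["published_date"]),
            similarity_score=raw.get("similarity_score"),
        )


@dataclass
class Query:
    query: str
    top_k: int = 5

    def __post_init__(self) -> None:
        # Enforce the documented 1-100 range for top_k.
        if not 1 <= self.top_k <= 100:
            raise ValueError("top_k must be between 1 and 100")


article = Article.from_dict({
    "id": "def456",
    "title": "AI Breakthrough in Healthcare",
    "content": "Recent developments in artificial intelligence...",
    "url": "https://example.com/article",
    "source": "TechNews",
    "published_date": "2025-07-07T10:00:00",
    "similarity_score": 0.89,
})
```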

## SDK Examples

### Python
```python
import requests

# Fetch news
response = requests.post("http://localhost:8000/fetch-news")
print(response.json())

# Get recommendations
response = requests.post(
    "http://localhost:8000/recommend-by-query",
    json={"query": "artificial intelligence", "top_k": 5}
)
recommendations = response.json()["recommendations"]
```

### JavaScript
```javascript
// Fetch news
fetch('http://localhost:8000/fetch-news', {method: 'POST'})
  .then(response => response.json())
  .then(data => console.log(data));

// Get recommendations
fetch('http://localhost:8000/recommend-by-query', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({
    query: 'artificial intelligence',
    top_k: 5
  })
})
  .then(response => response.json())
  .then(data => console.log(data.recommendations));
```