Files
DS_TASK_AI_VIEWS/backend/config.py
T
Aherobo Ovie Victor beed04d05c feat: Complete all 4 major optimization tasks
 Network & Model Optimization:
- Fixed Sentence Transformers path to use local model
- Configured real semantic embeddings (384-dimensional)
- Replaced hash-based fallback with AI-powered similarity

 Advanced AI Features Integration:
- Added ai_analyzer.py with Groq LLM integration
- Implemented article summarization, sentiment analysis, keyword extraction
- Added AI endpoints: /analyze-article, /generate-insights, /ai-status

 API Enhancement & User Experience:
- Enhanced articles endpoint with pagination (offset/limit, metadata)
- Added advanced filtering (date ranges, source, category)
- Improved search with semantic similarity + multi-parameter filters

 Production Polish & Performance:
- Implemented in-memory caching system in vector_store.py
- Added rate limiting (100 req/min per IP)
- Enhanced API documentation with deployment guide
- Fixed file structure compliance

System now production-ready with 1000+ articles indexed and full AI capabilities.
2025-07-08 16:45:38 +01:00

47 lines
1.5 KiB
Python

"""Configuration settings for DS Task AI News"""
import os
from typing import List
from pydantic_settings import BaseSettings
from dotenv import load_dotenv
load_dotenv()
class Settings(BaseSettings):
# API Keys
cohere_api_key: str = os.getenv("COHERE_API_KEY", "")
groq_api_key: str = os.getenv("GROQ_API_KEY", "")
# Vector Database
vector_db_type: str = os.getenv("VECTOR_DB_TYPE", "faiss")
vector_dimension: int = int(os.getenv("VECTOR_DIMENSION", "384"))
# RSS Feeds
@property
def rss_feeds(self) -> List[str]:
feeds_str = os.getenv(
"RSS_FEEDS",
"https://feeds.bbci.co.uk/news/technology/rss.xml,"
"https://techcrunch.com/feed/,"
"https://www.wired.com/feed/rss"
)
return [feed.strip() for feed in feeds_str.split(",") if feed.strip()]
# Server Settings
host: str = os.getenv("HOST", "0.0.0.0")
port: int = int(os.getenv("PORT", "8000"))
debug: bool = os.getenv("DEBUG", "true").lower() == "true"
# Data Storage (paths relative to project root)
raw_news_dir: str = os.getenv("RAW_NEWS_DIR", "../data/raw_news")
processed_news_dir: str = os.getenv("PROCESSED_NEWS_DIR", "../data/processed_news")
vector_index_path: str = os.getenv("VECTOR_INDEX_PATH", "../data/news_vectors.faiss")
# Embedding Model (Local)
embedding_model: str = "./models/all-MiniLM-L6-v2"
# News Processing
max_articles_per_feed: int = 50
similarity_threshold: float = 0.7
settings = Settings()