
DS Task AI News

Project Overview

DS Task AI News is an AI-powered news retrieval system. It gathers articles from the RSS feeds of various online news portals, stores them in a vector database, and lets users discover articles relevant to their interests. Related articles are found with Cohere embeddings and re-ranking, and Groq provides LLM-based analysis.

Features

  • News Aggregation : Fetches news using RSS feeds from various online portals.
  • Vector Database Storage : Stores news articles in a vector database for efficient similarity searches.
  • AI-powered Recommendations : Uses Cohere embeddings and re-ranking to provide relevant news recommendations.
  • LLM-powered Analysis : Utilizes Groq for AI-driven insights and processing.
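
The similarity search behind the recommendations can be illustrated with a minimal sketch. It assumes article embeddings are already available as lists of floats (in this project they would come from Cohere, which is not called here). The names cosine_similarity and top_k_similar and the toy vectors are hypothetical, and a real vector database would replace the linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # A zero vector has no direction; treat it as dissimilar.
    return dot / (norm_a * norm_b)

def top_k_similar(query_vec, stored, k=3):
    """Return the ids of the k stored vectors most similar to query_vec.

    stored maps an article id to its embedding (hypothetical layout).
    """
    scored = [(cosine_similarity(query_vec, vec), art_id)
              for art_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [art_id for _, art_id in scored[:k]]

# Toy 3-dimensional vectors standing in for real embeddings.
stored = {
    "ai-chip-launch": [0.9, 0.1, 0.0],
    "football-final": [0.0, 0.2, 0.9],
    "new-llm-release": [0.8, 0.3, 0.1],
}
print(top_k_similar([1.0, 0.2, 0.0], stored, k=2))
# → ['ai-chip-launch', 'new-llm-release']
```

In a two-stage setup, the ids returned by such a search would then be passed to a re-ranking step to produce the final order.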

Tech Stack

  • LLM : Groq
  • Search : RSS Feeds for news aggregation
  • Embeddings & Re-Ranking : Cohere
  • Vector Database : (e.g., Pinecone, Weaviate, or FAISS)
  • Backend : FastAPI

File Structure

DS_Task_AI_News/
│-- backend/
│   │-- main.py  # FastAPI backend
│   │-- news_fetcher.py  # Fetches news using RSS feeds
│   │-- vector_store.py  # Handles vector database operations
│   │-- embeddings.py  # Generates embeddings using Cohere
│   │-- recommender.py  # Fetches related news articles
│   │-- config.py  # Configuration settings
│   │-- requirements.txt  # Dependencies
│
│-- data/
│   │-- raw_news/  # Stores raw news articles before processing
│   │-- processed_news/  # Stores cleaned and processed articles
│
│-- docs/
│   │-- README.md  # Documentation for new developers
│   │-- API_Documentation.md  # API details
│
│-- .env  # Environment variables
│-- .gitignore  # Git ignore file
│-- LICENSE  # License information
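
For orientation, a config.py of the kind listed above might read its settings from environment variables. This is a sketch only: the variable names COHERE_API_KEY, GROQ_API_KEY, and RSS_FEEDS are assumptions rather than names confirmed by the project, and loading the .env file itself is left out.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    cohere_api_key: str
    groq_api_key: str
    rss_feeds: tuple

def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Defaults to os.environ; the variable names are hypothetical.
    """
    env = os.environ if env is None else env
    # RSS_FEEDS is assumed to be a comma-separated list of feed URLs.
    feeds = tuple(
        url.strip() for url in env.get("RSS_FEEDS", "").split(",") if url.strip()
    )
    return Settings(
        cohere_api_key=env.get("COHERE_API_KEY", ""),
        groq_api_key=env.get("GROQ_API_KEY", ""),
        rss_feeds=feeds,
    )
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.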

Setup & Installation

1. Clone the Repository

git clone http://23.29.118.76:3000/Test/ds_task_ai_news
cd ds_task_ai_news

2. Set Up the Backend

cd backend
pip install -r requirements.txt
python main.py

Fetching News Using RSS Feeds

  • News is aggregated from RSS feeds of different news sources.
  • The news_fetcher.py script pulls data from RSS feeds, extracts relevant information, and stores it in the database.

Example RSS Fetching Code (Python)

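import re

import feedparser

def fetch_rss_news(feed_url):
    """Parse an RSS feed and return its entries as a list of article dicts."""
    feed = feedparser.parse(feed_url)
    articles = []
    for entry in feed.entries:
        title = entry.get("title", "")
        articles.append({
            "title": title,
            # Not every feed provides a summary or a publication date.
            "content": entry.get("summary", ""),
            "date": entry.get("published", ""),
            # URL-safe slug: lowercase, runs of non-alphanumerics become "-".
            "slug": re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-"),
            # Placeholder values applied to every article.
            "categories": ["Technology", "AI and Innovation"],
            "tags": ["AI", "Technology", "Innovation"],
        })
    return articles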
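
Since news is aggregated from several sources, the per-feed results need to be merged at some point. One possible sketch follows. aggregate_feeds is a hypothetical helper that takes the fetch function as a parameter (for example fetch_rss_news), and it assumes that articles sharing a slug are duplicates, which the project may handle differently.

```python
def aggregate_feeds(feed_urls, fetch):
    """Fetch every feed and merge the results, keeping the first article per slug."""
    seen = set()
    merged = []
    for url in feed_urls:
        for article in fetch(url):
            slug = article["slug"]
            if slug in seen:
                continue  # Same story already collected from another feed.
            seen.add(slug)
            merged.append(article)
    return merged
```

A call would look like aggregate_feeds(["https://example.com/rss"], fetch_rss_news), where the feed URL is a placeholder.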

API Endpoints

  • GET /fetch-news: Fetches news from RSS feeds.
  • GET /recommend-news?article_id=xyz: Retrieves similar news based on the selected article.
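
The endpoints can be called from any HTTP client. The sketch below uses only the Python standard library. The base URL http://localhost:8000 is an assumption (this README does not state the host or port), as are the helper names and the expectation that responses are JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # Assumed; adjust to where the backend runs.

def build_url(path, **params):
    """Join the base URL, a path, and optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"

def get_json(url, timeout=10):
    """Send a GET request and decode the response body as JSON (assumed format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

fetch_url = build_url("/fetch-news")
recommend_url = build_url("/recommend-news", article_id="xyz")
print(recommend_url)
# → http://localhost:8000/recommend-news?article_id=xyz
```

With the backend running, get_json(fetch_url) would trigger a fetch and get_json(recommend_url) would return the recommendations for article xyz.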