Implement LLM-powered Investor Parser with CSV processing, SQL and vector database integration

- Added FastAPI application with a simple root endpoint.
- Developed LLMInvestorParser class for processing investor data from CSV files.
- Integrated OpenAI API for LLM enhancements and JSON cleaning.
- Implemented structured data extraction and saving to SQL database.
- Added functionality to save investor descriptions to ChromaDB for vector similarity search.
- Created command-line interface for processing files and searching investors.
- Added schema definitions for Investor and related data models using SQLAlchemy and Pydantic.
- Implemented logging for better traceability and error handling.
- Included requirements.txt for dependency management.
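
The CSV-to-SQL path described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: it uses the standard-library `csv` and `sqlite3` modules in place of pandas and SQLAlchemy. The names `InvestorRow`, `parse_investors`, `save_investors`, the `investors` table, and the `name`/`description` columns are all hypothetical, since the actual schema is not visible here.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass


@dataclass
class InvestorRow:
    # Hypothetical fields; the real Investor schema is not shown in this view.
    name: str
    description: str


def parse_investors(csv_text: str) -> list[InvestorRow]:
    """Read CSV text and keep only rows that have a non-empty name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        name = (record.get("name") or "").strip()
        if not name:
            continue  # skip rows that cannot be identified
        description = (record.get("description") or "").strip()
        rows.append(InvestorRow(name=name, description=description))
    return rows


def save_investors(conn: sqlite3.Connection, rows: list[InvestorRow]) -> int:
    """Upsert parsed rows into a SQL table and return the stored row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS investors "
        "(name TEXT PRIMARY KEY, description TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO investors (name, description) VALUES (?, ?)",
        [(r.name, r.description) for r in rows],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM investors").fetchone()[0]
```

Keying the table on `name` is an assumption that keeps re-running the import idempotent; a real pipeline would likely need a more robust identifier.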
bolade
2025-08-28 22:51:58 +01:00
commit bbf6af58f0
13 changed files with 5227 additions and 0 deletions
@@ -0,0 +1,16 @@
# Core dependencies
pandas>=2.0.0
sqlalchemy>=2.0.0
pydantic>=2.0.0
# Vector database
chromadb>=0.4.0
# LLM integration
openai>=1.0.0
# Environment management
python-dotenv>=1.0.0
# Additional dependencies for data processing
typing-extensions>=4.0.0