Commit Graph

4 Commits

Author SHA1 Message Date
bolade 3842171549 Update .gitignore to exclude preprocessor directory; refactor find_similar_investors function to improve similarity scoring based on investor characteristics and add limit parameter for results. 2025-10-01 23:29:29 +01:00
bolade abac19c6ae Update .gitignore to exclude __pycache__ directories and modify schemas to allow optional fields for better flexibility; adjust batch size in InvestorProcessor for improved processing efficiency. 2025-09-26 15:56:29 +01:00
bolade 65b5df3a43 Add CompanyTable model and refactor query handling; update requirements for new dependencies 2025-09-02 12:22:50 +01:00
bolade bbf6af58f0 Implement LLM-powered Investor Parser with CSV processing, SQL and vector database integration 2025-08-28 22:51:58 +01:00
- Added FastAPI application with a simple root endpoint.
- Developed LLMInvestorParser class for processing investor data from CSV files.
- Integrated OpenAI API for LLM enhancements and JSON cleaning.
- Implemented structured data extraction and saving to SQL database.
- Added functionality to save investor descriptions to ChromaDB for vector similarity search.
- Created command-line interface for processing files and searching investors.
- Added schema definitions for Investor and related data models using SQLAlchemy and Pydantic.
- Implemented logging for better traceability and error handling.
- Included requirements.txt for dependency management.