AI Bookkeeper - Data Science Engine
An AI-powered receipt-to-transaction matching engine built on the Groq LLM. It is a Data Science (DS) Engine that provides intelligent matching capabilities for backend applications.
🎯 Purpose
This Data Science Engine receives QuickBooks transaction data from backend applications and provides:
- AI-powered receipt processing (OCR and data extraction)
- Intelligent receipt-transaction matching with confidence scores
- Google Drive integration for batch receipt processing
- Configurable AI rules for business logic
- Feedback logging for continuous improvement
🚀 Quick Start
1. Install Dependencies
pip install -r requirements.txt
2. Configure API Keys
The Groq API key is already configured in config.py
3. Start the DS Engine
python main.py
4. Access API Documentation
- Swagger UI: http://localhost:8343/docs
- ReDoc: http://localhost:8343/redoc
📋 API Endpoints
QuickBooks Data Import
- POST /transactions/import/quickbooks - Import and convert QuickBooks transactions
Receipt Processing
- POST /upload - Upload receipt documents (PDF/images)
- POST /process/{file_id} - Extract data from uploaded documents
- GET /documents - List all processed documents
Google Drive Integration
- POST /drive/sync - Sync and process receipts from Google Drive
- GET /drive/folders - List accessible Google Drive folders
- GET /drive/folder/{folder_id} - Get folder information
AI Matching Engine
- POST /match - Match receipts to transactions using AI
- POST /approve - Approve or reject AI matches
AI Rules Management
- POST /rules - Add new AI rules
- GET /rules - List all active rules
- DELETE /rules/{rule_name} - Delete rules
System Monitoring
- GET /stats - Get system statistics and performance metrics
🔧 Core Components
AIMatcher (ai_matcher.py)
- Uses Groq LLM to compare receipts and transactions
- Provides confidence scores and reasoning
- Configurable matching criteria (amount, date, vendor)
AIRulesEngine (ai_rules.py)
- Applies business rules for auto-approval and categorization
- Configurable rule conditions and actions
- Supports system and user-generated rules
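To make the idea of rule conditions and actions concrete, here is a minimal sketch of how such rules might be represented and evaluated. The `Rule` class, the `first_matching_action` helper, the action labels, and the example rules are all hypothetical; the actual structures in ai_rules.py may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: a named condition paired with an action label."""
    name: str
    condition: Callable[[Dict], bool]  # receives a match-result dict
    action: str                        # e.g. "auto_approve" (assumed label)
    source: str = "user"               # "system" or "user"


def first_matching_action(match: Dict, rules: List[Rule]) -> Optional[str]:
    """Return the action of the first rule whose condition holds, else None."""
    for rule in rules:
        if rule.condition(match):
            return rule.action
    return None


# Example rules (illustrative only)
rules = [
    Rule(
        name="auto_approve_high_confidence",
        condition=lambda m: m["confidence_score"] >= 0.95,
        action="auto_approve",
        source="system",
    ),
    Rule(
        name="categorize_coffee",
        condition=lambda m: "starbucks" in m["receipt_vendor"].lower(),
        action="categorize:meals",
    ),
]
```

In this sketch rules are checked in order and the first hit wins, which keeps the outcome predictable when several rules could apply to the same match.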
DocumentProcessor (document_processor.py)
- AI-powered receipt data extraction
- Supports PDF and image formats
- Uses Groq vision model for OCR
MatchingEngine (matching_engine.py)
- Main orchestrator combining all components
- Handles the complete matching workflow
- Provides statistics and feedback logging
FeedbackLogger (feedback_logger.py)
- Tracks manual overrides for AI training
- Maintains audit trail of user decisions
- Enables continuous model improvement
📊 Configuration
Edit config.py to adjust:
- Confidence threshold (default: 0.8)
- Date tolerance days (default: 7)
- Amount tolerance percent (default: 5%)
- Groq API key (already configured)
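As a rough illustration of these settings, the sketch below loads them from environment variables and falls back to the defaults listed above. The `Settings` class, the `load_settings` function, and the variable names (`CONFIDENCE_THRESHOLD`, `DATE_TOLERANCE_DAYS`, `AMOUNT_TOLERANCE_PERCENT`, `GROQ_API_KEY`) are assumptions for illustration and are not taken from config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the tunable values listed above."""
    confidence_threshold: float = 0.8
    date_tolerance_days: int = 7
    amount_tolerance_percent: float = 5.0
    groq_api_key: str = ""


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of strings (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.8")),
        date_tolerance_days=int(env.get("DATE_TOLERANCE_DAYS", "7")),
        amount_tolerance_percent=float(env.get("AMOUNT_TOLERANCE_PERCENT", "5")),
        groq_api_key=env.get("GROQ_API_KEY", ""),
    )
```

Passing the mapping explicitly, as in `load_settings({"DATE_TOLERANCE_DAYS": "3"})`, makes it easy to try different values without touching the real environment.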
🔄 Integration Workflow
1. Backend Sends QuickBooks Data
import requests

# Backend sends QuickBooks transactions
response = requests.post(
    "http://localhost:8343/transactions/import/quickbooks",
    json={
        "transactions": [
            {
                "id": "QB_TXN_123",
                "txn_date": "2024-01-15",
                "amount": 12.50,
                "payee_name": "Starbucks",
                "memo": "Coffee purchase"
            }
        ]
    }
)
2. Process Receipts
# Sync from Google Drive
response = requests.post(
    "http://localhost:8343/drive/sync",
    json={"folder_id": "your_folder_id"}
)

# Or upload directly
# (receipt_file is a file object opened in binary mode)
response = requests.post(
    "http://localhost:8343/upload",
    files={"file": receipt_file}
)
3. AI Matching
# Match receipts to transactions
response = requests.post(
    "http://localhost:8343/match",
    json={
        "receipts": processed_receipts,
        "transactions": converted_transactions
    }
)
4. User Feedback
# Approve or reject matches
response = requests.post(
    "http://localhost:8343/approve",
    json={
        "match_id": "match_123",
        "user_id": "user_456",
        "action": "approve"
    }
)
🎯 Key Features
- AI-powered matching with confidence scores
- Rule-based auto-approval and categorization
- Feedback logging for continuous improvement
- Configurable matching parameters
- Google Drive integration for batch processing
- JSON API for easy backend integration
- Comprehensive error handling
📝 Data Formats
QuickBooks Transaction Input
{
    "id": "string",
    "txn_date": "YYYY-MM-DD",
    "amount": 0.00,
    "payee_name": "string",
    "memo": "string (optional)",
    "account_name": "string (optional)",
    "txn_type": "string (optional)"
}
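A backend may want to check transactions against this shape before sending them. The following is a minimal sketch of such a check, treating every field not marked optional above as required. The `validate_transaction` helper and its error messages are hypothetical and not part of the engine.

```python
from datetime import datetime

# Fields not marked "(optional)" in the format above
REQUIRED_FIELDS = ("id", "txn_date", "amount", "payee_name")


def validate_transaction(txn: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in txn:
            errors.append(f"missing field: {field}")
    if "txn_date" in txn:
        try:
            datetime.strptime(txn["txn_date"], "%Y-%m-%d")
        except (ValueError, TypeError):
            errors.append("txn_date must be YYYY-MM-DD")
    if "amount" in txn:
        amount = txn["amount"]
        # bool is a subclass of int, so reject it explicitly
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            errors.append("amount must be a number")
    return errors
```

Returning a list of problems rather than raising on the first one lets the caller report everything wrong with a transaction in a single pass.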
Match Result Output
{
    "receipt_id": "string",
    "transaction_id": "string",
    "confidence_score": 0.95,
    "match_reason": "string",
    "receipt_vendor": "string",
    "receipt_amount": 0.00,
    "transaction_vendor": "string",
    "transaction_amount": 0.00
}
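On the consuming side, match results in this shape could be parsed and split by confidence, for example to decide which matches need manual review. This is only a sketch: it assumes the results arrive as a JSON array of objects with exactly the fields above, and the `MatchResult` class and both helper functions are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MatchResult:
    """Mirrors the match result fields documented above."""
    receipt_id: str
    transaction_id: str
    confidence_score: float
    match_reason: str
    receipt_vendor: str
    receipt_amount: float
    transaction_vendor: str
    transaction_amount: float


def parse_matches(payload: str) -> List[MatchResult]:
    """Parse a JSON array of match objects (assumed response shape)."""
    return [MatchResult(**item) for item in json.loads(payload)]


def split_by_confidence(
    matches: List[MatchResult], threshold: float = 0.8
) -> Tuple[List[MatchResult], List[MatchResult]]:
    """Separate matches at or above the threshold from those below it."""
    confident = [m for m in matches if m.confidence_score >= threshold]
    needs_review = [m for m in matches if m.confidence_score < threshold]
    return confident, needs_review
```

The default threshold of 0.8 mirrors the documented default confidence threshold.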
🔍 AI Matching Criteria
The engine uses three primary criteria for matching:
- Amount Similarity - Compares receipt and transaction amounts (default tolerance: 5%)
- Date Proximity - Checks how close the receipt and transaction dates are (default tolerance: 7 days)
- Vendor Matching - AI-powered vendor name comparison
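The two numeric criteria can be sketched as simple tolerance checks. This is an illustrative approximation, not the engine's actual logic: it assumes the percentage tolerance is measured relative to the transaction amount, and the function names are hypothetical. Vendor comparison is left out because it is delegated to the LLM.

```python
from datetime import date


def amounts_close(receipt_amount: float, txn_amount: float,
                  tolerance_percent: float = 5.0) -> bool:
    """True when the amounts differ by at most tolerance_percent
    of the transaction amount (assumed reference value)."""
    if txn_amount == 0:
        return receipt_amount == 0
    diff = abs(receipt_amount - txn_amount)
    return diff <= abs(txn_amount) * tolerance_percent / 100.0


def dates_close(receipt_date: date, txn_date: date,
                tolerance_days: int = 7) -> bool:
    """True when the dates are no more than tolerance_days apart."""
    return abs((receipt_date - txn_date).days) <= tolerance_days
```

For example, a 12.50 receipt against a 12.00 transaction differs by 0.50, which is within 5% of 12.00 (0.60), so `amounts_close(12.50, 12.00)` is true under this assumption.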
🚀 Production Deployment
For production deployment:
- Replace in-memory storage with a database
- Configure proper authentication
- Set up monitoring and logging
- Use environment variables for configuration
- Implement proper error handling and retries
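For the retry item, one common pattern is retrying with exponential backoff. The sketch below is a generic, standard-library-only helper; the name `call_with_retries` and its parameters are hypothetical, and a real deployment would likely catch narrower exception types than `Exception`.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and retry.

    Re-raises the last exception once all attempts are used up.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # narrow this in real code
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

A call to one of the engine's endpoints could then be wrapped in a zero-argument function or lambda and passed as `func`. The `sleep` parameter exists so that the waiting can be replaced in tests.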
📞 Support
This Data Science Engine is designed to be integrated with backend applications that handle:
- QuickBooks API connections
- User interface and workflows
- Data persistence and management
- External integrations
The engine focuses purely on AI/ML capabilities and provides a clean JSON API for backend integration.