75a0a3fde7
- Added `get_latest_email_date()` function in `database.py` to retrieve the most recent email date for a given account and folder.
- Enhanced `fetch_folder_emails()` in `zoho_client.py` to intelligently determine the start date for fetching emails based on the latest email date in the database.
- Introduced `analyze_and_update_threads_async()` for asynchronous analysis of email threads, allowing concurrent processing.
- Created a synchronous wrapper `analyze_and_update_threads()` for easier integration.
- Updated `fetch_emails()` to support database session and account email parameters.
- Added comprehensive documentation in `AI_ANALYSIS_GUIDE.md` detailing the new AI analysis functionality.
- Implemented tests for the new features, including `test_fetch_with_db.py`, `test_ai_analysis.py`, and `test_single_analysis.py`.
- Added error handling and logging improvements throughout the codebase.
AI Thread Analysis with Asyncio
This document explains how to use the new async AI analysis functionality for email threads.
Overview
The new functionality adds AI-powered analysis to email threads, determining if they require attention (are "actionable") and generating concise summaries. It uses asyncio to process multiple threads concurrently for better performance.
Key Functions
analyze_and_update_threads()
This is the main function you'll use to analyze threads.
from src.database import analyze_and_update_threads

# Analyze all unanalyzed threads for an account
analyze_and_update_threads(
    account_email="user@company.com",
    max_concurrent=5,
    only_unanalyzed=True
)

# Analyze specific threads
analyze_and_update_threads(
    account_email="user@company.com",
    thread_ids=[1, 2, 3],
    max_concurrent=3
)
Parameters:
- account_email: The email account to process
- thread_ids: Optional list of specific thread IDs to analyze
- max_concurrent: Maximum number of concurrent AI analysis tasks (default: 5)
- only_unanalyzed: If True, only analyze threads that haven't been analyzed yet (default: True)
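The relationship between a blocking entry point and an async implementation can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `analyze_threads_async_sketch` and `analyze_threads_sketch` are hypothetical, and the integer return value is an assumption made only so the example has something to show.

```python
import asyncio
from typing import Optional


async def analyze_threads_async_sketch(
    account_email: str,
    thread_ids: Optional[list[int]] = None,
    max_concurrent: int = 5,
    only_unanalyzed: bool = True,
) -> int:
    """Hypothetical stand-in for the async analysis coroutine."""
    ids = thread_ids if thread_ids is not None else []
    await asyncio.sleep(0)  # placeholder for the real asynchronous work
    return len(ids)  # assumption: report how many threads were handled


def analyze_threads_sketch(
    account_email: str,
    thread_ids: Optional[list[int]] = None,
    max_concurrent: int = 5,
    only_unanalyzed: bool = True,
) -> int:
    """Blocking wrapper: starts an event loop, runs the coroutine, returns its result."""
    return asyncio.run(
        analyze_threads_async_sketch(
            account_email, thread_ids, max_concurrent, only_unanalyzed
        )
    )
```

A wrapper built on `asyncio.run()` cannot be called from code that is already running inside an event loop; in that situation the coroutine would be awaited directly instead.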
get_threads_needing_analysis()
Check which threads need analysis:
from src.database import get_threads_needing_analysis, SessionLocal
db = SessionLocal()
threads = get_threads_needing_analysis(db, "user@company.com")
print(f"Found {len(threads)} threads needing analysis")
db.close()
Database Schema Updates
The function updates the following Thread model fields:
- actionable: Boolean indicating if the thread requires action
- ai_summary: Text summary of the thread content
- ai_confidence: Float (0.0-1.0) confidence score
- last_analyzed_at: Timestamp of when analysis was performed
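To make the shape of these four fields concrete, here is a small sketch that uses a plain dataclass in place of the real Thread model. The class `ThreadAnalysisFields` and the helper `apply_analysis` are hypothetical names introduced for illustration; the actual model is presumably an ORM class with additional columns.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ThreadAnalysisFields:
    """Plain stand-in for the four analysis fields on the Thread model."""
    actionable: Optional[bool] = None
    ai_summary: Optional[str] = None
    ai_confidence: Optional[float] = None
    last_analyzed_at: Optional[datetime] = None


def apply_analysis(
    thread: ThreadAnalysisFields,
    actionable: bool,
    summary: str,
    confidence: float,
) -> ThreadAnalysisFields:
    """Copy one analysis result onto a thread and stamp the analysis time."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    thread.actionable = actionable
    thread.ai_summary = summary
    thread.ai_confidence = confidence
    thread.last_analyzed_at = datetime.now(timezone.utc)
    return thread
```

A thread whose `last_analyzed_at` is still `None` would be a natural candidate for "needs analysis", although the exact criterion used by `get_threads_needing_analysis()` is not stated here.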
Complete Workflow Example
Here's a complete workflow from email ingestion to AI analysis:
from src.database import (
    SessionLocal,
    ingest_emails,
    analyze_and_update_threads,
    get_threads_requiring_reply
)

# 1. Ingest emails (using your existing email fetching logic)
db = SessionLocal()
try:
    # Assuming you have fetched emails from your email provider
    emails = [...]  # Your email data
    ingest_emails(db, "user@company.com", emails)

    # 2. Run AI analysis on new threads
    analyze_and_update_threads(
        account_email="user@company.com",
        max_concurrent=5,
        only_unanalyzed=True
    )

    # 3. Get threads that need replies and are actionable
    reply_threads = get_threads_requiring_reply(db, "user@company.com")
    actionable_threads = [t for t in reply_threads if t.actionable]
    print(f"Found {len(actionable_threads)} actionable threads requiring replies")
finally:
    db.close()
AI Analysis Details
The AI analysis:
- Uses the Groq API if the GROQ_API_KEY environment variable is set
- Falls back to heuristic analysis if Groq is unavailable
- Analyzes the last 4 messages in each thread by default
- Generates summaries of ≤80 words
- Identifies questions, requests, and actionable items
- Ignores automated/newsletter emails
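The fallback path described above can be sketched roughly as follows. This is an illustrative guess at what a heuristic analysis might look like, not the project's implementation: the marker list, the regular expression, the message dictionary shape (`sender`, `body`), and the function names are all assumptions.

```python
import os
import re

# Assumed markers for automated or newsletter mail (illustrative only).
AUTOMATED_MARKERS = ("no-reply", "noreply", "newsletter", "unsubscribe")

# Assumed signals for questions and requests (illustrative only).
REQUEST_PATTERN = re.compile(
    r"\?|\b(please|could you|can you|let me know)\b", re.IGNORECASE
)


def groq_configured() -> bool:
    """True when the GROQ_API_KEY environment variable is set and non-empty."""
    return bool(os.environ.get("GROQ_API_KEY"))


def heuristic_analysis(
    messages: list[dict], max_messages: int = 4, max_words: int = 80
) -> dict:
    """Analyze the last few messages of a thread without calling an AI service."""
    recent = messages[-max_messages:]
    # Drop messages that look automated before judging actionability.
    human = [
        m for m in recent
        if not any(
            marker in (m["sender"] + " " + m["body"]).lower()
            for marker in AUTOMATED_MARKERS
        )
    ]
    actionable = any(REQUEST_PATTERN.search(m["body"]) for m in human)
    # Truncate the combined text to the word limit as a crude summary.
    words = " ".join(m["body"] for m in human).split()
    return {"actionable": actionable, "ai_summary": " ".join(words[:max_words])}
```

A caller would check `groq_configured()` first and use a function like `heuristic_analysis()` only when no key is present or the API call fails.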
Performance
- Uses asyncio for concurrent processing
- Configurable concurrency limit (default: 5 concurrent analyses)
- AI analysis runs in a thread pool so blocking calls do not stall the event loop
- Efficient database operations with a single commit per batch
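The combination of a concurrency limit and a thread pool can be sketched with `asyncio.Semaphore` and `asyncio.to_thread` (Python 3.9+). The function `analyze_blocking` is a hypothetical placeholder for the real, blocking AI call, and the overall structure is an assumption about how such a pipeline might be arranged.

```python
import asyncio


def analyze_blocking(thread_id: int) -> dict:
    """Hypothetical blocking analysis call (for example, an HTTP request)."""
    return {"thread_id": thread_id, "actionable": thread_id % 2 == 0}


async def analyze_all(thread_ids: list[int], max_concurrent: int = 5) -> list[dict]:
    """Analyze threads concurrently, never running more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def analyze_one(thread_id: int) -> dict:
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays free.
            return await asyncio.to_thread(analyze_blocking, thread_id)

    # gather() returns results in the same order as the input IDs.
    return await asyncio.gather(*(analyze_one(t) for t in thread_ids))
```

Lowering `max_concurrent` trades throughput for less pressure on the AI service, which is why a small starting value is a reasonable default.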
Error Handling
- Gracefully handles individual thread analysis failures
- Continues processing other threads if one fails
- Provides detailed error logging
- Automatically rolls back database changes on failure
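One way to get this behavior is to collect per-thread failures with `asyncio.gather(..., return_exceptions=True)` and then write the successful results in a single transaction. The sketch below uses the standard-library `sqlite3` module and a hypothetical `threads` table purely for illustration; the project itself appears to use a session object (`SessionLocal`), so the details will differ.

```python
import asyncio
import logging
import sqlite3

logger = logging.getLogger("thread_analysis_sketch")


async def analyze_one(thread_id: int) -> tuple[int, bool]:
    """Hypothetical analysis step; thread 2 fails to simulate an API error."""
    if thread_id == 2:
        raise RuntimeError("simulated analysis failure")
    return thread_id, True


async def analyze_batch(thread_ids: list[int]) -> list[tuple[int, bool]]:
    """Analyze every thread; log failures and keep only the successful results."""
    outcomes = await asyncio.gather(
        *(analyze_one(t) for t in thread_ids), return_exceptions=True
    )
    results = []
    for thread_id, outcome in zip(thread_ids, outcomes):
        if isinstance(outcome, Exception):
            logger.error("Analysis failed for thread %s: %s", thread_id, outcome)
            continue
        results.append(outcome)
    return results


def save_results(conn: sqlite3.Connection, results: list[tuple[int, bool]]) -> None:
    """Write all results with one commit; roll back if any write fails."""
    try:
        conn.executemany(
            "UPDATE threads SET actionable = ? WHERE id = ?",
            [(int(actionable), thread_id) for thread_id, actionable in results],
        )
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        logger.exception("Saving analysis results failed; changes rolled back")
        raise
```

With this arrangement a failure in one thread's analysis only costs that thread, while a failure during the write leaves the database unchanged.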
Usage Tips
- Start with small batches: Use max_concurrent=3 initially to avoid overwhelming the AI service
- Regular analysis: Run analysis after each email ingestion cycle
- Focus on actionable threads: Prioritize threads that are both requires_reply=True and actionable=True
- Monitor confidence scores: Lower confidence may indicate uncertain analysis
- Environment setup: Set GROQ_API_KEY for better AI analysis quality
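The prioritization and confidence tips can be combined in a small filter. This sketch uses a hypothetical `ThreadView` dataclass in place of the real Thread model, and the 0.5 review threshold is an arbitrary assumption rather than a value taken from the project.

```python
from dataclasses import dataclass


@dataclass
class ThreadView:
    """Hypothetical stand-in carrying only the fields used for prioritization."""
    id: int
    requires_reply: bool
    actionable: bool
    ai_confidence: float


def prioritize(
    threads: list[ThreadView], low_confidence: float = 0.5
) -> tuple[list[ThreadView], list[ThreadView]]:
    """Return (threads to handle, most confident first; subset worth a manual check)."""
    selected = [t for t in threads if t.requires_reply and t.actionable]
    selected.sort(key=lambda t: t.ai_confidence, reverse=True)
    needs_review = [t for t in selected if t.ai_confidence < low_confidence]
    return selected, needs_review
```

Threads in the second list are still treated as actionable, but their low score suggests a human should confirm the classification.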
Testing
Use the provided test scripts:
# Test the complete workflow
python3 example_workflow.py
# Test single thread analysis
python3 test_single_analysis.py
# Reset analysis data for testing
python3 reset_analysis.py
Integration with Existing Code
To integrate with your existing email processing:
# After your existing email ingestion
from src.database import analyze_and_update_threads
def process_emails(account_email: str):
    # Your existing email fetching and ingestion code
    fetch_and_ingest_emails(account_email)

    # Add AI analysis
    analyze_and_update_threads(
        account_email=account_email,
        only_unanalyzed=True
    )
This ensures that new threads are automatically analyzed for actionability after each email sync.