email_alerts_v2/AI_ANALYSIS_GUIDE.md
bolade 75a0a3fde7 feat: Implement async AI analysis for email threads
- Added `get_latest_email_date()` function in `database.py` to retrieve the most recent email date for a given account and folder.
- Enhanced `fetch_folder_emails()` in `zoho_client.py` to intelligently determine the start date for fetching emails based on the latest email date in the database.
- Introduced `analyze_and_update_threads_async()` for asynchronous analysis of email threads, allowing concurrent processing.
- Created a synchronous wrapper `analyze_and_update_threads()` for easier integration.
- Updated `fetch_emails()` to support database session and account email parameters.
- Added comprehensive documentation in `AI_ANALYSIS_GUIDE.md` detailing the new AI analysis functionality.
- Implemented tests for the new features, including `test_fetch_with_db.py`, `test_ai_analysis.py`, and `test_single_analysis.py`.
- Added error handling and logging improvements throughout the codebase.
2025-08-11 23:20:20 +01:00

AI Thread Analysis with Asyncio

This document explains how to use the new async AI analysis functionality for email threads.

Overview

The analysis examines each email thread, decides whether it requires attention (is "actionable"), and generates a concise summary. It uses asyncio to process multiple threads concurrently, which reduces total analysis time when many threads are pending.

Key Functions

analyze_and_update_threads()

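This is the main entry point for analyzing threads. It is a synchronous wrapper around analyze_and_update_threads_async(), so it can be called from ordinary (non-async) code.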

from src.database import analyze_and_update_threads

# Analyze all unanalyzed threads for an account
analyze_and_update_threads(
    account_email="user@company.com",
    max_concurrent=5,
    only_unanalyzed=True
)

# Analyze specific threads
analyze_and_update_threads(
    account_email="user@company.com", 
    thread_ids=[1, 2, 3],
    max_concurrent=3
)

Parameters:

  • account_email: The email account to process
  • thread_ids: Optional list of specific thread IDs to analyze
  • max_concurrent: Maximum number of concurrent AI analysis tasks (default: 5)
  • only_unanalyzed: If True, only analyze threads that haven't been analyzed yet (default: True)
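
The relationship between the synchronous wrapper and its async counterpart can be sketched as follows. This is a minimal illustration, not the project's actual code: `analyze_threads_async` and `analyze_threads` are hypothetical stand-ins for `analyze_and_update_threads_async()` and `analyze_and_update_threads()`, and the analysis itself is replaced by a placeholder.

```python
import asyncio


async def analyze_threads_async(thread_ids):
    """Hypothetical stand-in for analyze_and_update_threads_async()."""
    await asyncio.sleep(0)  # placeholder for the real concurrent analysis
    return {thread_id: "analyzed" for thread_id in thread_ids}


def analyze_threads(thread_ids):
    """Synchronous wrapper: start an event loop, run the coroutine, return its result."""
    return asyncio.run(analyze_threads_async(thread_ids))


result = analyze_threads([1, 2, 3])
print(result)
```

Note that `asyncio.run()` raises `RuntimeError` when called from a thread that already has a running event loop. Code that is itself async would presumably need to await the async variant directly rather than call the wrapper.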

get_threads_needing_analysis()

Check which threads need analysis:

from src.database import get_threads_needing_analysis, SessionLocal

db = SessionLocal()
try:
    threads = get_threads_needing_analysis(db, "user@company.com")
    print(f"Found {len(threads)} threads needing analysis")
finally:
    # Always release the session, even if the query raises
    db.close()

Database Schema Updates

analyze_and_update_threads() writes its results to the following Thread model fields:

  • actionable: Boolean indicating if the thread requires action
  • ai_summary: Text summary of the thread content
  • ai_confidence: Float (0.0-1.0) confidence score
  • last_analyzed_at: Timestamp of when analysis was performed
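
To make the shape of these fields concrete, here is a small sketch using a plain dataclass. It is not the project's Thread model (which presumably lives in `database.py` as an ORM class); `ThreadAnalysis` is a hypothetical name, and the range check on `ai_confidence` is an assumption derived from the 0.0-1.0 range stated above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ThreadAnalysis:
    """Hypothetical container mirroring the four analysis fields."""
    actionable: bool
    ai_summary: str
    ai_confidence: float
    last_analyzed_at: Optional[datetime] = None

    def __post_init__(self):
        # Assumed validation: the guide states a 0.0-1.0 range
        if not 0.0 <= self.ai_confidence <= 1.0:
            raise ValueError("ai_confidence must be between 0.0 and 1.0")
        # Stamp the analysis time if the caller did not provide one
        if self.last_analyzed_at is None:
            self.last_analyzed_at = datetime.now(timezone.utc)


analysis = ThreadAnalysis(
    actionable=True,
    ai_summary="Customer asks for an updated invoice.",
    ai_confidence=0.9,
)
```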

Complete Workflow Example

Here's a complete workflow from email ingestion to AI analysis:

from src.database import (
    SessionLocal, 
    ingest_emails,
    analyze_and_update_threads,
    get_threads_requiring_reply
)

# 1. Ingest emails (using your existing email fetching logic)
db = SessionLocal()
try:
    # Assuming you have fetched emails from your email provider
    emails = [...] # Your email data
    ingest_emails(db, "user@company.com", emails)
    
    # 2. Run AI analysis on new threads
    analyze_and_update_threads(
        account_email="user@company.com",
        max_concurrent=5,
        only_unanalyzed=True
    )
    
    # 3. Get threads that need replies and are actionable
    reply_threads = get_threads_requiring_reply(db, "user@company.com")
    actionable_threads = [t for t in reply_threads if t.actionable]
    
    print(f"Found {len(actionable_threads)} actionable threads requiring replies")
    
finally:
    db.close()

AI Analysis Details

The AI analysis:

  • Uses the Groq API if the GROQ_API_KEY environment variable is set
  • Falls back to heuristic analysis if Groq is unavailable
  • Analyzes the last 4 messages in each thread by default
  • Generates summaries of ≤80 words
  • Identifies questions, requests, and actionable items
  • Ignores automated/newsletter emails
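
The behaviour listed above can be approximated with a short sketch. Everything here is an assumption for illustration: the function names, the keyword lists, and the request pattern are hypothetical, and the project's real heuristic may differ. Only the backend choice based on `GROQ_API_KEY`, the four-message window, and the 80-word summary limit come from this guide.

```python
import os
import re

# Hypothetical markers for automated or newsletter mail
AUTOMATED_MARKERS = ("unsubscribe", "no-reply", "noreply", "newsletter")
# Hypothetical phrases that suggest a request
REQUEST_PATTERN = re.compile(r"\b(please|could you|can you|let me know)\b", re.IGNORECASE)


def choose_backend(env=None):
    """Pick 'groq' when GROQ_API_KEY is set, otherwise the heuristic fallback."""
    env = os.environ if env is None else env
    return "groq" if env.get("GROQ_API_KEY") else "heuristic"


def heuristic_analysis(messages, max_messages=4, max_words=80):
    """Classify a thread from its most recent messages and build a short summary."""
    recent = messages[-max_messages:]          # only the last few messages
    text = " ".join(recent)
    lowered = text.lower()
    automated = any(marker in lowered for marker in AUTOMATED_MARKERS)
    has_request = "?" in text or bool(REQUEST_PATTERN.search(text))
    summary = " ".join(text.split()[:max_words])  # truncate to the word limit
    return {"actionable": has_request and not automated, "summary": summary}


print(choose_backend({}))
print(heuristic_analysis(["Hi team", "Could you send the report?"]))
```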

Performance

  • Uses asyncio for concurrent processing
  • Configurable concurrency limit (default: 5 concurrent analyses)
  • AI analysis runs in thread pool to avoid blocking
  • Commits database changes once per batch rather than once per thread
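
The concurrency pattern described above (a semaphore to cap parallelism, plus a thread pool for the blocking AI call) can be sketched like this. `blocking_analysis` is a hypothetical placeholder for the real AI request, and `asyncio.to_thread` requires Python 3.9 or later; the project may use a different mechanism such as `loop.run_in_executor`.

```python
import asyncio
import time


def blocking_analysis(thread_id):
    """Hypothetical blocking call standing in for the AI request."""
    time.sleep(0.01)
    return {"thread_id": thread_id, "actionable": thread_id % 2 == 0}


async def analyze_all(thread_ids, max_concurrent=5):
    # At most max_concurrent analyses are in flight at any moment
    semaphore = asyncio.Semaphore(max_concurrent)

    async def analyze_one(thread_id):
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays free
            return await asyncio.to_thread(blocking_analysis, thread_id)

    # gather() returns results in the same order as the input IDs
    return await asyncio.gather(*(analyze_one(t) for t in thread_ids))


results = asyncio.run(analyze_all(range(1, 7), max_concurrent=3))
print([r["thread_id"] for r in results])  # [1, 2, 3, 4, 5, 6]
```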

Error Handling

  • Gracefully handles individual thread analysis failures
  • Continues processing other threads if one fails
  • Provides detailed error logging
  • Automatically rolls back database changes on failure
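
One common way to keep a single failure from aborting the whole batch is `asyncio.gather(..., return_exceptions=True)`. The sketch below shows that technique. It is an assumption about how such isolation could be done, not a description of the project's implementation, and `analyze_one` with its simulated failure is purely illustrative.

```python
import asyncio
import logging

logger = logging.getLogger("ai_analysis_sketch")


async def analyze_one(thread_id):
    """Hypothetical analysis that fails for one specific thread."""
    if thread_id == 2:
        raise RuntimeError("simulated AI failure")
    return thread_id, "ok"


async def analyze_batch(thread_ids):
    # return_exceptions=True delivers exceptions as results instead of raising
    outcomes = await asyncio.gather(
        *(analyze_one(t) for t in thread_ids), return_exceptions=True
    )
    succeeded, failed = [], {}
    for thread_id, outcome in zip(thread_ids, outcomes):
        if isinstance(outcome, Exception):
            logger.warning("Thread %s failed: %s", thread_id, outcome)
            failed[thread_id] = str(outcome)
        else:
            succeeded.append(outcome)
    return succeeded, failed


succeeded, failed = asyncio.run(analyze_batch([1, 2, 3]))
print(succeeded)  # [(1, 'ok'), (3, 'ok')]
print(failed)     # {2: 'simulated AI failure'}
```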

Usage Tips

  1. Start with small batches: Use max_concurrent=3 initially to avoid overwhelming the AI service
  2. Regular analysis: Run analysis after each email ingestion cycle
  3. Focus on actionable threads: Prioritize threads where both requires_reply and actionable are True
  4. Monitor confidence scores: A low ai_confidence value means the classification is uncertain, so consider reviewing those threads manually
  5. Environment setup: Set GROQ_API_KEY to use the Groq API instead of the heuristic fallback
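
Tips 3 and 4 can be combined into a small triage step, sketched below. The `triage` helper and the 0.6 confidence threshold are hypothetical choices for illustration. The attribute names `requires_reply`, `actionable`, and `ai_confidence` follow this guide, and `SimpleNamespace` objects stand in for real Thread rows.

```python
from types import SimpleNamespace


def triage(threads, min_confidence=0.6):
    """Split actionable threads needing a reply into confident and needs-review groups."""
    candidates = [t for t in threads if t.requires_reply and t.actionable]
    confident = [t for t in candidates if t.ai_confidence >= min_confidence]
    needs_review = [t for t in candidates if t.ai_confidence < min_confidence]
    return confident, needs_review


# Stand-in objects; real code would pass Thread rows from the database
threads = [
    SimpleNamespace(id=1, requires_reply=True, actionable=True, ai_confidence=0.9),
    SimpleNamespace(id=2, requires_reply=True, actionable=True, ai_confidence=0.4),
    SimpleNamespace(id=3, requires_reply=True, actionable=False, ai_confidence=0.8),
    SimpleNamespace(id=4, requires_reply=False, actionable=True, ai_confidence=0.95),
]

confident, needs_review = triage(threads)
print([t.id for t in confident])     # [1]
print([t.id for t in needs_review])  # [2]
```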

Testing

Use the provided test scripts:

# Test the complete workflow
python3 example_workflow.py

# Test single thread analysis
python3 test_single_analysis.py

# Reset analysis data for testing
python3 reset_analysis.py

Integration with Existing Code

To integrate with your existing email processing:

# After your existing email ingestion
from src.database import analyze_and_update_threads

def process_emails(account_email: str):
    # Your existing email fetching and ingestion code
    fetch_and_ingest_emails(account_email)
    
    # Add AI analysis
    analyze_and_update_threads(
        account_email=account_email,
        only_unanalyzed=True
    )

This ensures that new threads are automatically analyzed for actionability after each email sync.