Update AI matcher to return only the best match with confidence score

Iyeoluwa Akinrinola
2025-07-02 21:39:41 +01:00
parent 58d579700e
commit 6b6633b165
+19 -14
@@ -13,27 +13,30 @@ class AIMatcher:
         matches = []
         for receipt in receipts:
-            # Get ALL potential matches for this receipt, not just the best one
-            receipt_matches = self._find_all_matches(receipt, transactions)
-            matches.extend(receipt_matches)
+            # Get the BEST match for this receipt (highest confidence score)
+            best_match = self._find_best_match(receipt, transactions)
+            if best_match:
+                matches.append(best_match)
         return sorted(matches, key=lambda x: x.confidence_score, reverse=True)
-    def _find_all_matches(self, receipt: Receipt, transactions: List[Transaction]) -> List[Match]:
-        """Find ALL potential matches for a receipt, not just the best one"""
+    def _find_best_match(self, receipt: Receipt, transactions: List[Transaction]) -> Match:
+        """Find the BEST match for a receipt (highest confidence score)"""
         candidates = self._filter_candidates(receipt, transactions)
         if not candidates:
-            return []
+            return None
-        matches = []
+        best_match = None
+        highest_score = 0
         for transaction in candidates:
             score, reason = self._calculate_match_score(receipt, transaction)
-            # Include ALL matches regardless of score - let the user decide
-            match = Match(receipt, transaction, score, reason)
-            matches.append(match)
+            # Keep the match with the highest score, regardless of how low it is
+            if score > highest_score:
+                highest_score = score
+                best_match = Match(receipt, transaction, score, reason)
-        return matches
+        return best_match
     def _filter_candidates(self, receipt: Receipt, transactions: List[Transaction]) -> List[Transaction]:
         # Return MOST transactions - let the AI decide on scoring
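Review note: the selection logic in this hunk can be sketched in isolation. The block below is a minimal sketch under stated assumptions, not the repository's code: `Match` is a simplified stand-in dataclass, and `find_best_match`, `score_fn`, and `toy_score` are hypothetical names introduced for illustration. It differs from the committed version in two ways. The return type is annotated `Optional[Match]`, since the function returns `None` when there are no candidates. The comparison starts from `None` rather than from `highest_score = 0`, so a candidate that scores exactly 0.0 is still returned, which is what the "regardless of how low it is" comment appears to intend.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Match:
    # Simplified stand-in for the project's Match class (assumed fields).
    receipt: str
    transaction: str
    confidence_score: float
    reason: str


def find_best_match(
    receipt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], Tuple[float, str]],
) -> Optional[Match]:
    """Return the highest-scoring match, or None if there are no candidates."""
    best: Optional[Match] = None
    for transaction in candidates:
        score, reason = score_fn(receipt, transaction)
        # Check for None first so that even a 0.0 score can be kept.
        # The strict '>' means the earliest candidate wins a tie.
        if best is None or score > best.confidence_score:
            best = Match(receipt, transaction, score, reason)
    return best


def toy_score(receipt: str, transaction: str) -> Tuple[float, str]:
    # Hypothetical scorer for the demo: exact string equality only.
    return (1.0, "exact") if receipt == transaction else (0.0, "different")


best = find_best_match("ACME 12.50", ["Globex 9.99", "ACME 12.50"], toy_score)
```

With the `highest_score = 0` initialisation used in the commit, a receipt whose candidates all score 0.0 produces no match at all. Whether that is desirable depends on how the caller treats zero-confidence matches.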
@@ -68,22 +71,24 @@ class AIMatcher:
 - Amount difference: ${amount_diff} ({amount_percent_diff:.1f}%)
 - Vendor comparison: "{receipt.vendor}" vs "{transaction.vendor}"
-IMPORTANT: Score ALL potential matches, even imperfect ones. The score should reflect how likely this is a match:
+Score this potential match based on how likely it is the correct match:
 - Perfect matches (same vendor, amount, date): 0.95-1.0
 - High confidence (minor differences): 0.8-0.94
 - Medium confidence (moderate differences): 0.6-0.79
 - Low confidence (significant differences): 0.4-0.59
 - Very low confidence (major differences): 0.2-0.39
-- No meaningful similarity: 0.0-0.19
+- Minimal similarity: 0.1-0.19
+- No meaningful similarity: 0.0-0.09
 Examples:
 - Same vendor, same amount, 11 days apart: 0.7-0.8
 - Similar vendor name, same amount, same date: 0.8-0.9
 - Same vendor, 10% amount difference, same date: 0.6-0.7
 - Different vendor, same amount, same date: 0.3-0.4
+- Completely different vendor, amount, date: 0.1-0.2
-Consider vendor name similarity, amount accuracy, and date proximity. Even imperfect matches should get reasonable scores if there's any meaningful similarity.
+Consider vendor name similarity, amount accuracy, and date proximity. Score based on overall likelihood this is the correct match.
 Return only: score|reason
 """