The Matching Problem No One Talks About

Stripe records a charge on Monday at 23:47 UTC. The bank settles it on Tuesday. PayPal timestamps its version to the payer's local timezone. The three records describe one event, but no two of them agree on the date.

This is the gap that every reconciliation system has to cross. Each date is correct within its source's frame of reference, yet the three disagree when compared side by side. An engine that matches on strict equality reports a false discrepancy whenever a settlement delay or timezone offset pushes a record across a date boundary.

Why Exact Matching Breaks Under Load

The first version of the matcher was a lookup table. Hash on (amount, currency, date). If the hash matched, call it reconciled. It worked perfectly on synthetic test data where everything happened on the same day.

In production data from a single merchant processing through Stripe, 23% of bank settlement dates differed from charge dates by exactly one calendar day. Another 4% differed by two days because of weekend processing delays. The lookup table called all of them unmatched.
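The one-day misses fall straight out of a composite key like this. Here is a minimal sketch of strict-equality lookup; the Tx and key types, the minor-unit amounts, and the function names are assumptions made for illustration, not the original matcher.

```go
package main

// Tx is a simplified transaction for illustration; the field names are
// assumptions. Date is the calendar date exactly as the source reported it.
type Tx struct {
	ID          string
	AmountMinor int64  // amount in minor units, e.g. cents
	Currency    string // ISO 4217 code
	Date        string // YYYY-MM-DD
}

// key is the composite lookup key: (amount, currency, date).
type key struct {
	AmountMinor int64
	Currency    string
	Date        string
}

func keyOf(t Tx) key { return key{t.AmountMinor, t.Currency, t.Date} }

// buildIndex maps each ledger transaction's key to its ID.
// Ledger rows that share a key overwrite each other in this sketch.
func buildIndex(ledger []Tx) map[key]string {
	idx := make(map[key]string, len(ledger))
	for _, t := range ledger {
		idx[keyOf(t)] = t.ID
	}
	return idx
}

// exactLookup returns the ledger ID whose key is identical to the
// gateway transaction's key, if one exists.
func exactLookup(idx map[key]string, gateway Tx) (string, bool) {
	id, ok := idx[keyOf(gateway)]
	return id, ok
}
```

A charge dated 2024-03-04 and its settlement dated 2024-03-05 produce different keys, so exactLookup reports no match even though amount and currency agree.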

The finance team's response was to manually reconcile anything the engine missed, which defeated the purpose of building the engine.

Scoring Instead of Searching

The replacement design treats matching as a ranking problem. Every gateway transaction is evaluated against every unmatched ledger transaction. Each evaluation runs through four independent rules that each return a confidence score:

Exact Match (1.00): amount, currency, date, and counterparty all match. The counterparty comparison normalizes to lowercase and trims whitespace. If both counterparty fields are empty, the rule skips that check rather than rejecting the pair. This is deliberate: bank statements frequently omit counterparty names for internal transfers.

Amount + Date (0.90): amount and currency match exactly, but the date is within a configurable tolerance window. The default is 3 days. This catches settlement delays without opening the door to false positives from unrelated transactions with identical amounts.

Reference Match (0.80): one transaction's external_id appears as a substring of the other's description field. This catches the pattern where a bank statement's narrative reads "STRIPE CHARGE ch_3N7kR..." and the Stripe record has external_id: ch_3N7kR.... Both values are lowercased before comparison.

Fuzzy Amount (0.75): amounts are within a percentage tolerance of each other. This handles fee variance because a payment gateway may deduct processing fees before reporting the net amount, while the bank records the gross. The tolerance is configurable (default: 2%).

The scorer evaluates all four rules against each pair and takes the highest confidence. Below a minimum threshold (default: 0.70), the pair is discarded.
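The four rules can be sketched as plain functions. This is a hedged illustration, not the production code: the Tx fields, the Tolerances struct, integer minor-unit amounts, the basis-point tolerance, UTC calendar-day comparison, and measuring the fuzzy tolerance against the larger amount are all assumptions. Rejecting an empty external_id in the reference rule is also an assumption, added because an empty string is a substring of everything.

```go
package main

import (
	"strings"
	"time"
)

// Tx is a simplified stand-in for domain.Transaction (fields assumed).
type Tx struct {
	ExternalID   string
	Description  string
	Counterparty string
	Currency     string
	AmountMinor  int64 // minor units, assumed non-negative
	Date         time.Time
}

// Tolerances holds the configurable windows (hypothetical type).
type Tolerances struct {
	Days      int   // date window, default 3
	AmountBps int64 // fuzzy tolerance in basis points, 200 = 2%
}

// utc is a small helper for building timestamps in examples.
func utc(y, m, d, hh, mm int) time.Time {
	return time.Date(y, time.Month(m), d, hh, mm, 0, 0, time.UTC)
}

func norm(s string) string { return strings.ToLower(strings.TrimSpace(s)) }

func abs64(v int64) int64 {
	if v < 0 {
		return -v
	}
	return v
}

// dayDiff returns the absolute difference in UTC calendar days.
func dayDiff(a, b time.Time) int {
	ay, am, ad := a.UTC().Date()
	by, bm, bd := b.UTC().Date()
	da := time.Date(ay, am, ad, 0, 0, 0, 0, time.UTC)
	db := time.Date(by, bm, bd, 0, 0, 0, 0, time.UTC)
	return int(abs64(int64(da.Sub(db) / (24 * time.Hour))))
}

// exactMatch: amount, currency, date, and normalized counterparty agree.
// Two empty counterparties compare equal, so that check is effectively skipped.
func exactMatch(s, t Tx) float64 {
	if s.Currency != t.Currency || s.AmountMinor != t.AmountMinor || dayDiff(s.Date, t.Date) != 0 {
		return 0
	}
	if norm(s.Counterparty) != norm(t.Counterparty) {
		return 0
	}
	return 1.00
}

// amountDate: exact amount and currency, date within the tolerance window.
func amountDate(s, t Tx, tol Tolerances) float64 {
	if s.Currency != t.Currency || s.AmountMinor != t.AmountMinor {
		return 0
	}
	if dayDiff(s.Date, t.Date) > tol.Days {
		return 0
	}
	return 0.90
}

// referenceMatch: one side's external ID appears in the other's description.
func referenceMatch(s, t Tx) float64 {
	if s.Currency != t.Currency {
		return 0
	}
	hit := func(id, desc string) bool {
		id = norm(id)
		return id != "" && strings.Contains(strings.ToLower(desc), id)
	}
	if hit(s.ExternalID, t.Description) || hit(t.ExternalID, s.Description) {
		return 0.80
	}
	return 0
}

// fuzzyAmount: amounts differ by at most AmountBps of the larger amount.
func fuzzyAmount(s, t Tx, tol Tolerances) float64 {
	if s.Currency != t.Currency {
		return 0
	}
	larger := s.AmountMinor
	if t.AmountMinor > larger {
		larger = t.AmountMinor
	}
	if abs64(s.AmountMinor-t.AmountMinor)*10000 > larger*tol.AmountBps {
		return 0
	}
	return 0.75
}
```

Integer basis points keep the tolerance check free of floating-point rounding; a percentage held as a float would work too, with more care at the boundary.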

The core of the scorer in Go:

// ScoreAll evaluates a source transaction against all candidates
// and returns matches sorted by confidence (highest first).
func (s *Scorer) ScoreAll(source domain.Transaction, candidates []domain.Transaction) []ScoredMatch {
    var results []ScoredMatch
    for _, cand := range candidates {
        mc := MatchCandidate{Source: source, Target: cand}
        best := ScoredMatch{}
        for _, rule := range s.rules {
            conf := rule.Score(mc)
            if conf > best.Confidence {
                best = ScoredMatch{
                    GatewayTxID: source.ID,
                    LedgerTxID:  cand.ID,
                    Confidence:  conf,
                    MatchRule:   rule.Name(),
                }
            }
        }
        if best.Confidence >= s.minConfidence {
            results = append(results, best)
        }
    }
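    // Stable sort keeps candidate order among equal-confidence matches,
    // so the ranking is deterministic across runs.
    sort.SliceStable(results, func(i, j int) bool {
        return results[i].Confidence > results[j].Confidence
    })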
    return results
}

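The MatchCandidate struct is a value type. It is built once per candidate and reused across the inner rule loop. As long as escape analysis finds nothing that outlives the iteration, it stays on the stack and the garbage collector never sees it. Building with go build -gcflags=-m prints the compiler's escape decisions.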

What Made This Hard

The O(n × m) evaluation grows quadratically with batch size. For a merchant with 5,000 gateway transactions and 5,000 ledger transactions in a window, that is 25 million pair evaluations. Each evaluation runs four rule checks, so the worst case is 100 million rule calls.

Three decisions keep this tractable:

Value types in the inner loop. The MatchCandidate struct is allocated on the stack, not the heap. No pointer indirection. The garbage collector never sees the scoring artifacts.
Early exit per rule. Each rule checks currency first. If currencies differ, it returns immediately without evaluating amount or date. Currency mismatches are the most common short-circuit. For a merchant whose volume is split evenly between EUR and USD, the first comparison eliminates roughly half the candidate pool.
Map-based exclusion. Once a transaction is matched, it is added to a map[string]bool and excluded from future candidate lists. The effective search space shrinks linearly as matches are confirmed. In practice, high-confidence matches exit the pool early, leaving only genuinely ambiguous pairs for the lower-confidence rules.
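The exclusion map from the third decision can be sketched as follows. This is an illustration under assumptions: the Tx type, the scoreFunc signature, and matchAll are hypothetical names, and the sketch tracks only ledger IDs in the map.

```go
package main

// Tx is a minimal transaction for this sketch (fields assumed).
type Tx struct {
	ID          string
	AmountMinor int64
}

// scoreFunc stands in for the rule-based scorer: it returns a
// confidence between 0 and 1 for a (gateway, ledger) pair.
type scoreFunc func(src, cand Tx) float64

// matchAll pairs each gateway transaction with its best-scoring
// unmatched ledger transaction. Ledger IDs that have been claimed go
// into the matched map and are skipped for all later gateway rows.
func matchAll(gateway, ledger []Tx, score scoreFunc, minConf float64) map[string]string {
	matched := make(map[string]bool)  // ledger IDs already claimed
	result := make(map[string]string) // gateway ID -> ledger ID
	for _, g := range gateway {
		bestID, bestConf := "", 0.0
		for _, l := range ledger {
			if matched[l.ID] {
				continue // excluded from the candidate pool
			}
			if c := score(g, l); c > bestConf {
				bestID, bestConf = l.ID, c
			}
		}
		if bestID != "" && bestConf >= minConf {
			matched[bestID] = true
			result[g.ID] = bestID
		}
	}
	return result
}
```

Because a claimed ledger row is skipped before scoring, two gateway charges with identical amounts end up paired with two different ledger rows instead of both pointing at the first one.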

The Dedup Invariant

The ingester guarantees at-most-once persistence through a two-tier dedup check. The dedup key is sha256(source_id + external_id).

The check hits Redis first with GET dedup:{key}. If the key exists, the transaction is a known duplicate and never reaches PostgreSQL. If the key is missing, or Redis is unavailable (connection error, timeout, cold restart), the check falls through to a SELECT ... WHERE dedup_key = ? against the transactions table. If that also returns empty, the insert uses ON CONFLICT DO NOTHING on the dedup_key column, which catches the race where two goroutines pass the dedup check simultaneously.

This design means Redis is a performance optimization, not a correctness requirement. The system is correct without it. It is just slower.
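A minimal sketch of the two-tier check, with in-memory stand-ins so it runs without Redis or PostgreSQL. The Cache and Store interfaces, the ingest function, and the fake implementations are hypothetical. The NUL separator inside the hash input is also an assumption added here so that ("ab", "c") and ("a", "bc") produce different keys; the description above only states sha256(source_id + external_id).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

var errCacheDown = errors.New("cache unavailable")

// Cache stands in for Redis; Store stands in for PostgreSQL.
type Cache interface {
	Exists(key string) (bool, error)
}

type Store interface {
	Exists(key string) (bool, error)
	// InsertIfAbsent mirrors INSERT ... ON CONFLICT DO NOTHING and
	// reports whether a row was actually written.
	InsertIfAbsent(key string) (bool, error)
}

// dedupKey hashes source and external IDs (separator is an assumption).
func dedupKey(sourceID, externalID string) string {
	sum := sha256.Sum256([]byte(sourceID + "\x00" + externalID))
	return hex.EncodeToString(sum[:])
}

// ingest returns true only if this call persisted the transaction.
func ingest(c Cache, s Store, sourceID, externalID string) (bool, error) {
	key := dedupKey(sourceID, externalID)
	// Tier 1: cache hit means known duplicate. Errors are ignored on
	// purpose: the cache is an optimization, not a correctness gate.
	if found, err := c.Exists(key); err == nil && found {
		return false, nil
	}
	// Tier 2: the database is the source of truth.
	found, err := s.Exists(key)
	if err != nil {
		return false, err
	}
	if found {
		return false, nil
	}
	// The conflict-safe insert closes the race between check and write.
	return s.InsertIfAbsent(key)
}

// memStore, memCache, and downCache are in-memory fakes for the example.
type memStore struct{ rows map[string]bool }

func (m *memStore) Exists(k string) (bool, error) { return m.rows[k], nil }

func (m *memStore) InsertIfAbsent(k string) (bool, error) {
	if m.rows[k] {
		return false, nil
	}
	m.rows[k] = true
	return true, nil
}

type memCache map[string]bool

func (m memCache) Exists(k string) (bool, error) { return m[k], nil }

type downCache struct{}

func (downCache) Exists(string) (bool, error) { return false, errCacheDown }
```

With downCache in place of a working cache, the second ingest of the same IDs is still rejected by the store, which is the property the section describes: slower without Redis, but still correct.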

What Changes When the Matcher Meets Bank File Formats

CAMT.053 is the ISO 20022 bank-to-customer statement message, widely used by European banks for statement reporting. It uses XML with deeply nested structures. The amount sits in an Amt element with a Ccy attribute. The direction is encoded as CRDT or DBIT in the CdtDbtInd field. The counterparty is nested inside RltdPties, where the adapter picks Dbtr.Nm for credits and Cdtr.Nm for debits, because the counterparty of an incoming payment is the party that paid it.

The adapter parses this into the same canonical IngestRequest that Stripe and PayPal produce. From the engine's perspective, a CAMT.053 entry and a Stripe balance transaction are identical: same struct, same fields, same confidence-scored matching. The format complexity is absorbed entirely at the adapter boundary.
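As a rough sketch of that adapter boundary, the standard library's encoding/xml can pull the fields named above out of a statement. The IngestRequest fields shown here are assumptions (only the type name comes from the text), the element paths follow the camt.053.001.02 layout, and an entry with several TxDtls blocks is not handled.

```go
package main

import (
	"encoding/xml"
	"strings"
)

// IngestRequest is the canonical shape; these fields are assumed.
type IngestRequest struct {
	Amount       string // decimal string as it appears in the file
	Currency     string
	Direction    string // "CRDT" or "DBIT"
	Counterparty string
}

// camtDocument maps only the parts of the statement this sketch needs.
// Struct tags without a namespace match elements in any namespace.
type camtDocument struct {
	Entries []camtEntry `xml:"BkToCstmrStmt>Stmt>Ntry"`
}

type camtEntry struct {
	Amt struct {
		Value string `xml:",chardata"`
		Ccy   string `xml:"Ccy,attr"`
	} `xml:"Amt"`
	CdtDbtInd string `xml:"CdtDbtInd"`
	Debtor    string `xml:"NtryDtls>TxDtls>RltdPties>Dbtr>Nm"`
	Creditor  string `xml:"NtryDtls>TxDtls>RltdPties>Cdtr>Nm"`
}

// parseCamt053 converts statement entries into canonical requests.
func parseCamt053(data []byte) ([]IngestRequest, error) {
	var doc camtDocument
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	out := make([]IngestRequest, 0, len(doc.Entries))
	for _, e := range doc.Entries {
		req := IngestRequest{
			Amount:    strings.TrimSpace(e.Amt.Value),
			Currency:  e.Amt.Ccy,
			Direction: e.CdtDbtInd,
		}
		// For a credit the counterparty is the payer (debtor);
		// for a debit it is the payee (creditor).
		switch e.CdtDbtInd {
		case "CRDT":
			req.Counterparty = e.Debtor
		case "DBIT":
			req.Counterparty = e.Creditor
		}
		out = append(out, req)
	}
	return out, nil
}
```

Everything after parseCamt053 sees only IngestRequest values, so the matcher needs no knowledge of the XML layout.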
