A Compliance Audit Trail That Nobody Can Edit
The Situation
A payment processor reports $10,000 received. The internal ledger shows $9,850 expected. That $150 gap between expected and actual transaction amounts needs to be investigated, escalated if it lingers too long, and documented so an auditor can reconstruct the full history months later.
In most organizations, this tracking happens in spreadsheets and email threads. A compliance officer opens a shared document, logs the discrepancy, emails the relevant team, and checks back in a few days. If nothing happens, they escalate by sending another email to someone more senior. The spreadsheet shows the current status. It does not show who changed the status, when, or whether the change was legitimate.
None of that effort answers the question an auditor will ask six months later: can you prove these records weren't modified? A spreadsheet row edited last Thursday doesn't prove it wasn't also edited last Wednesday. A database with updated_at columns has the same weakness: the timestamp itself is a regular writable field.
The Cost of Doing Nothing
A compliance team dedicates a substantial share of a full-time role to discrepancy tracking and manual escalation. Status updates, follow-up emails, report compilation, evidence gathering: this consumes roughly 20 hours per week of a compliance specialist's time. At a loaded cost of EUR 50-60 per hour in Western Europe, that's approximately EUR 52,000 per year in labor spent on a process that still can't guarantee the audit trail is clean.
The labor cost is the visible expense. The larger exposure is audit risk. An audit trail maintained in mutable systems (spreadsheets, standard databases, email archives) can't answer the basic compliance question: "prove these records weren't altered." When that question comes from a regulator, the answer is either structural proof or an expensive remediation process. Organizations that discover trail gaps during regulatory review typically spend multiples of the prevention cost on forensic reconstruction, legal consultation, and process overhaul.
Setting up the spreadsheet was free. Trusting it with compliance evidence was not.
What I Built
An event-sourced compliance ledger where every action produces a permanent, append-only record. Discrepancy events arrive from upstream systems via NATS JetStream. Each discrepancy moves through a six-state lifecycle (open, acknowledged, investigating, resolved, escalated, or auto-closed), and every transition is recorded as an immutable event with a monotonic sequence number. No UPDATE. No DELETE. The event table supports only INSERT.
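The lifecycle can be sketched as a simple transition table. This is an illustrative Go sketch, not the ledger's actual code: the state names come from the article, but the specific transitions allowed (and the identifier names) are my assumptions.

```go
package main

import "fmt"

// State is one of the six lifecycle states described above.
type State string

const (
	Open          State = "open"
	Acknowledged  State = "acknowledged"
	Investigating State = "investigating"
	Resolved      State = "resolved"
	Escalated     State = "escalated"
	AutoClosed    State = "auto_closed"
)

// validTransitions is an assumed transition table; the real rules
// in the ledger may differ. Resolved and AutoClosed are terminal.
var validTransitions = map[State][]State{
	Open:          {Acknowledged, Escalated, AutoClosed},
	Acknowledged:  {Investigating, Escalated},
	Investigating: {Resolved, Escalated},
	Escalated:     {Investigating, Resolved},
}

// CanTransition reports whether moving from one state to another is allowed.
func CanTransition(from, to State) bool {
	for _, next := range validTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Open, Resolved))          // false: must investigate first
	fmt.Println(CanTransition(Investigating, Resolved)) // true
}
```

Rejecting a transition in code is only half the enforcement; the append-only event table guarantees the rejected attempt never overwrites history.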
The genuinely hard part was the escalation engine. Escalation rules are tenant-scoped database rows, not hardcoded logic. Each tenant configures their own policies: "if a high-severity discrepancy stays open for 4 hours, notify the compliance manager via the notification hub. If it stays open for 24 hours, escalate and upgrade the severity to critical." The engine evaluates these rules every 15 minutes and takes the configured action automatically. Getting concurrent rule evaluation to produce a coherent event trail took two redesigns of the transaction logic.
Both outbound integrations (the Notification Hub for alert delivery and the RAG Platform for AI-powered compliance queries) are feature-flagged and default to disabled. The ledger works standalone. When either downstream service goes offline, discrepancies still get recorded and escalation still fires. Nothing blocks the audit trail.
System Flow
Data Model
Architecture Layers
The Decision Log
| Decision | Alternative Rejected | Why |
|---|---|---|
| Append-only event table (no UPDATE, no DELETE) | Mutable audit columns (updated_at, updated_by) on the discrepancy row | Compliance requires provable immutability. Mutable columns can be silently overwritten by a migration script or admin query. The ledger_events table supports INSERT only, with BIGSERIAL sequence numbers that make any gap or modification detectable. |
| Escalation rules as tenant-scoped database rows | Hardcoded escalation logic in application code | Each tenant operates under different compliance policies. Data-driven rules (severity match, time threshold, action type) let tenants configure their own escalation without code deployments. A wildcard severity match (*) supports catch-all rules for new tenants. |
| NATS JetStream over Kafka | Kafka or RabbitMQ for event ingestion | Production memory budget: 256MB for the message broker. Kafka requires 1-2GB minimum for ZooKeeper plus broker. NATS runs in a single process with durable pull consumers and manual acknowledgment. Dead-letter after 3 failed attempts prevents one bad event from blocking the pipeline. |
| Feature-flagged outbound integrations | Mandatory dependencies on Notification Hub and RAG Platform | The ledger must operate independently. Both NOTIFICATION_HUB_ENABLED and RAG_FEED_ENABLED default to false. When either service is unreachable, the ledger continues recording events. Downstream failures never block audit trail writes. |
| Single Go binary with 6 concurrent goroutines | Microservice split (API server + background workers) | All goroutines share the database connection pool and run in one process. No inter-service communication overhead. No deployment coordination between services. One binary to build, deploy, and monitor. |
| Async report generation (202 Accepted) | Synchronous report endpoint | Compliance reports can include up to 10,000 events across 100 discrepancies. Rendering HTML templates and converting to PDF takes seconds to minutes depending on volume. A synchronous endpoint would exceed the 15-second HTTP write timeout. |
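The first row of the table claims that sequence numbers make gaps detectable. A verification pass over an exported trail can be sketched like this. Note one caveat the sketch hedges on: PostgreSQL BIGSERIAL can legitimately skip values on rolled-back transactions, so a real verifier would need to account for those before treating a gap as evidence of tampering.

```go
package main

import "fmt"

// findGaps returns the missing values in a sorted slice of event
// sequence numbers, assuming the trail is expected to be contiguous.
func findGaps(seqs []int64) []int64 {
	var gaps []int64
	for i := 1; i < len(seqs); i++ {
		for s := seqs[i-1] + 1; s < seqs[i]; s++ {
			gaps = append(gaps, s)
		}
	}
	return gaps
}

func main() {
	// Events 4, 7, and 8 are missing from this trail.
	fmt.Println(findGaps([]int64{1, 2, 3, 5, 6, 9})) // [4 7 8]
}
```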
Ecosystem Integration
Discrepancy events arrive from a reconciliation engine that matches transactions across Stripe, PayPal, and bank statements. When a match fails or a timing gap appears, the engine publishes to NATS JetStream and the compliance ledger picks it up. Full breakdown of the reconciliation architecture: www.kingsleyonoh.com/projects/transaction-reconciliation-engine
Escalation alerts route through a notification hub that handles channel selection (email, Telegram, in-app) and delivery tracking, so the compliance ledger fires a single HTTP event and doesn't manage delivery itself. Resolved discrepancies feed into a RAG platform where compliance teams query historical patterns using natural language. Both outbound connections are feature-flagged: the ledger records events and enforces escalation regardless of whether downstream services respond.
Notification routing and delivery: www.kingsleyonoh.com/projects/event-driven-notification-hub
Compliance history search: www.kingsleyonoh.com/projects/multi-agent-rag-platform
Results
Before the system: discrepancy tracking lived in shared documents with manual email escalation. Status updates overwrote previous values. Audit history depended on email threads that could be deleted or forwarded selectively. Escalation happened when someone remembered to follow up.
After: every discrepancy gets an immutable event trail from the moment it's detected. The state machine enforces valid transitions (you can't resolve a discrepancy that hasn't been investigated). Escalation rules evaluate automatically every 15 minutes. Compliance reports generate on demand with the complete event history for any date range.
The system: 6 database tables, 22 API endpoints, 346 passing tests, and 6 background goroutines running in a single Go binary. Built and tested in 19 commits over 3 days. Amounts stored as DECIMAL(15,2) to avoid the floating-point rounding problems that plague financial systems using IEEE 754 floats.
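The DECIMAL(15,2) choice is easy to demonstrate. In Go terms, the equivalent discipline is integer cents rather than float64; the two-line demo below shows the rounding problem and the exact alternative:

```go
package main

import "fmt"

func main() {
	// IEEE 754 floats cannot represent most decimal fractions exactly.
	a, b := 0.1, 0.2
	fmt.Println(a+b == 0.3) // false: a+b is 0.30000000000000004

	// Integer cents (the application-side analogue of DECIMAL(15,2))
	// keep financial arithmetic exact.
	received := int64(1_000_000) // $10,000.00 in cents
	expected := int64(985_000)   // $9,850.00 in cents
	gap := received - expected
	fmt.Printf("gap: $%d.%02d\n", gap/100, gap%100) // gap: $150.00
}
```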
The metric that matters is the answer to one question: "can you prove this audit trail wasn't modified?" With an append-only event table that doesn't support UPDATE or DELETE, the proof is structural. The mechanism for tampering doesn't exist in the codebase.