Enterprise EngagementFinancial Services / Compliance Jan 202610 weeks discovery → audit sign-off

Audit-grade compliance review ships under multi-layer guardrails

Defence-in-depth controls for regulated document review.

Regulated financial-services intermediary · India · 95 employees

AI Guardrails & GovernanceAI ObservabilityEnterprise AI Automation
0
audit findings across 4 quarterly reviews
3.2×
throughput per reviewer
< 6 hrs
customer activation time
10 wks
engagement, discovery to audit sign-off
Business problem

What the team was actually solving

Manual compliance review of vendor and onboarding documents was the bottleneck for new-customer activation. Every traffic spike threatened SLA breach. Reviewer fatigue led to inconsistent flagging — some weeks too strict, some weeks too loose, with no defensible pattern.

Existing workflow

Where the old process broke

  • 1Customer activation average of 36+ hours, mostly waiting on compliance
  • 2Reviewer fatigue producing inconsistent flag decisions
  • 3No machine-readable audit trail — every decision lived in a reviewer's head
  • 4Quarterly internal audits surfaced inconsistencies the team could not defend
Proposed solution

The AI / technical solution we shipped

A single-agent system wrapped in four guardrail layers: an input filter that detects and redacts PII / strips prompt-injection patterns; a versioned policy registry the agent must cite by clause ID for every conclusion; output validators (schema + LLM-as-judge cross-check); and a human-in-the-loop gate on anything scored above a defined risk threshold. Every decision is appended to an immutable audit log.

Technology

Technology stack

Input filterCustom detectors · PII / PHI redaction · injection heuristicsReasoning modelClaude Opus 4.7 (final ruling) · Haiku 4.5 (filter pass)Policy registryVersioned in repo · clause IDs · referenced from promptsOutput validatorsPydantic v2 · regex pack · LLM-judge cross-checkAuditAppend-only PostgreSQL ledger · replay tooling
Integration

Integration approach

Documents flow in from the customer's onboarding system. The agent's draft decision (with clause citations) appears in the existing compliance dashboard. Reviewers see the agent's reasoning and either approve, override, or escalate — all logged.

  • Onboarding system webhook → agent service
  • Compliance dashboard: existing UI, agent draft inline
  • Policy registry: in-repo, versioned, clause-IDed
  • Audit log: PostgreSQL append-only ledger, queryable by auditors
Security & scalability

Security & scalability

PII minimisation

Sensitive identifiers are redacted before reaching the reasoning model wherever the workflow allows. The audit log stores redacted versions; full versions remain only in the customer's source system.

Prompt-injection defence

Inputs containing instruction-like patterns are detected and quarantined. The agent prompt explicitly tells the model not to act on instructions embedded in user content.

Versioned policy registry

Every conclusion cites a clause ID. Policy versions are explicit; historic decisions can be re-run against the current registry to detect drift.

Append-only audit log

Decisions are written to an immutable ledger. Replay tooling lets auditors reproduce any decision from logged inputs.

Defence in depth

Multi-layer controls

Defence-in-depth — what blocks what
Agent
  1. L1
    Audit log
    Immutable trace of every decision; replayable; cites policy clauses by ID.
  2. L2
    Human gate
    Risk-scored escalation above threshold; approval queue with sign-off.
  3. L3
    Output validators
    Pydantic + regex + LLM-judge cross-check before any external action.
  4. L4
    Policy registry
    Versioned policy clauses; agent must cite a clause for any conclusion.
  5. L5
    Input filter
    PII redaction, prompt-injection blocking, jailbreak heuristics.
Methodology

Delivery process

01

Threat model (1 wk)

Failure modes enumerated: PII leakage, prompt injection, policy mis-citation, output-schema bypass. Each mapped to the layer that catches it.

02

Policy registry (2 wks)

Existing policy library externalised as versioned, citable clauses. Format reviewed with compliance team.

03

Layered implementation (3 wks)

Input filter → policy-grounded prompt → output validator → human gate. Each layer tested adversarially before the next is wired in.

04

Adversarial eval (2 wks)

Jailbreak attempts, PII smuggling, policy-conflict cases. Eval scoring at every layer.

05

Audit sign-off (1 wk)

Threat model, policy registry, eval results, decision-trace samples reviewed with internal audit. Sign-off achieved.

06

Quarterly revalidation

Adversarial eval re-run quarterly; policy registry refreshed; model version log updated.

Observability

Observability

Every layer of the guardrail stack emits a structured event. Auditors and compliance reviewers see the full chain — input transformations, policy clauses cited, validator passes, human approver (if any), final ruling — in a single dashboard.

  • Per-decision trace with all layer events
  • Override + escalation rate as leading drift indicators
  • Quarterly adversarial eval results published to the audit team
  • Replay tooling: any historic decision can be re-run end-to-end
Before vs after

Before vs after

Before
  • Customer activation time36+ hrs
  • Documents per reviewer per dayBaseline (1×)
  • Audit findings on decisionsMultiple per quarter
  • Decision traceabilityIn reviewer's head
After
  • Customer activation time< 6 hrs
  • Documents per reviewer per day3.2×
  • Audit findings on decisions0 across 4 quarters
  • Decision traceability100%, clause-cited
Automation impact

Automation impact

Reviewers no longer do first-pass classification — they review the agent's draft ruling with the cited clauses, confirm or override, and move on. The work that remains is the high-judgement work that genuinely needs human attention.

3.2×
documents processed per reviewer per day
< 6 hrs
average activation time (was 36+ hrs)
100%
of decisions cite a specific policy clause
Business outcomes

Business outcomes

Reviewer throughput tripled without adding headcount. Audit findings on AI-assisted decisions: zero across four quarterly reviews. The activation-time reduction also lifted top-of-funnel conversion measurably.

0
audit findings across 4 quarterly reviews
3.2×
throughput per reviewer
< 6 hrs
customer activation time
10 wks
engagement, discovery to audit sign-off
Lessons learned

What we'd tell another team building this

  • 01Layered guardrails work. A jailbreak that bypassed the input filter still had to pass the policy-citation requirement, the output validator, and the human gate above threshold. No single failure compromised the system.
  • 02The policy registry was the most valuable artefact produced. Even outside the agent context, having policy clauses externalised and versioned changed how the team reviewed decisions.
  • 03Adversarial eval cases produced the largest accuracy jumps. Happy-path evals gave false confidence; the cases that mattered were the malformed, ambiguous, and conflicting ones.
What's next

Future scalability

The guardrail pattern carries across regulated workflows. The same layered architecture now backs the customer's claims-triage pipeline and is being scoped for transaction-review.

  • Claims-triage agent reusing input filter + policy registry
  • Transaction-review agent in scoping
  • Policy registry promoted to a shared org-wide compliance artefact
  • Quarterly eval refresh as a permanent governance ritual

Have a regulated workflow you want to safely automate?

Most regulated AI projects fail at audit, not at build. A scoping session walks through the threat model, policy-registry shape, and approval workflow your auditors will ask for.