Business Context
You’re on the Applied NLP team at MercuryPay, a fintech that provides card issuing and payment processing for 120K SMBs and mid-market platforms. MercuryPay’s Risk & Compliance org reviews approximately 18,000 investigation cases/day (chargebacks, AML alerts, sanctions screening hits, and merchant fraud). Each case can include a long “case bundle”: analyst notes, customer emails, KYC documents, transaction narratives, and chat logs. Compliance officers need a single, defensible summary to decide whether to file a SAR, freeze funds, or clear the case.
A major pain point: the LLM summarizer frequently fails when the case bundle exceeds the model’s context window, leading to missing key facts (e.g., beneficiary names, amounts, dates) and inconsistent summaries. These errors have high stakes: false clears can cause regulatory exposure; false escalations can block legitimate merchants and impact revenue.
Data Characteristics
- Volume: ~3.5M historical cases (24 months), with 420K cases having human-written “final disposition summaries” used as reference.
- Text length: 0.5–60 pages per case bundle.
- Median: ~2,400 tokens
- P90: ~12,000 tokens
- P99: ~45,000 tokens
- Structure: semi-structured (headers, timestamps, email threads, tables of transactions) + unstructured narrative.
- Language: English (95%), Spanish/Portuguese (5%).
- Domain vocabulary: AML typologies (layering, mule accounts), sanctions terms, merchant category codes (MCC), payment rails, bank identifiers.
Success Criteria
- Faithfulness: No fabricated entities/amounts; all key claims trace to the bundle.
- Coverage: Include required fields (who/what/when/where/why) and top risk signals.
- Actionability: Summary supports a disposition decision in <60 seconds of reading.
- Latency: P95 end-to-end summarization < 8 seconds per case (batch + interactive review).
- Auditability: Provide citations to source snippets for key statements.
Constraints
- Context window: production model limited to 8k tokens.
- Security/Regulatory: data cannot leave MercuryPay VPC; must support retention and audit logging.
- Cost: average inference cost must stay under $0.03/case.
Requirements (Deliverables)
- Propose a robust strategy to summarize case bundles that exceed the context window (e.g., chunking + map-reduce, hierarchical summarization, retrieval-augmented summarization, or hybrid).
- Specify how you will chunk text (by structure, semantic boundaries, or token budget) and how you will handle cross-chunk dependencies (e.g., entity consistency, timeline coherence).
- Design a mechanism to ensure coverage of critical facts (names, accounts, amounts, dates, locations) and reduce omission risk.
- Produce summaries in a fixed schema:
- Case overview (1–2 sentences)
- Key entities (people/orgs/accounts)
- Transaction timeline (bulleted)
- Risk signals (bulleted)
- Recommended disposition (with confidence)
- Citations (source excerpt IDs)
- Provide an evaluation plan that measures faithfulness and coverage, including automated checks and human review.
- Provide a Python implementation sketch using
transformers + optional spaCy for entity extraction, including training/fine-tuning hooks (if any) and inference-time orchestration.