Context
StripeFlow wants a customer-facing AI assistant inside its merchant dashboard to answer questions about payments, disputes, refunds, payout timing, and API error messages. The assistant should reduce support tickets while staying tightly grounded in official payments documentation and account-specific policy metadata.
Constraints
- p95 latency: 2,500ms end-to-end
- Cost ceiling: $0.03 per request and $45K/month at 1.5M requests
- Hallucination ceiling: <1.5% on a labeled payments support set
- Unsafe action rate: 0% for invented compliance, legal, or account-status claims
- Must resist prompt injection from retrieved docs, user messages, and merchant-provided text
- Must not reveal PII, card data, internal-only policies, or unsupported account details
- If the answer is not grounded, the assistant must refuse or route to human support
Available Data / Models
- Public help center articles, API docs, dispute playbooks, and payout policy docs
- Structured merchant metadata: country, payment methods enabled, payout schedule, dispute status, risk tier, and recent error codes
- Search stack with BM25 and vector search over chunked documents
- Access to an approved LLM provider (OpenAI or Anthropic) and a reranker model
- 1,200 historical support conversations with human-written resolutions for evaluation only
Deliverables
- Design the prompt and retrieval strategy for a customer-facing payments assistant that answers grounded questions and cites sources where appropriate.
- Define an eval-first plan: offline evaluation before launch and online metrics after launch, including hallucination, refusal quality, and prompt-injection robustness.
- Propose the end-to-end architecture, including chunking, retrieval, reranking, use of structured merchant metadata, and fallback behavior.
- Estimate cost and latency at target volume, and explain how you would stay within both constraints.
- Identify key failure modes specific to payments support, including compliance-sensitive answers, stale policy retrieval, and account-specific overreach.