Context
FinSure, a consumer insurance app, is launching an LLM assistant that answers policy questions and drafts support replies. The assistant must reduce hallucinations and avoid exposing sensitive user inputs such as SSNs, claim IDs, medical details, and payment data.
Constraints
- p95 latency: 1,200ms end-to-end
- Cost ceiling: $8K/month at 300K requests/month
- Hallucination rate: <2% on a 250-question golden set
- Prompt-injection success rate: <1% on adversarial tests
- Sensitive input leakage in logs or model output: 0 tolerated incidents
- The assistant should answer directly when grounded, ask one clarifying question when needed, and refuse when evidence is missing
Available Resources
- User messages from support chat, often containing PII and free-form descriptions
- Retrieved policy snippets from an internal knowledge base (top-5 passages, already permission-filtered)
- One approved LLM provider (OpenAI or Anthropic)
- A lightweight PII redaction service available before model invocation
- Historical labeled transcripts: 5,000 examples with tags for correct answer, hallucination, refusal, and privacy violation
Task
- Design a production-ready system prompt that minimizes hallucinations, treats retrieved policy text as the only factual source, and prevents the model from repeating or storing unnecessary sensitive user data.
- Explain how you would structure the prompt inputs, including user message, redacted fields, retrieved policy snippets, and explicit refusal / clarification behavior.
- Define an evaluation plan first: offline tests for hallucination, privacy leakage, and prompt injection; then online metrics for quality and safety after launch.
- Propose the serving architecture around the prompt, including redaction, retrieval packaging, output validation, and logging controls.
- Estimate cost and latency, and describe the main tradeoffs between stricter safety rules, answer helpfulness, and operational cost.