Context
BrightDesk is launching an LLM-powered customer support assistant that answers product, billing, and policy questions inside its web app. Early pilots show the assistant is fluent but occasionally invents features, refund rules, or troubleshooting steps.
Constraints
- p95 latency: 1,500ms per response
- Cost ceiling: $8,000/month at 200K requests/month
- Hallucination rate: <2% on a labeled customer-support golden set
- Unsafe compliance advice rate: 0%
- The assistant must prefer refusal or escalation over guessing
- The system must be resilient to prompt injection in retrieved documents or user messages
Available Resources
- 12,000 help-center articles, release notes, refund policies, and internal support macros
- 6 months of anonymized support tickets with agent-written resolutions
- Approved models: GPT-4.1-mini for generation, GPT-4.1 for offline judging, and
text-embedding-3-large for retrieval
- Existing keyword search and a managed vector database
- 500 manually reviewed QA pairs from the support team for evaluation
Task
- Identify the most common causes of hallucinations in this setting, separating retrieval issues, prompt issues, model behavior, stale knowledge, and user-input risks.
- Propose an eval-first mitigation plan, including offline and online metrics, before describing architecture changes.
- Design the prompting and retrieval approach so the assistant only answers from approved sources, cites evidence, and refuses when unsupported.
- Estimate cost and latency for your proposed solution, and explain what you would change if the system misses either budget.
- List the top failure modes, including prompt injection and unsupported answers, and explain how you would detect and mitigate each one.
Your answer should be practical: assume this needs to ship in 6 weeks, with one ML engineer and one platform engineer. Favor simple controls that materially reduce hallucinations over complex research-heavy approaches.