BrightAssist is preparing to deploy a customer-support generative AI model that answers billing and account questions in its fintech app. In offline evaluation the model produces fluent responses, but the trust and safety team found cases of incorrect financial advice, policy violations, and unsafe escalation handling.
| Metric | Current Model | Target | Notes |
|---|---|---|---|
| Helpfulness pass rate | 78% | ≥ 85% | Human-rated on 1,200 prompts |
| Factual accuracy | 74% | ≥ 90% | Grounded against internal KB |
| Policy safety pass rate | 96.2% | ≥ 99.0% | Includes harmful/regulated content checks |
| Hallucination rate | 11.5% | ≤ 5.0% | Unsupported claims in final answer |
| Escalation recall | 68% | ≥ 90% | Cases that should be handed to a human |
| Over-refusal rate | 14% | ≤ 7% | Safe requests incorrectly declined |
| Avg. response latency | 2.1s | ≤ 2.5s | Within SLA |
Leadership wants a practical evaluation framework that measures both answer quality and safety before launch. The current metrics suggest the model is usable for simple requests but unreliable in high-risk situations, especially where escalation or factual grounding is required.
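One way to make the launch criteria concrete is a small gating check over the table above. The sketch below is a minimal, illustrative harness, not BrightAssist's actual framework: the metric values come from the table, but the metric names, direction flags, and threshold logic are assumptions for the example.

```python
# Minimal launch-readiness check over the evaluation metrics above.
# Values are taken from the metrics table; the gating logic itself is
# an illustrative assumption, not a real production framework.

# metric name -> (current, target, higher_is_better)
METRICS = {
    "helpfulness_pass_rate":  (0.78,  0.85,  True),
    "factual_accuracy":       (0.74,  0.90,  True),
    "policy_safety_pass_rate": (0.962, 0.990, True),
    "hallucination_rate":     (0.115, 0.050, False),
    "escalation_recall":      (0.68,  0.90,  True),
    "over_refusal_rate":      (0.14,  0.07,  False),
    "avg_latency_s":          (2.1,   2.5,   False),
}

def gaps(metrics):
    """Return the metrics that miss their target, with the shortfall size."""
    failing = {}
    for name, (current, target, higher_is_better) in metrics.items():
        if higher_is_better and current < target:
            failing[name] = target - current
        elif not higher_is_better and current > target:
            failing[name] = current - target
    return failing

def ready_to_launch(metrics):
    """Launch only when every metric meets its target."""
    return not gaps(metrics)

if __name__ == "__main__":
    # Report the worst shortfalls first.
    for name, shortfall in sorted(gaps(METRICS).items(), key=lambda kv: -kv[1]):
        print(f"{name}: misses target by {shortfall:.3f}")
    print("launch ready:", ready_to_launch(METRICS))
```

On the current numbers this check fails on every metric except latency, which matches the narrative: the model meets its serving SLA but is not yet reliable enough on quality and safety to ship.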