Context
BrightDesk, a SaaS helpdesk platform, wants an LLM-powered assistant for support agents handling live customer chats. The assistant should draft grounded responses, summarize account context, and suggest next actions so agents can respond faster without giving incorrect policy or product guidance.
Constraints
- p95 latency: ≤1,500 ms per assistant turn
- Cost ceiling: $12K/month at 300K assistant turns/month
- Accuracy bar: at least 85% of drafts rated "acceptable without major edits" on an internal review set
- Hallucination ceiling: fewer than 2% of responses may contain unsupported product or policy claims
- Safety: must resist prompt injection from pasted customer text, must not leak PII across accounts, and must refuse unsupported billing/legal claims
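For calibration, the cost ceiling implies a hard per-turn budget of $12K / 300K = $0.04. A back-of-envelope sketch of how that budget compares to likely per-turn model spend; the token profile and GPT-4.1-mini per-token prices here are illustrative assumptions, not quoted rates:

```python
# Back-of-envelope cost check for the $12K/month ceiling at 300K turns/month.
MONTHLY_BUDGET_USD = 12_000
TURNS_PER_MONTH = 300_000

per_turn_budget = MONTHLY_BUDGET_USD / TURNS_PER_MONTH  # $0.04 per turn

# Assumed (illustrative) GPT-4.1-mini prices per 1M tokens -- verify current rates.
PRICE_IN_PER_M = 0.40   # USD per 1M input tokens
PRICE_OUT_PER_M = 1.60  # USD per 1M output tokens

# Assumed per-turn token profile: retrieved articles + CRM context + instructions.
tokens_in, tokens_out = 6_000, 500
per_turn_cost = (tokens_in / 1e6) * PRICE_IN_PER_M + (tokens_out / 1e6) * PRICE_OUT_PER_M

print(f"budget ${per_turn_budget:.3f}/turn, est. generation cost ${per_turn_cost:.4f}/turn")
```

Under these assumptions generation is roughly a tenth of the budget, leaving headroom for embeddings, retries, and occasional escalation to a larger model on hard turns.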
Available Resources
- 40K help center articles, internal SOPs, and policy docs
- CRM metadata for the active customer: plan tier, open tickets, product usage summary, and recent chat history
- Approved models: GPT-4.1-mini for generation, a smaller embedding model for retrieval
- Existing search stack supports BM25 and vector search
- Historical support conversations with agent edits and CSAT outcomes
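Since the existing stack already exposes both BM25 and vector search, one common way to combine them for grounding is reciprocal rank fusion. A minimal sketch (the document IDs and the damping constant `k` are illustrative):

```python
from collections import defaultdict

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked doc-ID lists.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the commonly used damping constant.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: BM25 and vector search disagree; fusion rewards agreement.
bm25 = ["sop-billing-7", "kb-refunds-2", "kb-plans-9"]
vect = ["kb-refunds-2", "kb-plans-9", "sop-billing-7"]
print(rrf([bm25, vect]))  # → ['kb-refunds-2', 'sop-billing-7', 'kb-plans-9']
```

RRF needs no score normalization across the two retrievers, which makes it a low-risk default for a six-week MVP.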
Task
- Design a practical LLM solution for this customer-facing workflow, including prompt design and when to use retrieval versus account context.
- Define an evaluation plan first: offline golden set, hallucination measurement, and online success metrics after launch.
- Propose safeguards for prompt injection, unsupported answers, and PII handling in a live support environment.
- Estimate cost and latency at the target volume, and explain the main tradeoffs.
- Describe how you would measure whether the assistant actually improved customer outcomes, not just model quality.
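The offline portion of the evaluation plan can be wired up as a simple release gate over reviewed drafts. A sketch assuming per-draft judgments come from human review or an LLM judge; the field names are hypothetical, but the thresholds mirror the constraints above:

```python
from dataclasses import dataclass

@dataclass
class Judgment:
    acceptable: bool         # "acceptable without major edits" per reviewer
    unsupported_claim: bool  # contains an unsupported product/policy claim

def release_gate(judgments: list[Judgment],
                 min_acceptable: float = 0.85,
                 max_hallucination: float = 0.02) -> bool:
    """Pass only if both the accuracy bar and the hallucination ceiling hold."""
    n = len(judgments)
    acceptable_rate = sum(j.acceptable for j in judgments) / n
    hallucination_rate = sum(j.unsupported_claim for j in judgments) / n
    return acceptable_rate >= min_acceptable and hallucination_rate < max_hallucination

# Toy run: 9/10 acceptable, but 1/10 hallucinated -> fails the 2% ceiling.
batch = [Judgment(True, False)] * 9 + [Judgment(False, True)]
print(release_gate(batch))  # False
```

Running this gate on each prompt or retrieval change keeps the two launch constraints enforced mechanically rather than by ad-hoc inspection.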
Your answer should be concrete. Assume you are the engineer responsible for shipping an MVP in six weeks with one product manager and two backend engineers.