Improve Support Satisfaction with RAG

Context

BrightCart, an e-commerce platform for SMB merchants, wants an AI assistant inside its customer support chat to improve customer satisfaction for order, refund, shipping, and account questions. The current bot has fast response times but low trust because it sometimes gives confident but incorrect policy answers.

Constraints

p95 latency: 2,500 ms end-to-end
Cost ceiling: $12K/month at 400K support conversations/month
Customer satisfaction target: +4 point lift in post-chat CSAT
Hallucination ceiling: <2% on policy and refund questions
Escalation rate must not increase by more than 1 percentage point
Must resist prompt injection from user messages and retrieved documents
Must not expose PII or internal-only policy notes
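
The cost ceiling and conversation volume above imply a per-conversation budget, which is a useful first number to compute before sizing prompts. The sketch below derives it and adds a simple generation-cost formula. The function names, token counts, and per-token prices are hypothetical inputs, not figures from the problem statement; substitute the provider's current pricing.

```python
# Budget figures come from the constraints; everything else is a caller-supplied assumption.
MONTHLY_COST_CEILING_USD = 12_000
CONVERSATIONS_PER_MONTH = 400_000


def per_conversation_budget(ceiling_usd: float, conversations: int) -> float:
    """Average spend allowed per conversation under the monthly ceiling."""
    return ceiling_usd / conversations


def conversation_cost(
    turns: int,
    input_tokens_per_turn: int,
    output_tokens_per_turn: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Generation cost of one conversation for the given token counts and prices."""
    input_cost = turns * input_tokens_per_turn * usd_per_million_input / 1_000_000
    output_cost = turns * output_tokens_per_turn * usd_per_million_output / 1_000_000
    return input_cost + output_cost


budget = per_conversation_budget(MONTHLY_COST_CEILING_USD, CONVERSATIONS_PER_MONTH)
print(f"budget per conversation: ${budget:.3f}")  # 12,000 / 400,000 = $0.030
```

Comparing `conversation_cost(...)` against `budget` for a few plausible turn counts and context sizes shows how much headroom is left for embeddings, retrieval, and any guardrail calls.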

Available Resources

80K help-center articles, policy pages, and macro templates
2 years of anonymized support tickets with CSAT, escalation outcome, and resolution tags
Structured order metadata (order status, shipment ETA, refund eligibility)
Approved models: GPT-4.1-mini for generation, text-embedding-3-large for embeddings
Existing hybrid search stack (BM25 + vector search)
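
A hybrid stack has to merge the BM25 and vector result lists into one ranking. The problem statement does not say how BrightCart's stack does this; the sketch below assumes reciprocal rank fusion (RRF), one common choice, and the document IDs and the `k = 60` constant are illustrative only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["refund-policy", "shipping-times", "return-label"]
vector_hits = ["return-label", "refund-policy", "account-reset"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that both retrievers rank highly rise to the top, which suits policy questions where exact terms (BM25) and paraphrases (vectors) both matter.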

Task

Design an LLM-powered support assistant that uses retrieval and available structured data to improve customer satisfaction while staying within latency and cost limits.
Write the system prompt for grounded answers, safe refusal behavior, and clear escalation rules when the answer is uncertain or policy-sensitive.
Define an evaluation plan before architecture: include offline quality and safety evaluation, plus online metrics tied to CSAT, containment, and escalation.
Estimate request-level and monthly cost/latency, and explain what you would change if the system misses either budget.
Identify key failure modes, especially hallucinated policy answers, prompt injection, stale policies, and PII leakage, and propose mitigations.
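
One way to make the evaluation plan concrete is to encode the stated constraints as a release gate that an offline or online evaluation run must pass. The sketch below is a minimal version of that idea; the `EvalResult` fields and the `failed_constraints` helper are hypothetical names, and only the thresholds come from the constraints listed above.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    p95_latency_ms: float
    monthly_cost_usd: float
    csat_lift_points: float
    hallucination_rate: float       # fraction, measured on policy and refund questions
    escalation_delta_points: float  # percentage-point change versus the current bot


def failed_constraints(result: EvalResult) -> list[str]:
    """Return the names of the stated constraints that this run violates."""
    checks = {
        "p95 latency <= 2,500 ms": result.p95_latency_ms <= 2500,
        "cost <= $12K/month": result.monthly_cost_usd <= 12_000,
        "CSAT lift >= 4 points": result.csat_lift_points >= 4,
        "hallucination rate < 2%": result.hallucination_rate < 0.02,
        "escalation increase <= 1 pp": result.escalation_delta_points <= 1,
    }
    return [name for name, passed in checks.items() if not passed]


# Hypothetical measurements, for illustration only.
run = EvalResult(
    p95_latency_ms=2300,
    monthly_cost_usd=9_500,
    csat_lift_points=4.5,
    hallucination_rate=0.03,
    escalation_delta_points=0.4,
)
print(failed_constraints(run))
```

The injection-resistance and PII requirements are not captured by a single number here; they would need their own adversarial test sets with a pass/fail result per case.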
