Design Real-Time Support Chatbot

Context

ShopWave, a mid-market e-commerce platform, wants an LLM-powered chatbot for real-time customer engagement across web and mobile. The bot should answer order-status, returns, billing, and product-policy questions, and escalate to a human agent when confidence is low or the request is sensitive.
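To make the escalation rule above concrete, here is a minimal routing sketch. It assumes the classifier returns an intent label and a confidence score. The intent names, the 0.7 threshold, and the `route` function are illustrative assumptions, not part of the brief.

```python
from dataclasses import dataclass

# Assumed labels and threshold; a real system would tune these on logged data.
SENSITIVE_INTENTS = {"billing_dispute", "account_security"}
CONFIDENCE_THRESHOLD = 0.7


@dataclass(frozen=True)
class Classification:
    intent: str
    confidence: float


def route(c: Classification) -> str:
    """Return the next pipeline stage: 'handoff', 'tool', or 'retrieval'."""
    # Sensitive requests go to a human regardless of confidence.
    if c.intent in SENSITIVE_INTENTS:
        return "handoff"
    # Low-confidence classifications also escalate.
    if c.confidence < CONFIDENCE_THRESHOLD:
        return "handoff"
    # Order lookups need a tool call; everything else is answered from content.
    if c.intent == "order_status":
        return "tool"
    return "retrieval"


print(route(Classification("order_status", 0.93)))
print(route(Classification("returns_policy", 0.41)))
```

Checking sensitivity before confidence means a confidently classified sensitive request still reaches a human.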

Constraints

p95 end-to-end latency: 2,500 ms for standard Q&A, 4,000 ms for tool-backed order lookups
Cost ceiling: $35K/month at 1.2M conversations/month
Hallucination ceiling: <2% on policy and account-related answers
Must resist prompt injection from user messages and retrieved content
Must not expose PII or account data without authentication and authorization
Responses should be grounded in approved help-center and policy content, with citations for factual claims
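The cost ceiling implies a per-conversation budget, which is a useful first step for the estimate. The sketch below derives it from the two figures in the constraints. The turn count, token counts, and per-1K-token prices in the second function are placeholder assumptions, not real model pricing.

```python
MONTHLY_BUDGET_USD = 35_000
CONVERSATIONS_PER_MONTH = 1_200_000


def per_conversation_budget(monthly_budget_usd: float, conversations: int) -> float:
    """Average spend allowed per conversation under the monthly ceiling."""
    return monthly_budget_usd / conversations


def estimated_conversation_cost(
    turns: int,
    input_tokens_per_turn: int,
    output_tokens_per_turn: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> float:
    """Rough LLM cost of one conversation; all inputs are assumptions."""
    per_turn = (
        input_tokens_per_turn / 1000 * price_in_per_1k
        + output_tokens_per_turn / 1000 * price_out_per_1k
    )
    return turns * per_turn


budget = per_conversation_budget(MONTHLY_BUDGET_USD, CONVERSATIONS_PER_MONTH)
print(f"${budget:.4f} per conversation")  # $0.0292 per conversation

# Hypothetical workload: 4 turns, 2,000 input and 200 output tokens per turn.
cost = estimated_conversation_cost(4, 2000, 200, 0.001, 0.004)
print(f"within budget: {cost <= budget}")
```

The budget of roughly 2.9 cents has to cover every turn in a conversation plus retrieval, reranking, and infrastructure, so the LLM share has to fit well below it.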

Available Resources

120K help-center articles, return/shipping policies, product FAQs, and agent macros
Structured tools: get_order_status(order_id, user_id), create_return(order_id, item_id), handoff_to_agent(reason)
Conversation logs from the current rules-based chatbot, including CSAT and escalation outcomes
Approved models: a fast small model for classification/routing and a stronger model for grounded answer generation
Existing hybrid search stack (BM25 + vector search) and a reranker service
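One way to tie the structured tools to the PII constraint is to take `user_id` from the authenticated session rather than from model output. The sketch below illustrates that idea. The `Session` type, the in-memory `_ORDERS` store, the return values, and the `guarded_order_lookup` wrapper are hypothetical stand-ins; only the tool names and their parameters come from the brief.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    user_id: Optional[str]  # None until the user has authenticated


# Hypothetical in-memory stand-in for the real order service.
_ORDERS = {"A100": {"owner": "u1", "status": "shipped"}}


def get_order_status(order_id: str, user_id: str) -> str:
    """Return the order status only if the order belongs to user_id."""
    order = _ORDERS.get(order_id)
    if order is None or order["owner"] != user_id:
        # Same error for "missing" and "not yours" to avoid leaking existence.
        raise PermissionError("order not found for this user")
    return order["status"]


def handoff_to_agent(reason: str) -> str:
    """Stub: a real implementation would open a ticket or live-chat transfer."""
    return f"handoff:{reason}"


def guarded_order_lookup(session: Session, order_id: str) -> str:
    """Call the tool with the session's user_id, never one chosen by the model."""
    if session.user_id is None:
        return "auth_required"
    try:
        return get_order_status(order_id, session.user_id)
    except PermissionError:
        return handoff_to_agent("order_not_authorized")


print(guarded_order_lookup(Session("u1"), "A100"))
print(guarded_order_lookup(Session(None), "A100"))
```

Because the model only supplies `order_id`, an injected instruction cannot redirect the lookup to another customer's account.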

Task

Design the end-to-end chatbot architecture, including intent classification, retrieval, tool use, escalation logic, and safety controls.
Write the system prompt for the answer-generation stage so the bot stays grounded, asks clarifying questions when needed, and refuses unsupported claims.
Define an evaluation plan before implementation: offline golden sets, adversarial prompt-injection tests, hallucination measurement, and online success metrics.
Estimate latency and cost at target volume, and explain how you would stay within both budgets.
Identify the top failure modes in production and propose mitigations, monitoring, and rollback criteria.
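For the hallucination measurement, a point estimate on a small golden set can look compliant by chance. The sketch below checks the <2% ceiling against the upper end of a Wilson score interval. Using a Wilson bound, the 1.96 z-value, and the function names are assumptions made for illustration; the brief does not prescribe a statistical method.

```python
import math

HALLUCINATION_CEILING = 0.02  # from the constraints: <2%


def wilson_upper_bound(failures: int, n: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for a failure proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center + margin) / (1 + z * z / n)


def meets_ceiling(failures: int, n: int) -> bool:
    """True only if even the pessimistic estimate is under the ceiling."""
    return wilson_upper_bound(failures, n) < HALLUCINATION_CEILING


# 1 hallucination in 100 graded answers: observed 1%, but too few samples.
print(meets_ceiling(1, 100))
# 5 hallucinations in 1,000 graded answers: observed 0.5%, enough evidence.
print(meets_ceiling(5, 1000))
```

The same check could serve as a rollback criterion when applied to a rolling sample of graded production answers.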
