Design a Safe Grounded Support Prompt

Context

FinSure, a consumer insurance app, is launching an LLM assistant that answers policy questions and drafts support replies. The assistant must reduce hallucinations and avoid exposing sensitive user inputs such as SSNs, claim IDs, medical details, and payment data.

Constraints

p95 latency: 1,200ms end-to-end
Cost ceiling: $8K/month at 300K requests/month
Hallucination rate: <2% on a 250-question golden set
Prompt-injection success rate: <1% on adversarial tests
Sensitive input leakage in logs or model output: 0 tolerated incidents
The assistant should answer directly when grounded, ask one clarifying question when needed, and refuse when evidence is missing

Available Resources

User messages from support chat, often containing PII and free-form descriptions
Retrieved policy snippets from an internal knowledge base (top-5 passages, already permission-filtered)
One approved LLM provider (OpenAI or Anthropic)
A lightweight PII redaction service available before model invocation
Historical labeled transcripts: 5,000 examples with tags for correct answer, hallucination, refusal, and privacy violation

Task

Design a production-ready system prompt that minimizes hallucinations, treats retrieved policy text as the only factual source, and prevents the model from repeating or storing unnecessary sensitive user data.
Explain how you would structure the prompt inputs, including user message, redacted fields, retrieved policy snippets, and explicit refusal / clarification behavior.
Define an evaluation plan first: offline tests for hallucination, privacy leakage, and prompt injection; then online metrics for quality and safety after launch.
Propose the serving architecture around the prompt, including redaction, retrieval packaging, output validation, and logging controls.
Estimate cost and latency, and describe the main tradeoffs between stricter safety rules, answer helpfulness, and operational cost.

Constraints

p95 latency: 1,200ms end-to-end

Cost ceiling: $8K/month at 300K requests/month

Hallucination rate: <2% on a 250-question golden set

Prompt-injection success rate: <1% on adversarial tests

Sensitive input leakage in logs or model output: 0 tolerated incidents

The assistant should answer directly when grounded, ask one clarifying question when needed, and refuse when evidence is missing

Available Resources

User messages from support chat, often containing PII and free-form descriptions

Retrieved policy snippets from an internal knowledge base (top-5 passages, already permission-filtered)

One approved LLM provider (OpenAI or Anthropic)

A lightweight PII redaction service available before model invocation

Historical labeled transcripts: 5,000 examples with tags for correct answer, hallucination, refusal, and privacy violation

Task

Design a production-ready system prompt that minimizes hallucinations, treats retrieved policy text as the only factual source, and prevents the model from repeating or storing unnecessary sensitive user data.

Explain how you would structure the prompt inputs, including user message, redacted fields, retrieved policy snippets, and explicit refusal / clarification behavior.

Define an evaluation plan first: offline tests for hallucination, privacy leakage, and prompt injection; then online metrics for quality and safety after launch.

Propose the serving architecture around the prompt, including redaction, retrieval packaging, output validation, and logging controls.

Estimate cost and latency, and describe the main tradeoffs between stricter safety rules, answer helpfulness, and operational cost.

Constraints

p95 latency: 1,200ms end-to-end

Cost ceiling: $8K/month at 300K requests/month

Hallucination rate: <2% on a 250-question golden set

Prompt-injection success rate: <1% on adversarial tests

Sensitive input leakage in logs or model output: 0 tolerated incidents

The assistant should answer directly when grounded, ask one clarifying question when needed, and refuse when evidence is missing

Available Resources

User messages from support chat, often containing PII and free-form descriptions

Retrieved policy snippets from an internal knowledge base (top-5 passages, already permission-filtered)

One approved LLM provider (OpenAI or Anthropic)

A lightweight PII redaction service available before model invocation

Historical labeled transcripts: 5,000 examples with tags for correct answer, hallucination, refusal, and privacy violation

Task

Explain how you would structure the prompt inputs, including user message, redacted fields, retrieved policy snippets, and explicit refusal / clarification behavior.

Define an evaluation plan first: offline tests for hallucination, privacy leakage, and prompt injection; then online metrics for quality and safety after launch.

Propose the serving architecture around the prompt, including redaction, retrieval packaging, output validation, and logging controls.

Estimate cost and latency, and describe the main tradeoffs between stricter safety rules, answer helpfulness, and operational cost.

Constraints

p95 latency: 1,200ms end-to-end

Cost ceiling: $8K/month at 300K requests/month

Hallucination rate: <2% on a 250-question golden set

Prompt-injection success rate: <1% on adversarial tests

Sensitive input leakage in logs or model output: 0 tolerated incidents

The assistant should answer directly when grounded, ask one clarifying question when needed, and refuse when evidence is missing

Available Resources

User messages from support chat, often containing PII and free-form descriptions

Retrieved policy snippets from an internal knowledge base (top-5 passages, already permission-filtered)

One approved LLM provider (OpenAI or Anthropic)

A lightweight PII redaction service available before model invocation

Historical labeled transcripts: 5,000 examples with tags for correct answer, hallucination, refusal, and privacy violation

Task

Explain how you would structure the prompt inputs, including user message, redacted fields, retrieved policy snippets, and explicit refusal / clarification behavior.

Define an evaluation plan first: offline tests for hallucination, privacy leakage, and prompt injection; then online metrics for quality and safety after launch.

Propose the serving architecture around the prompt, including redaction, retrieval packaging, output validation, and logging controls.

Estimate cost and latency, and describe the main tradeoffs between stricter safety rules, answer helpfulness, and operational cost.

Interview Guides

Context

Constraints

Available Resources

Task

Design a Safe Grounded Support Prompt

Context

Constraints

Available Resources

Task

Your Answer

Design a Safe Grounded Support Prompt

Context

Constraints

Available Resources

Task

Design a Safe Grounded Support Prompt

Context

Constraints

Available Resources

Task

Your Answer