Context
Intuit wants to add a generative AI assistant inside TurboTax and QuickBooks to answer user questions about tax guidance, bookkeeping workflows, and account-specific product help. The feature must operate in a regulated financial environment where incorrect or non-compliant answers can create legal, trust, and customer-support risk.
Constraints
- p95 latency: ≤2,500 ms for a grounded answer
- Cost ceiling: $0.03 per request ($150K/month at 5M requests/month)
- Hallucination ceiling: <1% on a high-risk golden set for tax/compliance questions
- Must cite approved sources for factual claims
- Must refuse or escalate when the answer depends on missing user context, regulated advice boundaries, or unsupported claims
- Must defend against prompt injection, PII leakage, and unauthorized retrieval across customer accounts
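The cost and latency constraints above can be sanity-checked numerically. A minimal sketch follows; the stage-level split of the 2,500 ms p95 budget is an illustrative assumption, not part of the brief.

```python
# Sanity check of the stated budgets. The per-stage latency split is a
# hypothetical allocation, not a requirement from the brief.

COST_PER_REQUEST = 0.03          # $ ceiling per request (from the brief)
REQUESTS_PER_MONTH = 5_000_000   # from the brief

monthly_cost = COST_PER_REQUEST * REQUESTS_PER_MONTH
assert monthly_cost == 150_000   # consistent with the $150K/month ceiling

# Assumed split of the 2,500 ms p95 budget across pipeline stages.
P95_BUDGET_MS = 2_500
latency_budget_ms = {
    "auth_and_routing": 100,
    "retrieval_and_rerank": 400,
    "llm_generation": 1_800,
    "guardrails_and_citation_check": 200,
}
assert sum(latency_budget_ms.values()) == P95_BUDGET_MS
```

Keeping an explicit per-stage budget like this makes it obvious which stage must shrink when a new guardrail or reranking pass is added.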
Available Resources
- Approved corpora: TurboTax help center, IRS publications, Intuit policy docs, QuickBooks support articles, internal compliance-approved response templates
- Structured metadata: doc version, jurisdiction, tax year, product surface, approval status, sensitivity label
- Models: one high-quality LLM, one lower-cost LLM, embeddings model, reranker
- Existing identity and authorization layer for QuickBooks/TurboTax users
- Compliance team can label 1,000 high-risk prompts and review monthly regressions
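The structured metadata listed above can back a hard pre-retrieval filter so only compliance-approved content is ever eligible for grounding. A sketch, assuming field names and a filter policy that are illustrative rather than an actual Intuit schema:

```python
from dataclasses import dataclass

# Hypothetical shape of the structured metadata listed above; field names
# and allowed values are assumptions for illustration.

@dataclass(frozen=True)
class DocMeta:
    doc_version: str
    jurisdiction: str        # e.g. "US-federal", "US-CA"
    tax_year: int
    product_surface: str     # e.g. "turbotax", "quickbooks"
    approval_status: str     # e.g. "approved", "draft"
    sensitivity_label: str   # e.g. "public", "internal"

def retrievable(meta: DocMeta, jurisdiction: str, tax_year: int,
                product: str) -> bool:
    """Hard pre-retrieval filter: only compliance-approved, public docs
    matching the user's jurisdiction, tax year, and product surface."""
    return (meta.approval_status == "approved"
            and meta.sensitivity_label == "public"
            and meta.jurisdiction == jurisdiction
            and meta.tax_year == tax_year
            and meta.product_surface == product)
```

Filtering on metadata before vector search (rather than after generation) is what keeps stale or unapproved guidance out of the context window entirely.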
Task
- Design an eval-first LLM system for this assistant, including how you would measure factuality, refusal quality, prompt-injection robustness, and compliance before launch.
- Propose the RAG architecture and prompt design needed to keep answers grounded in approved financial content while meeting latency and cost constraints.
- Explain how you would mitigate the main deployment risks in a regulated environment: hallucinations, stale guidance, prompt injection, PII exposure, cross-tenant data leakage, and overconfident advice.
- Define the online monitoring and rollout plan, including guardrails, escalation paths to human support or CPA/tax expert workflows, and rollback criteria.
- Estimate cost/latency tradeoffs and identify where you would use smaller vs. larger models.
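One way to frame the last task: route high-risk tax/compliance queries to the high-quality LLM and routine product-help queries to the lower-cost LLM, then check the blended cost against the $0.03 ceiling. The sketch below is a toy illustration; the keyword list, model names, and per-model costs are placeholder assumptions (a production router would use a trained risk classifier, not keywords).

```python
# Hypothetical risk-based router. Keywords and cost figures are placeholder
# assumptions for illustrating the small-vs-large model tradeoff.

HIGH_RISK_TERMS = {"deduction", "irs", "audit", "filing", "penalty", "1099"}

def route_model(query: str) -> str:
    """Send queries touching tax/compliance topics to the stronger model."""
    tokens = set(query.lower().split())
    if tokens & HIGH_RISK_TERMS:
        return "high_quality_llm"   # slower, costlier, lower hallucination rate
    return "low_cost_llm"           # cheaper path for routine product help

def blended_cost(high_risk_share: float,
                 high_cost: float = 0.025, low_cost: float = 0.004) -> float:
    """Expected per-request cost given the share routed to the big model."""
    return high_risk_share * high_cost + (1 - high_risk_share) * low_cost
```

Under these assumed per-model costs, even routing 100% of traffic to the high-quality model stays under the $0.03 ceiling, so the router's real job is margin for retrieval/guardrail spend and latency headroom, not just raw cost.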