Context
FinSure, a B2B insurance platform, wants an internal agentic AI assistant for operations analysts. The assistant should answer policy and claims questions, summarize customer cases, and take limited actions such as creating follow-up tasks or drafting emails. It will operate over internal knowledge bases and a few approved tools, but must be safe for production use in a regulated environment.
Constraints
- p95 end-to-end latency: ≤3,500 ms for read-only requests; ≤6,000 ms for action-taking requests
- Cost ceiling: $35K/month at 200K requests/month, i.e. about $0.175 per request (worked budget after this list)
- Hallucination ceiling: <2% rate of materially unsupported statements on a 400-task golden set
- Prompt-injection success rate: <1% on adversarial evals
- No raw PII or policy documents may be sent to non-approved external systems
- All tool actions must be auditable and require policy-aware authorization
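
The cost ceiling implies a hard per-request budget; the arithmetic below works it out from the numbers above. The per-call prices are illustrative assumptions, not quoted rates for the approved models.

```python
# Per-request budget implied by the cost ceiling above.
monthly_ceiling_usd = 35_000
monthly_requests = 200_000
budget = monthly_ceiling_usd / monthly_requests
print(f"${budget:.3f} per request")  # $0.175

# Illustrative split under ASSUMED prices (not real quotes): one
# frontier-model call at ~$0.15 plus two fast-model calls at ~$0.01 each
# is $0.17, nearly the whole budget. Most traffic therefore has to stay
# on the fast model, with frontier calls rationed.
frontier_call, fast_call = 0.15, 0.01
print(f"${frontier_call + 2 * fast_call:.2f} for one frontier + two fast calls")
```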
Available Resources
- 1.2M internal documents: policy manuals, SOPs, claims playbooks, compliance memos, and ticket history
- Approved models: one high-quality frontier model and one cheaper, faster model
- Internal tools: document search, customer profile lookup, task creation, email drafting, and case status APIs
- Existing IAM, document ACLs, audit logging, and DLP/PII redaction services (egress-gate sketch after this list)
- A security team that can label adversarial prompt-injection and data-exfiltration test cases
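
To make the PII constraint concrete, here is a minimal sketch of an egress gate that routes every outbound payload through the existing DLP/PII redaction service unless the destination is on an internal allowlist. All names here (`APPROVED_DESTINATIONS`, `redact_pii`, `release`) are invented stand-ins, not FinSure APIs.

```python
from dataclasses import dataclass

# Invented allowlist standing in for FinSure's approved internal systems.
APPROVED_DESTINATIONS = {"document-search", "case-status-api", "task-api"}

@dataclass
class OutboundPayload:
    destination: str
    text: str

def redact_pii(text: str) -> str:
    """Stand-in for the existing DLP/PII redaction service."""
    return text.replace("SSN:", "[REDACTED]")  # toy rule only

def release(payload: OutboundPayload) -> str:
    """Gate every egress: approved systems may receive raw text;
    anything else gets the redacted version."""
    if payload.destination in APPROVED_DESTINATIONS:
        return payload.text
    return redact_pii(payload.text)
```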
Deliverables
- Design a production architecture for the agentic assistant that supports retrieval, tool use, authorization, and auditability while preserving data privacy (sketch 1 after this list).
- Define an eval-first plan: offline evaluation before launch and online monitoring after launch, including hallucination, prompt injection, privacy leakage, and task success (sketch 2).
- Write the system prompt and tool-use policy that constrain the agent's behavior, including refusal and escalation rules (the TOOL_POLICY table in sketch 1 is one shape the machine-readable half can take).
- Explain cost/latency tradeoffs, including when to use the cheaper model, when to avoid agent loops, and how to cap tool calls (sketch 3).
- Identify the top failure modes in production and how you would detect and mitigate them (sketch 4).
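
Sketch 1 (architecture and tool-use-policy deliverables): one possible shape for policy-aware authorization and audit logging around every tool call. The tool names mirror the approved internal tools; `iam_has_permission`, `audit_log`, and `execute` are stubs standing in for the existing IAM, audit-logging, and tool APIs.

```python
import json
import time
import uuid

# Tool-use policy: approved tools only, with the IAM permission each
# requires and whether it mutates state (mutating tools fall under the
# 6,000 ms action budget and the refusal/escalation rules).
TOOL_POLICY = {
    "document_search":  {"mutates": False, "permission": "docs.read"},
    "customer_profile": {"mutates": False, "permission": "crm.read"},
    "case_status":      {"mutates": False, "permission": "claims.read"},
    "create_task":      {"mutates": True,  "permission": "tasks.write"},
    "draft_email":      {"mutates": True,  "permission": "email.draft"},
}

def iam_has_permission(user_id: str, permission: str) -> bool:
    return True  # stub for the existing IAM service

def audit_log(line: str) -> None:
    print(line)  # stub for the existing audit-logging service

def execute(tool: str, args: dict) -> dict:
    return {"ok": True}  # stub dispatcher to the real tool APIs

def call_tool(user_id: str, tool: str, args: dict) -> dict:
    policy = TOOL_POLICY.get(tool)
    if policy is None:
        raise PermissionError(f"tool {tool!r} is not in the approved set")
    if not iam_has_permission(user_id, policy["permission"]):
        raise PermissionError(f"{user_id} lacks {policy['permission']}")
    # Write the audit record BEFORE executing, so refused or failed
    # actions are still traceable.
    audit_log(json.dumps({"id": str(uuid.uuid4()), "ts": time.time(),
                          "user": user_id, "tool": tool, "args": args}))
    return execute(tool, args)
```

A table like TOOL_POLICY can double as the machine-readable half of the tool-use policy, with the system prompt carrying the natural-language refusal and escalation rules.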
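Sketch 2 (eval-first deliverable): an offline launch gate for the two hardest constraint metrics. Treating the hallucination ceiling as "fraction of golden-set tasks with at least one materially unsupported statement" is one plausible operationalization, and the grading fields are assumed to come from human or model graders.

```python
def hallucination_rate(graded_tasks: list[dict]) -> float:
    """Fraction of golden-set tasks flagged with >=1 materially
    unsupported statement (one way to operationalize the <2% ceiling)."""
    flagged = sum(1 for t in graded_tasks if t["unsupported_statements"] > 0)
    return flagged / len(graded_tasks)

def injection_success_rate(attack_outcomes: list[bool]) -> float:
    """Fraction of security-team adversarial prompts that succeeded."""
    return sum(attack_outcomes) / len(attack_outcomes)

def launch_gate(graded_tasks: list[dict], attack_outcomes: list[bool]) -> bool:
    """Block launch unless both offline metrics clear the constraints."""
    return (hallucination_rate(graded_tasks) < 0.02
            and injection_success_rate(attack_outcomes) < 0.01)

# Example: a 400-task golden set with 5 flagged tasks (1.25%) and a
# 200-case adversarial set with 1 success (0.5%) clears the gate.
golden = ([{"unsupported_statements": 1}] * 5
          + [{"unsupported_statements": 0}] * 395)
attacks = [True] + [False] * 199
assert launch_gate(golden, attacks)
```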
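Sketch 3 (cost/latency deliverable): a routing heuristic plus a hard tool-call cap. Serving read-only questions with the fast model and reserving the frontier model for action-taking requests is an assumption to validate against the golden set, not a rule from the brief.

```python
MAX_TOOL_CALLS = 4   # assumed cap; tune so worst-case loops fit p95 latency
ACTION_TOOLS = {"create_task", "draft_email"}

def pick_model(planned_tools: set[str]) -> str:
    """Route by request type: frontier model only when actions are in play."""
    return "frontier" if planned_tools & ACTION_TOOLS else "fast"

def run_agent(request: str, planned_tools: set[str]) -> str:
    model = pick_model(planned_tools)
    for _ in range(MAX_TOOL_CALLS):
        # one plan -> tool-call -> observe iteration with `model` goes here;
        # return as soon as the agent has a grounded answer
        ...
    # Cap reached without an answer: refuse or escalate to a human rather
    # than keep looping; unbounded loops break both budgets.
    return "escalate_to_human"
```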
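Sketch 4 (failure-modes deliverable): a toy online monitor whose alert thresholds mirror the offline ceilings. The event names are invented; in production they would be counters emitted by the gates in sketch 1 and the egress gate above.

```python
from collections import Counter

class ProductionMonitor:
    """Counts per-request events and raises alerts when live rates
    approach the offline ceilings. Event names are illustrative."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, event: str) -> None:
        # expected events: "request", "unsupported", "injection_attempt",
        # "pii_egress_blocked"
        self.counts[event] += 1

    def alerts(self) -> list[str]:
        total = max(self.counts["request"], 1)
        out = []
        if self.counts["unsupported"] / total >= 0.02:
            out.append("sampled hallucination rate at the 2% ceiling")
        if self.counts["injection_attempt"] / total >= 0.01:
            out.append("injection attempts above 1% of traffic")
        if self.counts["pii_egress_blocked"] > 0:
            out.append("PII egress attempted: page security immediately")
        return out
```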