Context
ForgeFlow sells an AI assistant for software teams that can answer questions about internal code/docs, draft pull request summaries, suggest fixes for CI failures, and optionally open tickets or propose code changes. The customer is a regulated enterprise and wants to deploy it across their engineering workflow without creating security, reliability, or compliance incidents.
Constraints
- p95 latency: ≤3,000ms for read-only tasks; ≤8,000ms for actions that call tools
- Cost ceiling: $35K/month at 40K requests/day
- Hallucination ceiling: <2% on high-risk tasks (code/config/security guidance), measured on a labeled golden set
- Prompt-injection success rate: <0.5% on adversarial evals
- Any action that changes state must be auditable, permission-scoped, and human-approved by default
- The system must avoid leaking secrets, proprietary code, or cross-team data
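The numeric constraints above double as launch gates, so they can be checked mechanically. A minimal sketch, assuming the metrics are already computed elsewhere; the `EvalMetrics` fields and function names are illustrative placeholders, not an existing ForgeFlow API:

```python
# Encode the constraints above as an automated release-gate check.
# Thresholds are taken directly from the Constraints list.
from dataclasses import dataclass

@dataclass
class EvalMetrics:
    p95_read_ms: float             # p95 latency, read-only tasks
    p95_action_ms: float           # p95 latency, tool-calling actions
    monthly_cost_usd: float
    hallucination_rate: float      # measured on the high-risk golden set
    injection_success_rate: float  # measured on adversarial evals

def passes_launch_gates(m: EvalMetrics) -> list[str]:
    """Return the list of violated gates; an empty list means ship."""
    violations = []
    if m.p95_read_ms > 3000:
        violations.append("p95 read latency > 3,000ms")
    if m.p95_action_ms > 8000:
        violations.append("p95 action latency > 8,000ms")
    if m.monthly_cost_usd > 35_000:
        violations.append("cost above $35K/month ceiling")
    if m.hallucination_rate >= 0.02:
        violations.append("hallucination rate >= 2% on high-risk tasks")
    if m.injection_success_rate >= 0.005:
        violations.append("prompt-injection success rate >= 0.5%")
    return violations
```

The same predicate can serve as both an offline launch gate and an online rollback trigger, so the two never drift apart.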
Available Resources
- 2M internal artifacts: code files, PRs, runbooks, incident docs, RFCs, CI logs, and issue tracker tickets
- Read-only tools for GitHub, Jira, CI, and internal docs; write tools exist but can be gated behind approval
- Approved models: a fast small model, a stronger general-purpose model, and an embedding model
- Security team can provide 200 adversarial prompt-injection examples and 100 secret-leakage test cases
- 25 staff engineers can label a 600-task golden set
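The split between open read-only tools and approval-gated write tools can be enforced at a single dispatch point. A sketch under stated assumptions: the tool names, the `ApprovalRequired` flow, and the stub integrations are hypothetical, not the actual GitHub/Jira/CI APIs:

```python
# Gate write tools behind human approval while leaving reads open.
READ_ONLY_TOOLS = {"github_read", "jira_read", "ci_read", "docs_read"}
WRITE_TOOLS = {"jira_create_ticket", "github_open_pr"}

class ApprovalRequired(Exception):
    """Raised when a state-changing tool is invoked without sign-off."""

def dispatch_tool(name: str, args: dict, approved: bool = False):
    if name in READ_ONLY_TOOLS:
        return _call(name, args)               # no gate on reads
    if name in WRITE_TOOLS:
        if not approved:
            # Route to a human reviewer instead of executing.
            raise ApprovalRequired(f"{name} needs human sign-off")
        _audit_log(name, args)                 # auditable by design
        return _call(name, args)
    raise ValueError(f"unknown tool: {name}")

def _call(name, args):
    # Stub standing in for the real integrations.
    return {"tool": name, "args": args}

def _audit_log(name, args):
    print(f"AUDIT {name} {args}")
```

Keeping the gate in the dispatcher rather than in each tool means no individual integration can forget to check for approval.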
Task
- Design a safe LLM architecture for this product, including which capabilities should be read-only, which can use tools, and where human approval is required.
- Define an eval-first deployment plan: offline evals, online metrics, launch gates, and rollback criteria.
- Specify how you would reduce hallucination, prompt injection, and data leakage risk while staying within the latency and cost limits.
- Write a production-grade system prompt for the assistant that handles grounded answering, tool use, and refusal behavior.
- Estimate cost/latency and identify the top failure modes, detection signals, and mitigations.
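For the cost/latency deliverable in the last bullet, a back-of-envelope check shows what the $35K/month ceiling implies per request. The per-request model prices and the 80/20 routing split below are illustrative assumptions, not quotes for the approved models:

```python
# What the $35K/month ceiling means per request, and whether a
# small/large model routing split fits under it.
daily_requests = 40_000
monthly_requests = daily_requests * 30           # 1,200,000 requests/month
budget_per_request = 35_000 / monthly_requests   # ~$0.029 per request

# Hypothetical blended cost: route 80% of traffic to the fast small
# model and 20% to the stronger general-purpose model.
small_cost, large_cost = 0.004, 0.06             # assumed $/request
blended = 0.8 * small_cost + 0.2 * large_cost    # $0.0152

print(f"budget/request: ${budget_per_request:.4f}")
print(f"blended cost:   ${blended:.4f}  (fits: {blended <= budget_per_request})")
```

The margin between the blended cost and the per-request budget is what pays for embeddings, retrieval, and guardrail passes, so the routing split should be tuned with those overheads included.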