Automate Engineering Workflow Triage

Context

OrbitOps wants an internal AI assistant that helps engineers handle routine workflow tasks: triaging CI failures, summarizing incident context, drafting Jira tickets, and answering questions about runbooks and service ownership. The goal is to reduce interrupt load on senior engineers without creating unsafe or misleading automation.

Constraints

p95 latency: 3,000 ms for a single-turn request
Cost ceiling: $12K/month at 200K requests/month
Hallucination ceiling: <2% on high-risk actions (ownership, runbook steps, incident status)
Automation policy: the assistant may draft or recommend actions, but cannot execute production changes
Safety: must resist prompt injection from tickets, logs, or docs; must not reveal secrets or hidden system instructions
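
The cost and hallucination constraints above reduce to simple arithmetic that is useful to keep in view while designing. The sketch below derives the per-request budget from the stated ceiling and volume, and checks a measured error count against the <2% ceiling. The per-call prices and the escalation fraction in the usage lines are hypothetical placeholders, not figures from the brief, and the sketch assumes the routing model runs on every request.

```python
# Figures taken directly from the constraints.
MONTHLY_COST_CEILING_USD = 12_000
MONTHLY_REQUESTS = 200_000
HALLUCINATION_CEILING = 0.02  # strict upper bound on high-risk actions

# Average spend allowed per request: 12,000 / 200,000 = 0.06 USD.
PER_REQUEST_BUDGET_USD = MONTHLY_COST_CEILING_USD / MONTHLY_REQUESTS


def blended_cost(router_cost: float, strong_cost: float, strong_fraction: float) -> float:
    """Expected cost per request when the router always runs and only a
    fraction of requests is escalated to the stronger model (an assumption)."""
    return router_cost + strong_fraction * strong_cost


def within_hallucination_ceiling(errors: int, total: int) -> bool:
    """True if the observed rate on high-risk actions is strictly below 2%."""
    if total <= 0:
        raise ValueError("total must be positive")
    return errors / total < HALLUCINATION_CEILING


if __name__ == "__main__":
    print(f"per-request budget: ${PER_REQUEST_BUDGET_USD:.2f}")
    # Hypothetical prices: $0.001 per routing call, $0.05 per strong-model call,
    # with half of all requests escalated.
    print(f"blended cost: ${blended_cost(0.001, 0.05, 0.5):.3f}")
```

Under these placeholder prices the blended cost stays below the $0.06 budget; real prices and escalation rates would need to be measured before relying on that margin.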

Available Resources

120K internal documents: runbooks, postmortems, service catalog entries, RFCs, and on-call guides
Tool APIs: Jira (create draft ticket), PagerDuty (read incidents), CI provider (read build status), service catalog lookup, and internal search
Approved models: a fast low-cost model for routing and a stronger model for final responses
500 historical workflow examples with human-written resolutions

Task

Design an LLM-powered workflow assistant that decides when to answer directly, when to retrieve documentation, and when to call read-only tools before producing a response.
Define an evaluation plan first: offline golden sets, adversarial prompt-injection tests, hallucination measurement, and online success metrics after launch.
Write a system prompt that enforces grounded behavior, tool-use boundaries, refusal behavior, and structured outputs for downstream systems.
Propose the architecture, including retrieval, agent orchestration, fallback behavior, and how you would stay within the latency and cost budget.
Identify the major failure modes and mitigations, especially around hallucinated remediation steps, injected instructions in logs, and stale documentation.
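
One way to make the first task concrete is a small router that picks between answering directly, retrieving documentation, and calling a read-only tool, with an allow-list that enforces the automation policy. The sketch below is illustrative only: the keyword rules stand in for the fast routing model, and the tool identifiers, the `Decision` type, and the function names are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    DIRECT = "direct"      # answer without external context
    RETRIEVE = "retrieve"  # search internal documents first
    TOOL = "tool"          # call a read-only tool first


# Hypothetical identifiers for the tools listed under Available Resources.
# Jira is limited to draft creation; nothing here changes production.
ALLOWED_TOOLS = frozenset({
    "jira_create_draft",
    "pagerduty_read_incidents",
    "ci_read_build_status",
    "service_catalog_lookup",
    "internal_search",
})


@dataclass
class Decision:
    route: Route
    tools: list = field(default_factory=list)
    high_risk: bool = False  # ownership, runbook steps, incident status


def route_request(text: str) -> Decision:
    """Keyword stand-in for the routing model (an assumption of this sketch)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & {"incident", "incidents"}:
        return Decision(Route.TOOL, ["pagerduty_read_incidents"], high_risk=True)
    if words & {"ci", "build", "pipeline"}:
        return Decision(Route.TOOL, ["ci_read_build_status"])
    if words & {"owner", "owns", "ownership"}:
        return Decision(Route.TOOL, ["service_catalog_lookup"], high_risk=True)
    if words & {"runbook", "runbooks", "postmortem", "rfc"}:
        return Decision(Route.RETRIEVE, ["internal_search"], high_risk=True)
    return Decision(Route.DIRECT)


def check_tool_allowed(name: str) -> None:
    """Reject any tool outside the allow-list, whatever the model asked for."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not permitted: {name}")


if __name__ == "__main__":
    decision = route_request("Who owns the payments service?")
    for tool in decision.tools:
        check_tool_allowed(tool)
    print(decision.route.value, decision.tools, decision.high_risk)
```

Keeping the allow-list check outside the model means that an injected instruction in a log or ticket cannot widen the set of callable tools, and the `high_risk` flag gives downstream code a hook for stricter grounding checks on the categories named in the hallucination constraint.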

