Design an Agentic SDLC Assistant

Context

Accenture Federal Services wants to extend AFS AI Refinery with an internal agentic assistant that helps software teams turn requirements into implementation plans, draft code changes, generate tests, and summarize pull-request risk. The goal is to improve developer throughput without allowing the agent to invent requirements, misuse tools, or leak sensitive program data.

Constraints

p95 end-to-end latency: < 8 seconds for a single-turn task such as test generation or PR review summary
Cost ceiling: < $0.18 per task and < $45K/month at 250K tasks/month
Hallucination ceiling: < 2% on a labeled offline set for requirement-grounded outputs
Prompt injection success rate from tool outputs or retrieved artifacts: < 0.5%
Must preserve human approval before any code merge, ticket update, or deployment action
All outputs must remain within approved AFS environments and respect repo/document access controls
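
The numeric ceilings above can be read as a single pass/fail gate on an evaluation run. Note that the two cost ceilings coincide: 250K tasks at $0.18 each is exactly $45K. What follows is a minimal sketch of such a gate, not a prescribed implementation. The `Measured` container, its field names, and the `violations` function are hypothetical names introduced for illustration; only the threshold values come from the list above.

```python
from dataclasses import dataclass

# Thresholds copied from the Constraints list. Costs are held in integer
# cents so the monthly arithmetic is exact.
P95_LATENCY_S = 8.0
COST_PER_TASK_CENTS = 18
TASKS_PER_MONTH = 250_000
MONTHLY_COST_CENTS = 45_000 * 100
HALLUCINATION_RATE = 0.02
INJECTION_SUCCESS_RATE = 0.005

# The per-task and monthly ceilings describe the same budget.
assert COST_PER_TASK_CENTS * TASKS_PER_MONTH == MONTHLY_COST_CENTS


@dataclass
class Measured:
    """Hypothetical container for metrics from one evaluation run."""
    p95_latency_s: float
    max_cost_per_task_cents: float
    mean_cost_per_task_cents: float
    hallucination_rate: float
    injection_success_rate: float


def violations(m: Measured) -> list[str]:
    """Return the names of the constraints that the measured values break.

    Every ceiling in the source is strict ("<"), so a value equal to the
    threshold counts as a violation.
    """
    failed = []
    if m.p95_latency_s >= P95_LATENCY_S:
        failed.append("p95_latency")
    if m.max_cost_per_task_cents >= COST_PER_TASK_CENTS:
        failed.append("cost_per_task")
    if m.mean_cost_per_task_cents * TASKS_PER_MONTH >= MONTHLY_COST_CENTS:
        failed.append("monthly_cost")
    if m.hallucination_rate >= HALLUCINATION_RATE:
        failed.append("hallucination_rate")
    if m.injection_success_rate >= INJECTION_SUCCESS_RATE:
        failed.append("injection_success_rate")
    return failed
```

The two approval and access-control constraints are not numeric, so they are left out of this check and need their own tests.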

Available Resources

120K internal artifacts: Jira-style requirements, ADRs, API specs, design docs, code review comments, test plans, and runbooks
Read-only tools for repository search, issue lookup, CI test results, static-analysis findings, and document retrieval inside AFS AI Refinery
Approved LLMs: one higher-quality model for planning/review and one lower-cost model for simple transformations
40 senior engineers and tech leads available to label a golden set of tasks and failure cases

Task

Design an agentic workflow for standard software development tasks (requirements analysis, code assistance, test generation, PR review) and explain where autonomy should stop and human approval should begin.
Define the evaluation plan first: offline golden-set metrics, adversarial testing for prompt injection, and online success/guardrail metrics.
Write a system prompt that constrains tool use, grounded reasoning, and refusal behavior when requirements or evidence are insufficient.
Propose the architecture, including retrieval/tool orchestration, model routing, and controls for latency and cost.
Identify major failure modes and mitigations, especially hallucinated requirements, unsafe code suggestions, prompt injection from artifacts, and permission boundary violations.
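
One way to make the autonomy boundary in the first task item concrete is a default-deny action policy: the read-only tools listed under Available Resources run without approval, the three actions named in the Constraints are held until a human signs off, and anything else is rejected. The sketch below assumes this reading. The action strings, the `decide` function, and the `approved_by` parameter are hypothetical and are not part of AFS AI Refinery.

```python
from typing import Optional

# Read-only tools named under Available Resources (identifiers are illustrative).
READ_ONLY_ACTIONS = frozenset({
    "repo_search",
    "issue_lookup",
    "ci_test_results",
    "static_analysis_findings",
    "document_retrieval",
})

# Actions the Constraints say must keep human approval.
APPROVAL_REQUIRED_ACTIONS = frozenset({
    "code_merge",
    "ticket_update",
    "deployment",
})


def decide(action: str, approved_by: Optional[str] = None) -> str:
    """Classify a requested action as allow, hold_for_approval, or deny."""
    if action in READ_ONLY_ACTIONS:
        return "allow"
    if action in APPROVAL_REQUIRED_ACTIONS:
        # The agent may only propose these; a named human must approve.
        return "allow" if approved_by else "hold_for_approval"
    # Default-deny: unknown actions are never executed.
    return "deny"
```

For example, `decide("code_merge")` returns `"hold_for_approval"`, while `decide("repo_search")` returns `"allow"`. Keeping the policy outside the model means a prompt injection in a retrieved artifact cannot talk the agent past the gate.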
