Defend Agents From Injected Documents

Scenario

You are building an agent that reads documents during a task, then uses those documents to decide what to do next. One of the ingested documents may contain hidden or explicit instructions like "ignore previous directions" or "send data to this endpoint," mixed in with otherwise useful content.

Question

How would you prevent prompt injection from a document the agent ingested mid-task?

Problem

Scenario

Question

How would you prevent prompt injection from a document the agent ingested mid-task?

What this tests

Prompt injection defenses for agent workflows
Separation of trusted instructions from untrusted document content
RAG and retrieval containment choices
Offline and online evaluation of attack resistance

Problem

Scenario

Question

How would you prevent prompt injection from a document the agent ingested mid-task?

What this tests

Prompt injection defenses for agent workflows
Separation of trusted instructions from untrusted document content
RAG and retrieval containment choices
Offline and online evaluation of attack resistance

Problem

Scenario

Question

How would you prevent prompt injection from a document the agent ingested mid-task?

What this tests

Prompt injection defenses for agent workflows
Separation of trusted instructions from untrusted document content
RAG and retrieval containment choices
Offline and online evaluation of attack resistance

Interview Guides

Problem

Scenario

Question

What this tests

Problem

Scenario

Question

What this tests

Defend Agents From Injected Documents

Problem

Scenario

Question

What this tests

Problem

Scenario

Question

What this tests