Agent Incident Review

A structured process for reviewing failed AI agent runs and turning traces into controls, fixtures, owners, follow-up tests, and safer workflows.

Agent incidents should be reviewed from evidence, not from the final answer alone. The useful record includes the input, plan, retrieved sources, tool calls, approvals, errors, retries, and final output. Without that trace, the team can only guess whether the agent misunderstood, used the wrong source, exceeded permission, or hid uncertainty.
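
As a minimal sketch of what that record could look like, the structure below mirrors the trace parts listed above. The class names, field names, and the `missing_evidence` helper are hypothetical and not taken from any particular tracing library.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    name: str
    arguments: dict[str, Any]
    approved_by: Optional[str] = None  # None means no human approval was recorded
    error: Optional[str] = None
    retries: int = 0


@dataclass
class RunTrace:
    run_id: str
    user_input: str
    plan: Optional[str] = None
    retrieved_sources: list[str] = field(default_factory=list)
    tool_calls: list[ToolCall] = field(default_factory=list)
    final_output: Optional[str] = None

    def missing_evidence(self) -> list[str]:
        """Names of trace parts a reviewer should check because nothing was recorded.

        An empty list of sources or tool calls is flagged too, since it can mean
        either "nothing happened" or "nothing was logged".
        """
        gaps = []
        if self.plan is None:
            gaps.append("plan")
        if not self.retrieved_sources:
            gaps.append("retrieved_sources")
        if not self.tool_calls:
            gaps.append("tool_calls")
        if self.final_output is None:
            gaps.append("final_output")
        return gaps
```

Calling `missing_evidence()` before the review starts shows which parts of the run the team would otherwise have to guess at.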

The goal of an incident review is a control change. A better prompt may be part of the fix, but it is rarely enough. Good reviews turn the failure into a fixture, update a permission or validation rule, and add an owner for follow-up. The process below pairs with the agent observability guide and the AI agent failure modes checklist.

The problem with informal reviews

Informal reviews usually stop at “the model got it wrong.” That is not actionable. The model may have followed bad instructions, used stale retrieval, accepted malicious source text, called the right tool with bad arguments, skipped a no-answer path, or performed an action it should not have been allowed to perform.

If the review does not identify the failed control, the same class of incident will return under a different prompt. Agent systems need post-incident learning that looks more like software reliability than prompt tuning.

Start with a clear incident record

Create one incident document per failed run or cluster of related runs. Record the timestamp, workflow, owner, trigger, user impact, affected data or system, current status, and whether any production action needs rollback. Link the trace, artifacts, user report, and any human edits made after the run.
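
One way to keep those fields from being skipped is to give the incident document a fixed shape. This is a sketch only: `IncidentRecord`, its field names, and `blank_fields` are illustrative assumptions, and a real template may carry more detail.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone


@dataclass
class IncidentRecord:
    timestamp: datetime
    workflow: str
    owner: str
    trigger: str
    user_impact: str
    affected_system: str
    status: str = "open"
    needs_rollback: bool = False
    # Links to the trace, artifacts, user report, and post-run human edits.
    links: dict[str, str] = field(default_factory=dict)

    def blank_fields(self) -> list[str]:
        """Return the names of text fields that were left empty."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]


record = IncidentRecord(
    timestamp=datetime.now(timezone.utc),
    workflow="refund-email",          # hypothetical workflow name
    owner="",                         # not yet assigned
    trigger="user report",
    user_impact="wrong refund amount quoted",
    affected_system="billing",
)
print(record.blank_fields())
```

Here the record would be flagged for its empty `owner` field before it is filed.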

Then classify severity. A harmless wrong draft is different from a customer-visible email, a privacy exposure, a destructive tool call, or an unsupported recommendation used by a decision-maker. Severity should be based on impact and recoverability, not on how surprising the output looked.
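
The rule that severity follows impact and recoverability can be made explicit in code. The levels and thresholds below are assumptions chosen for illustration, not a standard scale; the point is that the function takes no input describing how surprising the output looked.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def classify_severity(
    *,
    externally_visible: bool,
    sensitive_data_exposed: bool,
    action_reversible: bool,
) -> Severity:
    """Rate an incident by impact and recoverability only (hypothetical thresholds)."""
    if sensitive_data_exposed:
        return Severity.CRITICAL
    if externally_visible and not action_reversible:
        return Severity.HIGH
    if externally_visible or not action_reversible:
        return Severity.MEDIUM
    return Severity.LOW
```

Under these assumed thresholds, a wrong internal draft that was never sent rates `LOW`, while a customer-visible email that cannot be recalled rates `HIGH`.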

Reconstruct the run

Read the trace in sequence. What did the user ask? How did the agent normalize the task? Which sources were retrieved? Were they current, relevant, and allowed? Which tools were called? Were arguments validated? Did the agent retry? Did it encounter errors? Did a human approve the action? Did the final answer cite evidence?

Separate observed facts from guesses. If the logs do not show why the agent chose a tool, write “not observable” instead of inventing a reason. That gap is itself a finding.
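
A reconstructed timeline can enforce that separation by keeping the observed event and the reason in different fields. The sketch below assumes a hypothetical `TimelineEntry` type; the reason is filled only when the trace actually records it.

```python
from dataclasses import dataclass
from typing import Optional

NOT_OBSERVABLE = "not observable"


@dataclass
class TimelineEntry:
    step: int
    what_happened: str          # observed fact, taken from the trace
    why: Optional[str] = None   # set only when the trace records the reason

    def reason(self) -> str:
        """Return the logged reason, or an explicit marker instead of a guess."""
        return self.why if self.why else NOT_OBSERVABLE


def observability_gaps(timeline: list[TimelineEntry]) -> list[int]:
    """Steps whose reason could not be read from the logs; each one is a finding."""
    return [entry.step for entry in timeline if entry.reason() == NOT_OBSERVABLE]


timeline = [
    TimelineEntry(1, "retrieved pricing document", why="top-ranked search hit was logged"),
    TimelineEntry(2, "called send_email"),  # the trace does not say why
]
print(observability_gaps(timeline))
```

Step 2 is reported as a gap, which turns the missing explanation into a concrete item for the observability fix list.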

Identify the failed control

Most incidents map to one or more control failures. Retrieval control failed when the agent used irrelevant, stale, or incomplete sources. Permission control failed when the agent could perform an action that should have required approval. Validation control failed when a malformed or overly broad tool call executed. No-answer control failed when the system guessed instead of escalating. Observability control failed when reviewers could not reconstruct the run.

Use the agent permission design guide to decide whether the action class was too broad. Use the RAG no-answer testing approach when the incident involved missing evidence.
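
The five control failures named in this section can be recorded as a fixed vocabulary so that incidents are comparable across reviews. The enum and the boolean finding names below are illustrative assumptions; a team would substitute its own review questions.

```python
from enum import Enum


class Control(Enum):
    RETRIEVAL = "retrieval"
    PERMISSION = "permission"
    VALIDATION = "validation"
    NO_ANSWER = "no_answer"
    OBSERVABILITY = "observability"


def failed_controls(
    *,
    sources_stale_or_irrelevant: bool = False,
    acted_without_required_approval: bool = False,
    malformed_or_broad_call_executed: bool = False,
    guessed_instead_of_escalating: bool = False,
    run_not_reconstructable: bool = False,
) -> set[Control]:
    """Map review findings to the controls that failed; more than one may apply."""
    checks = {
        Control.RETRIEVAL: sources_stale_or_irrelevant,
        Control.PERMISSION: acted_without_required_approval,
        Control.VALIDATION: malformed_or_broad_call_executed,
        Control.NO_ANSWER: guessed_instead_of_escalating,
        Control.OBSERVABILITY: run_not_reconstructable,
    }
    return {control for control, failed in checks.items() if failed}
```

An empty result is a signal that the review is not finished, because no failed control has been identified yet.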

Write the corrective action

Every incident needs at least one concrete corrective action. Examples include adding a fixture, blocking a tool argument pattern, narrowing retrieval sources, requiring approval for a permission class, adding a no-answer rule, improving citations, adding trace fields, or changing the final response schema.
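
As one example of these actions, blocking a tool argument pattern might look like the sketch below. The tool name `delete_records` and the two patterns are hypothetical; a real deny-list would come from the arguments seen in the incident trace.

```python
import re

# Hypothetical deny-list: argument patterns too broad to run without review.
BLOCKED_ARGUMENT_PATTERNS = {
    "delete_records": [
        re.compile(r"^\s*\*\s*$"),                             # bare wildcard
        re.compile(r"\bwhere\s+1\s*=\s*1\b", re.IGNORECASE),   # always-true filter
    ],
}


def validate_tool_call(tool: str, argument: str) -> tuple[bool, str]:
    """Return (allowed, reason). Tools without an entry are allowed unchanged."""
    for pattern in BLOCKED_ARGUMENT_PATTERNS.get(tool, []):
        if pattern.search(argument):
            return False, f"blocked: argument matches {pattern.pattern!r}"
    return True, "ok"
```

A check like this runs before the tool executes, so a rejected call can be routed to approval or escalation instead of failing silently.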

Assign an owner and due date. If the fix cannot be validated, it is not ready. The regression fixture should reproduce the original failure or a smaller version of it. Add the fixture to the same evaluation surface used for launch testing, such as the LLM evaluation framework.
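
A regression fixture can be as small as the original input plus the behavior that must not recur. The sketch below assumes the agent can be wrapped as a function that returns the names of the tools it called; `Fixture`, `run_fixture`, and that agent interface are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fixture:
    incident_id: str
    user_input: str
    must_not_call: frozenset[str]  # tools the agent must not invoke for this input
    owner: str
    due: str                       # ISO date by which the fix is verified


def run_fixture(fixture: Fixture, agent: Callable[[str], list[str]]) -> bool:
    """Pass only if the agent called none of the forbidden tools.

    `agent` is assumed to return the list of tool names it invoked.
    """
    called = set(agent(fixture.user_input))
    return called.isdisjoint(fixture.must_not_call)
```

The fixture should fail against the pre-fix agent and pass against the fixed one. If it passes in both cases, it does not reproduce the incident and the fix has not been validated.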

Verification checklist

An incident review is complete only when the team can answer six questions. What happened? Who or what was affected? Which control failed? What changed? How will the same class of failure be detected? Who will verify the fix?
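
The six questions can double as a completeness gate on the incident document. This is a minimal sketch in which the answers are kept in a plain dictionary keyed by question text; that storage choice is an assumption.

```python
REVIEW_QUESTIONS = (
    "What happened?",
    "Who or what was affected?",
    "Which control failed?",
    "What changed?",
    "How will the same class of failure be detected?",
    "Who will verify the fix?",
)


def unanswered(answers: dict[str, str]) -> list[str]:
    """Questions with no answer or only whitespace, in checklist order."""
    return [q for q in REVIEW_QUESTIONS if not answers.get(q, "").strip()]


def review_complete(answers: dict[str, str]) -> bool:
    """True only when every one of the six questions has an answer."""
    return not unanswered(answers)
```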

After the fix, rerun the incident fixture and a few neighboring fixtures. Neighboring tests matter because a narrow prompt patch can fix one example while breaking related behavior.
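
Comparing fixture results from before and after the fix makes that kind of breakage visible. The helper below is a sketch that assumes results are stored as a mapping from fixture name to pass or fail; the fixture names in the usage are invented.

```python
def regressions(before: dict[str, bool], after: dict[str, bool]) -> list[str]:
    """Fixtures that passed before the fix but fail, or are missing, afterwards."""
    return sorted(
        name
        for name, passed in before.items()
        if passed and not after.get(name, False)
    )


before = {"incident": False, "neighbor_refund": True, "neighbor_cancel": True}
after = {"incident": True, "neighbor_refund": False, "neighbor_cancel": True}
print(regressions(before, after))
```

In this invented run the incident fixture now passes, but `neighbor_refund` is reported as a regression, so the fix is not ready to ship.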

Frequently asked questions

What should an AI agent incident review produce?

An AI agent incident review should produce a failure class, root cause, changed control, regression fixture, owner, and follow-up review date.

Should incident reviews blame the model?

No. The review should identify the system control that failed, such as retrieval, permissions, validation, logging, approval, or escalation.

Next step

Use the incident template when an agent run fails in review or production. The review is successful when it improves the system’s controls and gives the team a repeatable test, not when it produces a longer explanation of the mistake.

Reusable resource: Download incident template

Related content

AI Agent Failure Modes

Agent Observability Guide

Agent Permission Design

Agent Reliability Scorecard