AEGIS · AEGIS Shield
Artifact 2.3
AI Incident Response Runbook
The seven-phase playbook the organization runs the moment an AI incident is detected. Written to be usable at 2 a.m. — specific, named, and testable.
- Client: [CLIENT NAME]
- Engagement: [ENGAGEMENT ID]
- Version: v1.0
- Issued: 2026-05-18
Delivered by TechFides under the AEGIS Governance Operating Services engagement. This document is proprietary to the client named above. Redistribution beyond the engagement steering committee requires written consent.
Purpose
Intent — Make the incident response path short, obvious, and rehearsed. The cost of a slow response to an AI incident is almost always larger than the cost of the incident itself.
AI incidents are a new species with an old shape. Like any incident, they require detection, containment, notification, and learning, but they add specifics: prompt and output evidence, model version context, vendor subprocessor trails, and customer-facing output that may have already reached a user. This runbook adapts standard incident response (IR) discipline to those specifics and reduces the response to seven named phases with named owners.
Severity Ladder
Intent — Four levels, triggers stated concretely, notification windows explicit.
SEV-1
Critical
Triggers
- Regulated data (PHI/PII/payment) confirmed exposed outside approved boundary.
- Customer-facing AI produces defamatory, harmful, or regulated-advice output in production.
- Trade-secret or proprietary algorithm confirmed sent to a non-approved AI tool.
Immediate response
Declare immediately. Executive Sponsor and General Counsel paged. Incident Commander stands up the incident room.
Notify
Executive Sponsor + General Counsel + CISO + AI Governance Lead + affected function lead — within 30 minutes of detection.
Communication SLA
Initial written update to Sponsor within 2 hours; hourly updates thereafter until containment.
SEV-2
Major
Triggers
- Confidential (P1) data sent to an approved AI tool that has a configuration error (for example, retention or tenant settings).
- Model version change in a production workflow causing a material output regression.
- Vendor subprocessor change that materially alters data residency without notification.
Immediate response
Incident Commander assigned. Parallel tracks for technical remediation and stakeholder communications.
Notify
CISO + AI Governance Lead + function lead within 1 hour; Executive Sponsor informed within 4 hours.
Communication SLA
Written update to stakeholders every 4 hours until resolution.
SEV-3
Moderate
Triggers
- Internal (P2) data sent to a consumer AI tool by an employee.
- Hallucination or error in AI-assisted deliverable caught pre-delivery but with downstream rework required.
- Governed workflow kill switch triggered but no customer impact.
Immediate response
On-call responder investigates, documents, and remediates within a named working window.
Notify
AI Governance Lead within 4 business hours; function lead same day.
Communication SLA
After-action note within 5 business days.
SEV-4
Near-miss / Control Signal
Triggers
- Shadow AI usage discovered and self-reported before data movement.
- Policy ambiguity prevented an employee from knowing whether a use was permitted.
- Tool monitoring surfaces unusual prompt patterns that are investigated and found benign.
Immediate response
Log, review, and feed into the quarterly Council review.
Notify
AI Governance Lead, through the weekly digest.
Communication SLA
Addressed in the next Council meeting.
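
The wall-clock notification windows in the ladder can be turned into concrete deadlines. The following is a minimal Python sketch, not a prescribed implementation: the names `NotifyRule`, `LADDER`, and `notification_deadlines` are hypothetical, and it assumes every window starts at detection time (the ladder states this explicitly only for SEV-1).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class NotifyRule:
    recipients: tuple   # roles to notify
    window: timedelta   # wall-clock window, assumed to start at detection


# Windows copied from the ladder. SEV-3 (business hours) and SEV-4 (weekly
# digest) are not wall-clock windows, so this sketch leaves them out.
LADDER = {
    "SEV-1": (
        NotifyRule(
            ("Executive Sponsor", "General Counsel", "CISO",
             "AI Governance Lead", "Affected function lead"),
            timedelta(minutes=30),
        ),
    ),
    "SEV-2": (
        NotifyRule(("CISO", "AI Governance Lead", "Function lead"),
                   timedelta(hours=1)),
        NotifyRule(("Executive Sponsor",), timedelta(hours=4)),
    ),
}


def notification_deadlines(severity, detected_at):
    """Return (role, deadline) pairs for a severity, earliest deadline first."""
    pairs = [
        (role, detected_at + rule.window)
        for rule in LADDER[severity]
        for role in rule.recipients
    ]
    return sorted(pairs, key=lambda pair: pair[1])


# Example: a SEV-2 detected at 02:00 UTC.
detected = datetime(2026, 5, 18, 2, 0, tzinfo=timezone.utc)
for role, deadline in notification_deadlines("SEV-2", detected):
    print(f"{deadline:%H:%M} UTC  {role}")
```

Keeping the ladder as data rather than branching logic means a revised window is a one-line change that can be reviewed alongside the runbook version.
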
Incident Roles
Intent — Small team, clear seats. The role is held for the duration of the incident regardless of seniority.
- Incident Commander. Runs the incident end-to-end. Authorized to make all tactical calls during the incident, including pulling production workflows offline. Typically the Governance Lead or a CISO-designated deputy.
- Technical Lead. Owns the technical investigation, evidence preservation, and remediation execution. Reports into the Incident Commander.
- Communications Lead. Owns internal status updates, drafts external communications for General Counsel review, and manages vendor communications.
- General Counsel. Owns regulatory and contractual obligation analysis, external counsel engagement, and privilege assertions.
- Executive Sponsor. Owns go/no-go for external communication, board notification, and public disclosure decisions.
- Scribe. Maintains a contemporaneous incident timeline: time-stamped entries for every action, decision, and inbound signal. The scribe's log is the evidence of record.
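
The Scribe's timeline can be kept as an append-only log. The sketch below is one possible shape in Python, under assumptions: the class and field names are hypothetical, timestamps are recorded in UTC, and the three entry kinds mirror "action, decision, and inbound signal" from the role description.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: an entry cannot be edited after the fact
class TimelineEntry:
    at: datetime
    kind: str    # "action", "decision", or "signal"
    actor: str
    note: str


class IncidentTimeline:
    KINDS = {"action", "decision", "signal"}

    def __init__(self, incident_id):
        self.incident_id = incident_id
        self._entries = []

    def record(self, kind, actor, note, at=None):
        """Append one time-stamped entry; existing entries are never rewritten."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown entry kind: {kind!r}")
        entry = TimelineEntry(at or datetime.now(timezone.utc), kind, actor, note)
        self._entries.append(entry)
        return entry

    def entries(self):
        """Return a read-only snapshot of the log in recording order."""
        return tuple(self._entries)


# Example usage with a hypothetical incident ID.
log = IncidentTimeline("AI-INC-EXAMPLE")
log.record("signal", "On-call", "DLP alert received for an AI tool upload")
log.record("decision", "Incident Commander", "Declared SEV-2")
```

Corrections are made by recording a new entry rather than changing an old one, which keeps the log usable as evidence of record.
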
Seven-Phase Response
Intent — The operational sequence. Each phase has a named owner and the specific steps that must happen inside it.
Phase 1 · Detect
Owner — Every Employee + On-call + CISO monitoring
- Any employee can raise an incident through the AI Incident hotline or the #ai-incidents channel — retaliation for good-faith reports is prohibited and tracked by the Council.
- Automated detection from tool telemetry, DLP, or SaaS discovery feeds the CISO on-call queue.
- Every raised signal receives an acknowledgment within 30 minutes, 24/7, and a severity rating within 2 hours.
Phase 2 · Triage
Owner — Incident Commander (Governance Lead or CISO delegate)
- Incident Commander opens an incident ticket with a unique ID, severity, and initial summary.
- Affected data classifications identified and scope bounded: what data, what tools, which users, since when.
- Decide: is this a SEV-1? If so, page the full leadership chain now — do not wait for more evidence.
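
The triage steps above could be captured in a small ticket record. This is a minimal Python sketch under assumptions: the `AI-INC-` ID prefix, the class name, and the field names are hypothetical, and a real deployment would likely use the organization's existing ticketing system instead.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

SEVERITIES = {"SEV-1", "SEV-2", "SEV-3", "SEV-4"}


@dataclass
class IncidentTicket:
    severity: str
    summary: str
    data_classes: list   # what data
    tools: list          # what tools
    users: list          # which users
    since: datetime      # since when
    # Hypothetical ID scheme: fixed prefix plus 8 random hex characters.
    incident_id: str = field(
        default_factory=lambda: f"AI-INC-{uuid.uuid4().hex[:8]}"
    )
    opened_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    @property
    def page_leadership_now(self):
        """SEV-1 pages the full leadership chain without waiting for evidence."""
        return self.severity == "SEV-1"
```

Making the four scope questions required fields means a ticket cannot be opened without at least a first answer to each of them.
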
Phase 3 · Contain
Owner — CISO + affected function lead
- Revoke tool access for the involved accounts if data movement is suspected in progress.
- Disable the offending integration, trigger the workflow's kill switch, or pin the model version, as applicable.
- Preserve evidence: prompts, outputs, logs, audit trails. Do not 'clean up' the scene.
- Notify affected vendors if their infrastructure is implicated; request subprocessor cooperation.
Phase 4 · Investigate
Owner — Incident Commander + CISO + General Counsel
- Reconstruct the timeline: first occurrence, detection time, what the condition produced.
- Identify the root condition, not just the proximate trigger. Classify as policy gap, control failure, vendor failure, or human error.
- Assess legal exposure: regulatory notification obligations, client contractual notifications, insurance triggers.
- Draft the internal incident narrative and the external communication plan in parallel.
Phase 5 · Notify
Owner — General Counsel + Executive Sponsor
- Regulatory notifications in the required window (see §5 matrix). If uncertain, escalate to outside counsel.
- Customer / client notifications per contract. Standard practice: notify within 72 hours of confirmed material impact.
- Internal notification to affected employees, with guidance on what they can and cannot say publicly.
- Board notification if incident meets materiality threshold (see §6).
Phase 6 · Remediate
Owner — CISO + function lead + Governance Lead
- Close the immediate vulnerability: configuration fix, access revocation, policy clarification, tool removal.
- Verify closure with an independent check — the person who remediated cannot be the person who verifies.
- Track remediation tasks in the engagement's system of record; do not rely on chat-thread follow-through.
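
The independent-check rule above can be enforced in whatever system tracks remediation tasks. The following Python sketch shows the idea under assumptions: `RemediationTask` and its fields are hypothetical, and people are compared by a normalized name string, where a real system would compare account identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemediationTask:
    description: str
    remediated_by: Optional[str] = None
    verified_by: Optional[str] = None

    def remediate(self, person):
        self.remediated_by = person

    def verify(self, person):
        """Record verification only if a different person did the fix."""
        if self.remediated_by is None:
            raise ValueError("task has not been remediated yet")
        if person.strip().lower() == self.remediated_by.strip().lower():
            raise PermissionError("verifier must differ from remediator")
        self.verified_by = person

    @property
    def closed(self):
        # A task counts as closed only after independent verification.
        return self.verified_by is not None
```

A task that has been fixed but not yet verified stays open, so follow-through shows up in the system of record rather than in a chat thread.
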
Phase 7 · After-Action
Owner — Governance Lead + Incident Commander
- Within 5 business days for SEV-1/SEV-2, 10 for SEV-3: deliver a written after-action report (template in §7).
- Identify the systemic change required: policy update, control redesign, training gap, vendor change.
- File at least one durable action item with a named owner and a due date.
- Present the after-action at the next Council meeting; reference it in the quarterly board pack.
Notification Matrix
Intent — Who must be told, by when, under which regime. When the runbook is activated, the Commander checks this matrix row by row.
| Trigger | Regime | Window | Owner |
|---|---|---|---|
| Personal data breach affecting EU data subjects | GDPR Art. 33 | 72 hours to supervisory authority | General Counsel + DPO |
| PHI breach affecting ≥500 individuals | HIPAA / HITECH | Without unreasonable delay, no later than 60 calendar days from discovery, to HHS + individuals | General Counsel + Privacy Officer |
| California residents PII exposed | CCPA / CPRA | Without unreasonable delay — typically 30–45 days | General Counsel |
| Client-contracted notification | Contractual | Per MSA — typically 24–72 hours | Engagement lead + General Counsel |
| Material incident — public company | SEC (if applicable) | 4 business days from materiality determination | Executive Sponsor + Counsel + Investor Relations |
| Board notification — materiality met | Governance | Same day as materiality determination | Executive Sponsor |
| Vendor cooperation request | Vendor contract | Per contract — typically 24 hours | CISO |
| Insurance notification | Cyber / E&O policies | Per policy — typically 24–72 hours | General Counsel + Finance |
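
For the rows with a fixed window, the matrix can be turned into dated deadlines. The Python sketch below is illustrative only: the function and dictionary names are hypothetical, each regime's clock start (awareness, discovery, or materiality determination) is supplied by the caller, and business days are counted Monday to Friday with holidays ignored. Contract and policy rows vary by agreement, so they are left out.

```python
from datetime import datetime, timedelta, timezone


def add_business_days(start, days):
    """Advance by whole business days (Mon-Fri); holidays are ignored."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days -= 1
    return current


# Only the matrix rows that state a fixed window.
FIXED_WINDOWS = {
    "GDPR Art. 33": lambda start: start + timedelta(hours=72),
    "HIPAA / HITECH": lambda start: start + timedelta(days=60),
    "SEC": lambda start: add_business_days(start, 4),
}


def regulatory_deadlines(clock_starts):
    """Map {regime: clock start} to [(regime, deadline)], earliest first."""
    deadlines = [
        (regime, FIXED_WINDOWS[regime](start))
        for regime, start in clock_starts.items()
    ]
    return sorted(deadlines, key=lambda item: item[1])


# Example: all three clocks start on the same Monday morning.
start = datetime(2026, 5, 18, 9, 0, tzinfo=timezone.utc)
for regime, deadline in regulatory_deadlines(
    {"GDPR Art. 33": start, "HIPAA / HITECH": start, "SEC": start}
):
    print(f"{deadline:%Y-%m-%d %H:%M} UTC  {regime}")
```

Sorting by deadline gives the Incident Commander the order in which the rows come due, which is often different from the order in which they appear in the matrix.
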
Materiality
Intent — The materiality definition the Executive Sponsor uses to trigger board notification. Stated up front so there is no argument during an incident.
An AI incident is material and requires board notification when any one of the following is true:
- Confirmed exposure of regulated data (PHI / PII / payment / privileged) affecting [N] or more individuals or records.
- Financial impact (remediation, fines, legal) projected at [$ THRESHOLD] or greater.
- Regulatory inquiry, investigation, or formal notice to a regulator.
- Customer-facing AI output that causes harm, discrimination, or demonstrable financial injury to a named party.
- Public disclosure required under any applicable law, contract, or market obligation.
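
Because any one criterion is enough, the materiality test reduces to a list of independent checks. The following Python sketch is a hedged illustration: the class and field names are hypothetical, and the two thresholds deliberately have no defaults because the runbook leaves [N] and [$ THRESHOLD] for the client to set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialityThresholds:
    min_affected: int             # the runbook's [N]; no default on purpose
    min_financial_impact: float   # the runbook's [$ THRESHOLD]; no default


@dataclass(frozen=True)
class IncidentFacts:
    regulated_records_exposed: int = 0
    projected_financial_impact: float = 0.0
    regulator_involved: bool = False
    harmful_customer_output: bool = False
    disclosure_required: bool = False


def materiality_reasons(facts, thresholds):
    """Return every criterion that is met; any non-empty result is material."""
    checks = [
        (facts.regulated_records_exposed >= thresholds.min_affected,
         "regulated data exposure at or above threshold"),
        (facts.projected_financial_impact >= thresholds.min_financial_impact,
         "projected financial impact at or above threshold"),
        (facts.regulator_involved,
         "regulatory inquiry, investigation, or formal notice"),
        (facts.harmful_customer_output,
         "customer-facing output caused harm to a named party"),
        (facts.disclosure_required,
         "public disclosure required"),
    ]
    return [reason for met, reason in checks if met]
```

Returning the list of reasons rather than a single boolean gives the Executive Sponsor the wording for the board notification along with the decision.
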
After-Action Report Template
Intent — The structure every after-action follows. Filed in the engagement's system of record and reviewed at the next Council.
- Incident ID & Severity. Unique ID, severity ladder entry, and effective period (start → containment → closure).
- Summary. Three sentences: what happened, what data / systems / people were involved, how it was detected.
- Timeline. Time-stamped events from first occurrence to closure, drawn from the scribe's log.
- Root condition. Not just the proximate trigger. What in the system made this possible? Classify as policy gap, control failure, vendor failure, or human error.
- Impact assessment. Data scope, customer impact, regulatory implications, financial exposure (range).
- Response assessment. What went well. What went slowly. What would have failed under a larger incident.
- Durable changes. Policy updates, control changes, training requirements, vendor renegotiations — each with a named owner and due date.
- Notifications made. Regulatory, contractual, insurance, board. Date / method / recipient / acknowledgment.
- Risk register updates. Risks added, risks re-scored, risks closed.
- Sign-off. Incident Commander, CISO, General Counsel, Executive Sponsor.
Exercising the Runbook
Intent — The runbook must be rehearsed. Unexercised runbooks are theater.
- Within 30 days of adoption — tabletop exercise on the highest-likelihood SEV-1 scenario. Attended by every named role. Scribe produces an after-action against this runbook as if it were a real incident.
- Semi-annually — tabletop on a rotating scenario (shadow AI exposure, model version regression, vendor subprocessor change, customer-facing output failure).
- Annually — a live simulation involving technical response, not just a paper exercise. Include a vendor notification dry-run.
- After every SEV-1 or SEV-2 — explicit review of whether this runbook held up. Revisions are tracked as version updates; the prior version is retained for audit.