The AI Incident Response Playbook: What Happens in the First 72 Hours After AI Goes Wrong

Brandon Sneider | March 2026


Executive Summary

  • Documented AI safety incidents surged 56.4% in a single year — from 149 in 2023 to 233 in 2024 (Stanford AI Index, April 2025) — and the companies that handled them well shared one trait: they had a playbook before the incident happened.
  • AI incidents differ from traditional IT failures because they can cause harm through interaction rather than system penetration, the outputs are probabilistic rather than deterministic, and the regulatory notification obligations span multiple state and federal frameworks with conflicting timelines.
  • Three frameworks now provide the operational foundation: NIST SP 800-61r3 adapted for AI, the CoSAI AI Incident Response Framework v1.0 (November 2025), and the OWASP GenAI Incident Response Guide 1.0 (July 2025). None were designed for a 200-500 person company without a dedicated incident response team.
  • The regulatory clock starts ticking immediately: the EU AI Act requires notification within 2-15 days depending on severity; Colorado’s AI Act requires 90-day attorney general notification for algorithmic discrimination; Texas TRAIGA gives 60 days to cure violations before penalties accrue.
  • A mid-market company can build a functional AI incident response capability in 2-3 weeks for $5K-$15K, primarily through adapting existing cyber incident response plans and running one AI-specific tabletop exercise.

Why AI Incidents Are Different — and Why Your Cyber Playbook Is Not Enough

Traditional incident response assumes a breach: someone got in who should not have. AI incidents break this model in three ways.

The system can cause harm while functioning exactly as designed. Air Canada’s chatbot fabricated a bereavement fare policy in February 2024. The system was not hacked, not misconfigured, not malfunctioning. It was doing precisely what large language models do — generating plausible-sounding text. The British Columbia Civil Resolution Tribunal ruled Air Canada fully liable, establishing the precedent that companies cannot disclaim responsibility for their AI’s statements (Moffatt v. Air Canada, CRT, February 2024). The total cost was modest — CAD $812 in damages — but the legal precedent was not.

The blast radius is reputational before it is technical. When Apple disabled its AI-generated news feature in January 2025 after it fabricated a story claiming Rafael Nadal had come out as gay, the damage was not a data breach. It was a trust breach. McDonald’s AI hiring platform McHire exposed 64 million job application records through default credentials (“123456/123456”) with no multi-factor authentication — a traditional security failure amplified by the AI system’s scale (ISACA, December 2025).

The failure modes are novel. Deloitte submitted a 237-page AI-assisted report to the Australian government in July 2025 containing more than 20 fabricated sources — fictional academic papers, a fabricated court case, a made-up judicial quote. The firm quietly published a corrected version in September 2025 with a new disclosure that Azure OpenAI GPT-4o had been used. The cost: AU$440,000 in direct exposure and incalculable reputational damage (Medium analysis, January 2026).

The Stanford AI Index documents the acceleration: documented AI safety incidents rose from 149 to 233 between 2023 and 2024 — a 56.4% increase. ISACA’s 2025 post-mortem concludes that “the biggest AI failures of 2025 weren’t technical — they were organizational: weak controls, unclear ownership and misplaced trust.”

The Five Categories of AI Incidents

Not every AI failure requires the same response. The IAPP’s framework (January 2026) identifies five distinct incident categories, each with different containment and notification requirements:

Category | Example | Severity Driver | Notification Trigger
Hallucination / Incorrect Output | Fabricated legal citations, wrong customer advice | Customer reliance on output | Client notification within 24-72 hours
Algorithmic Discrimination | Biased hiring, lending, or insurance decisions | Protected class impact | State AG notification (CO: 90 days, TX: 60-day cure)
Data Exposure | AI system reveals training data or PII | Privacy law obligations | Standard breach notification timelines (varies by state)
Adversarial Attack | Prompt injection, model extraction, jailbreak | Security compromise | CISA reporting if critical infrastructure; standard IR
Operational Failure | Model drift, performance degradation | Business continuity | Internal escalation; client SLA triggers

The category determines the clock. A hallucination that reaches a customer requires a different response than an adversarial attack on model integrity. The incident commander’s first job is classification.
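The classification step can be sketched in code. The sketch below is illustrative, not a standard: the category names mirror the table above, and the triage questions and their precedence (discrimination before data exposure before attack) are assumptions about which obligations bind tightest, not part of the IAPP framework itself.

```python
from enum import Enum, auto


class IncidentCategory(Enum):
    """The five incident categories from the table above."""
    HALLUCINATION = auto()
    ALGORITHMIC_DISCRIMINATION = auto()
    DATA_EXPOSURE = auto()
    ADVERSARIAL_ATTACK = auto()
    OPERATIONAL_FAILURE = auto()


# Notification triggers per category, transcribed from the table above.
NOTIFICATION_TRIGGER = {
    IncidentCategory.HALLUCINATION: "Client notification within 24-72 hours",
    IncidentCategory.ALGORITHMIC_DISCRIMINATION: "State AG notification (CO: 90 days, TX: 60-day cure)",
    IncidentCategory.DATA_EXPOSURE: "Standard breach notification timelines (varies by state)",
    IncidentCategory.ADVERSARIAL_ATTACK: "CISA reporting if critical infrastructure; standard IR",
    IncidentCategory.OPERATIONAL_FAILURE: "Internal escalation; client SLA triggers",
}


def classify(customer_facing: bool, protected_class_impact: bool,
             pii_exposed: bool, malicious_input: bool) -> IncidentCategory:
    """First-pass triage: the category with the heaviest obligation wins."""
    if protected_class_impact:
        return IncidentCategory.ALGORITHMIC_DISCRIMINATION
    if pii_exposed:
        return IncidentCategory.DATA_EXPOSURE
    if malicious_input:
        return IncidentCategory.ADVERSARIAL_ATTACK
    if customer_facing:
        return IncidentCategory.HALLUCINATION
    return IncidentCategory.OPERATIONAL_FAILURE
```

A single incident can satisfy several of these tests at once; the precedence order ensures the incident commander starts from the strictest notification clock rather than the most convenient one.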

The 72-Hour Playbook for a 200-500 Person Company

This playbook assumes no dedicated incident response team, no CISO on staff full-time, and AI deployed across 2-4 business functions. It adapts the CoSAI framework (November 2025) and OWASP GenAI IR Guide (July 2025) for mid-market reality.

Hour 0-1: Detection and Classification

Who acts: The person who discovered the incident (employee, customer, vendor).

Immediate steps:

  1. Document what happened — screenshot, log extract, customer complaint. Preserve the AI output verbatim.
  2. Classify the incident using the five-category table above. This determines everything else.
  3. Notify the incident commander (at a 200-500 person company, this is typically the CIO, VP of Operations, or the person who owns the AI program).
  4. The incident commander convenes the response team within 60 minutes.

The response team at mid-market scale:

Role | Who Fills It | Why They Are There
Incident Commander | CIO or AI program owner | Coordinates response, owns timeline
Legal | GC or outside counsel | Regulatory notification, liability assessment
Technical Lead | IT director or vendor contact | System investigation, containment options
Communications | Head of Marketing or CEO | Customer notification, media response
Business Owner | Department head where AI is deployed | Impact assessment, workaround activation

At a 500-person company, this is probably five people, not five teams. That is fine. The structure matters more than the headcount.

Hour 1-4: Containment

The containment decision follows a severity escalation adapted from the GLACIS framework (2026):

Action | When to Use | Reversibility
Monitor only | Low-severity, no customer impact yet | Fully reversible
Rate limiting | Suspicious patterns, possible ongoing issue | Fully reversible
Shadow mode | Uncertain severity; run AI in parallel without production impact | Fully reversible
Feature flag disable | Confirmed issue in one AI feature | Reversible
Model rollback | Confirmed bad output from current model version | Reversible (if version control exists)
Full service shutdown | Customer safety risk, regulatory exposure, or reputational crisis | Reversible but operationally costly

The critical question: Can the AI produce more bad outputs while you investigate? If yes, contain first, investigate second. The Deloitte Australia incident worsened because the firm published the flawed report before discovering the fabricated sources. Containment prevents the scope from expanding.
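The escalation ladder and the critical question can be sketched together. This is a minimal sketch under assumptions: the two triage inputs and the specific rung chosen for each answer are illustrative defaults, not part of the GLACIS framework.

```python
from enum import IntEnum


class Containment(IntEnum):
    """Escalation ladder from the table above; higher value = more disruptive."""
    MONITOR_ONLY = 1
    RATE_LIMITING = 2
    SHADOW_MODE = 3
    FEATURE_FLAG_DISABLE = 4
    MODEL_ROLLBACK = 5
    FULL_SHUTDOWN = 6


# The first three rungs carry no production impact and are fully reversible.
FULLY_REVERSIBLE = {Containment.MONITOR_ONLY, Containment.RATE_LIMITING,
                    Containment.SHADOW_MODE}


def minimum_containment(can_produce_more_bad_output: bool,
                        customer_safety_risk: bool) -> Containment:
    """Answer the critical question first: if the AI can keep emitting
    bad output, contain before investigating."""
    if customer_safety_risk:
        return Containment.FULL_SHUTDOWN
    if can_produce_more_bad_output:
        return Containment.FEATURE_FLAG_DISABLE
    return Containment.MONITOR_ONLY
```

Using IntEnum makes the rungs comparable, so the team can record the current level and confirm it only ever escalates, never silently relaxes, during an active incident.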

Evidence preservation: Before changing anything, capture model version, input data, system logs, and the AI output that triggered the incident. This evidence is required for regulatory response and may be needed in litigation. The OWASP guide emphasizes that AI evidence is more perishable than traditional digital evidence — model state, context windows, and prompt histories can be overwritten.
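Because this evidence is perishable, capturing it into a single sealed bundle at containment time is worth automating. The sketch below is a hypothetical helper, not a forensic standard: it freezes the items named above into a timestamped record and seals it with a SHA-256 digest so the team can later show the bundle was not altered after capture.

```python
import hashlib
import json
from datetime import datetime, timezone


def preserve_evidence(model_version: str, prompt_history: list[str],
                      output: str, log_lines: list[str]) -> dict:
    """Freeze perishable AI evidence into a timestamped, hash-sealed bundle."""
    bundle = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "prompt_history": prompt_history,  # context windows are overwritten quickly
        "output": output,                  # the triggering output, verbatim
        "logs": log_lines,
    }
    # Canonical serialization (sorted keys) so the digest is reproducible.
    canonical = json.dumps(bundle, sort_keys=True).encode("utf-8")
    bundle["sha256"] = hashlib.sha256(canonical).hexdigest()
    return bundle
```

Anyone holding the bundle can recompute the digest over everything except the "sha256" field and compare, which is the kind of integrity demonstration regulators and litigators ask for.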

Hour 4-24: Investigation and Assessment

Technical investigation:

  • What input produced the bad output? Was this a one-time failure or a systemic pattern?
  • Review logged outputs for similar failures. The GenAI-IRF academic framework (Tuscano & Disso, MDPI, January 2026) identifies six recurrent incident archetypes — knowing which one applies narrows the investigation.
  • If using a third-party model (OpenAI, Anthropic, Google), engage the vendor’s incident response channel. Most enterprise agreements include incident support.
  • Check whether model drift, data poisoning, or prompt injection contributed.

Impact assessment:

  • How many customers, decisions, or outputs were affected?
  • Were any decisions in regulated domains (hiring, lending, insurance, healthcare)?
  • Does the incident trigger state-specific notification obligations?
  • What is the financial exposure — direct damages, regulatory penalties, litigation risk?

Regulatory clock check:

Jurisdiction | Obligation | Timeline
EU AI Act (Art. 73) | Report serious incidents to market surveillance authority | 2 days (widespread/critical), 10 days (death), 15 days (other serious)
Colorado AI Act | Report algorithmic discrimination to AG and all deployers | 90 days from discovery (enforcement begins June 30, 2026)
Texas TRAIGA | Cure violations after AG notice | 60 days to cure; $10K-$12K per curable violation, $80K-$200K per uncurable violation
State breach notification | Report data exposure involving PII | Varies: most states require 30-60 days
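The deadlines in the table above are simple date arithmetic, which makes them easy to compute and track the moment an incident is classified. A minimal sketch, assuming the discovery date starts each clock (the EU AI Act tiers are shown as separate keys, and the Texas entry is a cure window rather than a notification deadline):

```python
from datetime import date, timedelta

# Windows in days, transcribed from the table above (simplified).
NOTIFICATION_WINDOWS = {
    "eu_widespread": 2,
    "eu_death": 10,
    "eu_other_serious": 15,
    "colorado_discrimination": 90,
    "texas_cure": 60,
}


def notification_deadline(discovered: date, obligation: str) -> date:
    """Latest date the obligation can be satisfied, counted from discovery."""
    return discovered + timedelta(days=NOTIFICATION_WINDOWS[obligation])
```

In practice the incident commander would compute every applicable deadline at classification time and work backward from the earliest one.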

Hour 24-72: Response and Communication

Internal communication:

  • Brief the CEO and board chair if the incident involves customer harm, regulatory exposure, or potential media coverage.
  • Document the incident timeline, containment actions, and preliminary root cause.

External communication (if required):

  • Affected customers: What happened, what decisions may have been affected, what remediation is available. Direct and specific. Do not minimize.
  • Regulators: Follow the jurisdiction-specific template. Colorado requires information about the system, the discrimination, continued risks, and remediation steps.
  • Media (if the incident becomes public): Prepared statement acknowledging the issue, describing containment, and committing to remediation. The Air Canada approach — arguing the chatbot was a separate legal entity — is a model of what not to do.

The stakeholder communication matrix (adapted from GLACIS, 2026):

Stakeholder | Timing | Content
Internal leadership | Immediately | Summary, business impact, containment status
Legal/Compliance | Within 1 hour | Technical details, regulatory exposure assessment
Affected customers | 24-72 hours | What happened, what decisions may need review, remediation
Regulators | Per jurisdiction | Nature, severity, corrective measures, ongoing risk
Board | 48-72 hours | Incident summary, liability assessment, remediation plan

Building the Playbook Before You Need It

The companies that respond well to AI incidents are the ones that prepared before the incident occurred. Organizations with tested incident response plans save an average of $473,706 in breach costs compared to those without formal IR capabilities (IBM/Ponemon, 2024).

The preparation checklist for a 200-500 person company:

  1. AI system inventory. You cannot respond to incidents in systems you do not know exist. Document every AI tool — vendor, purpose, data inputs, who uses it, what outputs reach customers or regulators.

  2. Baseline documentation. For each AI system, define what “correct” looks like. What accuracy rate is acceptable? What outputs would constitute a failure? Without baselines, you cannot detect degradation.

  3. Incident classification criteria. Adopt the five-category framework above and define severity thresholds for each. A hallucination in an internal summary has different severity than a hallucination in a client deliverable.

  4. Response team roster. Name the five roles. Confirm each person knows they are on the team. Document their after-hours contact information.

  5. Vendor escalation paths. For every AI vendor, document the incident reporting channel, SLA for response, and the contract terms governing liability for AI-generated errors.

  6. Notification decision tree. Map each incident category to the regulatory notification obligations that apply to your company based on where you operate and which states’ laws reach your customers.

  7. Tabletop exercise. Run one AI-specific scenario — a hallucinated output that reaches a customer in a regulated context is a good first scenario. One hour, all five response team members, no technology required. This single exercise reveals 80% of the gaps in your plan.
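Checklist items 1, 2, and 5 can share one record per AI system. The schema below is a hypothetical sketch, not a standard inventory format; the field names and the priority weighting are assumptions chosen to surface which systems carry the tightest notification obligations if they fail.

```python
from dataclasses import dataclass


@dataclass
class AISystemRecord:
    """One row of the AI system inventory (fields are illustrative)."""
    name: str
    vendor: str
    purpose: str
    data_inputs: list[str]
    users: list[str]
    customer_facing: bool          # do outputs reach customers or regulators?
    regulated_domain: bool         # hiring, lending, insurance, healthcare
    vendor_incident_channel: str   # checklist item 5: escalation path
    acceptable_accuracy: float     # checklist item 2: baseline for detecting drift


def triage_priority(record: AISystemRecord) -> int:
    """Higher number = tighter obligations if this system fails (0-3)."""
    return int(record.customer_facing) + 2 * int(record.regulated_domain)
```

Sorting the inventory by triage priority tells you which system to use for your first tabletop scenario: the one where a failure triggers the most clocks at once.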

Cost estimate: $5K-$15K for a mid-market company, primarily in legal review of notification obligations ($3K-$8K) and 10-20 hours of internal staff time for documentation and the tabletop exercise. Companies with existing cyber incident response plans can adapt them for AI in 2-3 weeks. Companies starting from scratch should budget 4-6 weeks.

What Separates Companies That Recover from Those That Do Not

ISACA’s analysis of top 2025 incidents reveals a pattern. The companies that suffered lasting damage shared three traits: unclear ownership of AI systems, no pre-defined response process, and organizational instinct to minimize rather than address.

The companies that recovered quickly did three things differently:

They contained before they communicated. Stop the AI from producing more bad output before drafting the press release. The McDonald’s AI drive-thru debacle went viral because the system kept running while producing absurd orders. Containment is always step one.

They took ownership immediately. Air Canada tried to argue its chatbot was a separate legal entity. The tribunal rejected this. The precedent is clear: the company that deploys the AI owns every output. Accountability research across the regulatory landscape confirms that “the AI did it” is not a defense in any jurisdiction.

They used the incident to improve. The GenAI-IRF framework (Tuscano & Disso, January 2026) found that organizations conducting structured post-incident reviews with assigned, time-bound action items showed improved incident resolution times in subsequent events. The post-mortem is not bureaucracy — it is the mechanism that prevents the same failure twice.

Key Data Points

  • 233 documented AI safety incidents in 2024, up 56.4% from 149 in 2023 (Stanford AI Index, April 2025)
  • $473,706 average breach cost savings for organizations with tested incident response plans (IBM/Ponemon, 2024)
  • 90 days Colorado AI Act notification deadline to AG for algorithmic discrimination (enforcement begins June 30, 2026)
  • 60 days Texas TRAIGA cure period before penalties accrue ($10K-$200K per violation)
  • 2-15 days EU AI Act serious incident notification timeline (Article 73)
  • 64 million job application records exposed through McDonald’s McHire AI platform default credentials (ISACA, December 2025)
  • AU$440,000 cost of the Deloitte Australia AI-assisted report containing 20+ fabricated sources
  • $5K-$15K estimated cost for a mid-market company to build an AI incident response capability
  • 2-3 weeks time to adapt an existing cyber IR plan for AI-specific incidents

What This Means for Your Organization

Every AI system your company runs will eventually produce an output that is wrong, biased, or harmful. The question is not whether but when — and whether you will respond in hours or in weeks.

The practical starting point is less daunting than it appears. If your company already has a cyber incident response plan, you have 60-70% of the structure. What is missing is the AI-specific layer: the classification framework that distinguishes a hallucination from a data breach, the containment options that range from rate limiting to full shutdown, and the regulatory notification map that tells you which attorney general needs to hear from you and by when.

The tabletop exercise is the single highest-value investment. One hour, five people, one scenario: “Our AI-assisted client report contains fabricated sources and has been delivered to three clients.” Walk through detection, containment, investigation, notification, and communication. The gaps that emerge will tell you exactly what your playbook needs.

If mapping the regulatory notification obligations across your operating states or designing the tabletop scenario raises questions specific to your organization, I would welcome the conversation — brandon@brandonsneider.com.

Sources

  1. Stanford AI Index Report 2025 — Documented AI safety incidents surging from 149 (2023) to 233 (2024), 56.4% increase. Independent academic research, high credibility. https://hai.stanford.edu/ai-index/2025-ai-index-report (April 2025)

  2. Coalition for Secure AI (CoSAI), AI Incident Response Framework v1.0 — Industry consortium framework (Google, Microsoft, Amazon, IBM members) adapting NIST lifecycle for AI-specific threats. High credibility as multi-stakeholder standard. https://www.coalitionforsecureai.org/defending-ai-systems-a-new-framework-for-incident-response-in-the-age-of-intelligent-technology/ (November 2025)

  3. OWASP GenAI Incident Response Guide 1.0 — Open-source, practitioner-developed guide covering preparation, detection, containment, eradication, recovery, and lessons learned for GenAI systems. High credibility as community-vetted standard. https://genai.owasp.org/resource/genai-incident-response-guide-1-0/ (July 2025)

  4. Tuscano & Disso, “A Practical Incident-Response Framework for Generative AI Systems” (GenAI-IRF) — Academic paper identifying six recurrent incident archetypes, evaluated with inter-rater reliability (kappa = 0.88) and usability testing (SUS = 86.4). Peer-reviewed academic research, high credibility. https://www.mdpi.com/2624-800X/6/1/20 (January 2026)

  5. IAPP, “AI incident response plans: not just for security anymore” — Five-category incident classification framework. Independent privacy professional association, high credibility. https://iapp.org/news/a/ai-incident-response-plans-not-just-for-security-anymore (January 2026)

  6. ISACA, “Avoiding AI Pitfalls in 2026: Lessons Learned from Top 2025 Incidents” — Seven case studies including McDonald’s McHire, facial recognition wrongful arrests, deepfake fraud, Claude cyber-espionage, chatbot safety incidents. Independent professional association, high credibility. https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2025/avoiding-ai-pitfalls-in-2026-lessons-learned-from-top-2025-incidents (December 2025)

  7. Moffatt v. Air Canada, BC Civil Resolution Tribunal — Legal precedent establishing company liability for AI chatbot misrepresentations. Court ruling, highest credibility. https://www.americanbar.org/groups/business_law/resources/business-law-today/2024-february/bc-tribunal-confirms-companies-remain-liable-information-provided-ai-chatbot/ (February 2024)

  8. Colorado AI Act (SB24-205) — 90-day AG notification for algorithmic discrimination; enforcement delayed to June 30, 2026. Legislative text, highest credibility. https://leg.colorado.gov/bills/sb24-205 (May 2024, amended 2025)

  9. Texas Responsible AI Governance Act (TRAIGA) — 60-day cure period, $10K-$200K penalties. Legislative text, highest credibility. https://www.nortonrosefulbright.com/en/knowledge/publications/c6c60e0c/the-texas-responsible-ai-governance-act (June 2025)

  10. EU AI Act, Article 73 — Tiered serious incident reporting: 2 days (widespread), 10 days (death), 15 days (other serious). Legislative text, highest credibility. https://artificialintelligenceact.eu/article/73/ (August 2024)

  11. GLACIS, AI Incident Response Playbook 2026 — Containment escalation framework and stakeholder communication matrix. Industry guidance, moderate-high credibility. https://www.glacis.io/guide-ai-incident-response (2026)

  12. OECD, “Towards a Common Reporting Framework for AI Incidents” — 29-criteria global reporting framework. International organization, high credibility. https://www.oecd.org/en/publications/towards-a-common-reporting-framework-for-ai-incidents_f326d4ac-en.html (February 2025)

  13. The Future Society, “AI Incidents Are Rising. It’s Time for the United States to Build Playbooks for When AI Fails.” — Policy analysis of US gaps in AI incident response. Independent think tank, moderate-high credibility. https://thefuturesociety.org/us-ai-incident-response/ (2025)

  14. Deloitte Australia AI Hallucination Incident — AU$440,000 report with 20+ fabricated sources. Investigative reporting analysis, moderate credibility. https://medium.com/@PoornaReddy/the-290-000-ai-hallucination-what-caused-it-how-to-engineer-around-it-d39b7be1142e (January 2026)


Brandon Sneider | brandon@brandonsneider.com | March 2026