
Writing an incident response playbook that actually works

Most incident response playbooks have never been used. The ones that get used are not the ones with the most pages. They are the ones with the fewest assumptions.

Pull up your incident response playbook. Read the first page. How many of the names, phone numbers, Slack channels, and tools listed there are still accurate? In my experience the answer is rarely more than half.

An incident response playbook is one of the few security documents that gets tested under real stress. Most security policies sit in a drawer until an auditor reads them. The IR playbook is opened by a tired engineer at 2am while production is on fire and a customer is on the phone. That raises the bar for what counts as a useful playbook.

This is what I write when a retainer client asks me to put together an IR playbook. It is intentionally short and built to survive contact with a real incident at 2am.

Why most playbooks fail at hour one

The patterns repeat across every company I have worked with. The same five things break in the first hour of a real incident:

  • The on-call rotation in the playbook does not match the actual on-call rotation in PagerDuty. The playbook lists the security lead from two reorgs ago.
  • The Slack channel for incident coordination is in a workspace half the responders do not have access to.
  • The legal escalation path goes to in-house counsel who is on parental leave. The outside counsel relationship has never been activated.
  • The forensics tooling is licensed but nobody has logged in for six months and the SSO integration has expired.
  • The communication template for customer notification was last updated before the company moved to a new domain and uses an email address that bounces.

None of these surface in a policy review. All of them surface in the first hour of an actual incident or a serious tabletop. The fix is not a longer playbook. The fix is a playbook that is structurally less brittle.
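One way to make this kind of rot visible before an incident is to record when each fact in the playbook was last confirmed and flag anything older than a chosen threshold. The following is a minimal sketch, not a tool I am claiming anyone ships: the `PlaybookEntry` type, the sample entries, and the 90-day threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PlaybookEntry:
    """One fact the playbook depends on, e.g. a phone number or a channel."""
    label: str
    value: str
    last_verified: date  # the day someone last confirmed this still works


def stale_entries(entries, today, max_age_days=90):
    """Return entries that nobody has confirmed within max_age_days.

    The 90-day default is an assumption; pick whatever your team will keep up.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [e for e in entries if e.last_verified < cutoff]


# Hypothetical sample data, for illustration only.
entries = [
    PlaybookEntry("Incident commander phone", "+1-555-0100", date(2024, 5, 1)),
    PlaybookEntry("Incident Slack channel", "#incident-response", date(2023, 9, 15)),
    PlaybookEntry("Forensics tool SSO login", "forensics.example.com", date(2023, 11, 2)),
]

for entry in stale_entries(entries, today=date(2024, 6, 1)):
    print(f"STALE: {entry.label} (last verified {entry.last_verified})")
```

A check like this only tells you that nobody has looked recently. Someone still has to dial the number or log in to the tool to confirm it works.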

The five questions every playbook must answer

When the playbook is opened at 2am, the responder is not going to read 40 pages. They are going to scan for five answers:

  • Who is in charge right now? One name, one phone number, one backup. Not a committee, not an org chart, not an escalation tree five levels deep.
  • Where do we coordinate? One Slack channel everyone has access to in advance. One bridge line as a backup. Tested.
  • What can I shut down without permission? A short, explicit list of unilateral actions any responder can take immediately. Revoke an access key. Disable a user. Block an IP. Do not require an executive's approval for these.
  • When do we tell whom? A simple table mapping severity to notification audience and timeline. P1 with confirmed customer data exposure goes to legal within 30 minutes. P2 goes to engineering leadership within 2 hours. Continue the same way for each lower severity.
  • What does done look like? Explicit criteria for declaring the incident closed. Otherwise it never closes and the post-mortem never happens.

Five questions, five answers, one page. The next forty pages are scenario-specific runbooks. The first page has to work without anyone reading the next forty.
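The "when do we tell whom" table is small enough to keep as data, which makes it easy to turn a declaration time into concrete deadlines. Here is a rough sketch under stated assumptions: only the P1 and P2 rows come from the text above, and the names `NOTIFICATION_MATRIX` and `notification_deadlines` are hypothetical.

```python
from datetime import datetime, timedelta

# Severity -> list of (audience, time allowed after the incident is declared).
# Only these two rows come from the article. P1 here means confirmed customer
# data exposure. A real matrix has more rows and needs legal sign-off.
NOTIFICATION_MATRIX = {
    "P1": [("legal", timedelta(minutes=30))],
    "P2": [("engineering leadership", timedelta(hours=2))],
}


def notification_deadlines(severity, declared_at):
    """Return (audience, notify_by) pairs for an incident of this severity."""
    if severity not in NOTIFICATION_MATRIX:
        raise KeyError(f"No notification rule for severity {severity!r}")
    return [
        (audience, declared_at + allowed)
        for audience, allowed in NOTIFICATION_MATRIX[severity]
    ]


declared = datetime(2024, 6, 1, 2, 0)  # hypothetical 2am declaration
for audience, notify_by in notification_deadlines("P1", declared):
    print(f"Notify {audience} by {notify_by:%H:%M}")
```

Raising an error for an unknown severity is a deliberate choice in this sketch: a gap in the matrix should be loud, not silently treated as "notify nobody".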

The minimum viable IR playbook

If you have nothing today, here is the smallest version that is still useful. Six sections, none longer than half a page:

  • Roles: incident commander, technical lead, communications lead, legal liaison. Names and phone numbers, with backups.
  • Severity definitions: what makes something a P1 versus a P2 versus a P3, with one or two examples each.
  • Coordination: where the incident channel lives, how the bridge line works, and a backup if the primary fails.
  • Unilateral authorities: the explicit list of actions any responder can take without approval.
  • Notification matrix: severity-to-audience-to-timeline table. Customers, legal, regulators, board, insurance carrier.
  • Post-incident review: who runs it, what gets documented, who signs off. Three to five business days after closure, not three weeks.

Three pages. Get it through legal review. Get every responder to read it once. That is the floor. Everything else is an improvement on the floor, not a substitute for it.
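If the playbook is kept in a structured form, the floor described above can be checked mechanically. The sketch below assumes the playbook has been loaded into a plain dictionary. The key names, the `playbook_gaps` helper, and the sample draft are hypothetical, chosen to mirror the six sections and four roles listed above.

```python
# Section and role keys are assumptions that mirror the list above.
REQUIRED_SECTIONS = [
    "roles",
    "severity_definitions",
    "coordination",
    "unilateral_authorities",
    "notification_matrix",
    "post_incident_review",
]
REQUIRED_ROLES = [
    "incident_commander",
    "technical_lead",
    "communications_lead",
    "legal_liaison",
]


def playbook_gaps(playbook):
    """Return human-readable problems; an empty list means the floor is met."""
    gaps = [f"missing section: {s}" for s in REQUIRED_SECTIONS if not playbook.get(s)]
    roles = playbook.get("roles") or {}
    for role in REQUIRED_ROLES:
        assignment = roles.get(role) or {}
        for slot in ("primary", "backup"):
            person = assignment.get(slot) or {}
            if not (person.get("name") and person.get("phone")):
                gaps.append(f"{role}: {slot} needs a name and phone number")
    return gaps


# Hypothetical half-finished draft, for illustration only.
draft = {
    "roles": {
        "incident_commander": {
            "primary": {"name": "A. Example", "phone": "+1-555-0100"},
            "backup": {"name": "B. Example", "phone": "+1-555-0101"},
        },
        "technical_lead": {
            "primary": {"name": "C. Example", "phone": "+1-555-0102"},
        },
    },
    "severity_definitions": {"P1": "confirmed customer data exposure"},
    "coordination": {"channel": "#incident-response"},
}

for gap in playbook_gaps(draft):
    print(gap)
```

This only checks that the sections and contacts exist. Whether the names and numbers are still correct is a separate question that a structure check cannot answer.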

Common mistakes I have to remove from existing playbooks

When I take over an IR program from a previous consultant or a previous internal team, the same patterns appear in the existing document:

  • A communication tree with seven names and four backup names. In a real incident this collapses into the first two people who answer their phones. Plan for that reality.
  • A 12-page section on forensic preservation that nobody is going to read. If you need that depth, link to it as an appendix. Do not put it on page 4.
  • Generic ransomware and phishing scenarios copied from a vendor whitepaper. They do not match your actual stack and your team will not follow them. Write scenarios specific to your environment, even if there are only two.
  • A regulatory notification table that lists every privacy regulation in the world. Cut it down to the regulations that actually apply to you. The rest is noise that delays decisions in a real incident.

Test it before you write the next page

The single most useful thing you can do with an IR playbook is exercise it. Pick a Tuesday afternoon, pick a scenario, walk the team through it for two hours. Force them to use only the tools they actually have and the people they can actually reach in the moment.

The results are always uncomfortable. Tooling gaps. Communication gaps. Authority gaps. Write down every friction point. Rank them. Fix the top three before the next tabletop. Run the tabletop again in 90 days. Repeat until the tabletop is boring because the playbook actually works.
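The loop is simple enough to write down: rank the friction points, take the top three, and put the next tabletop on the calendar 90 days out. The following is an illustrative sketch only. The `impact` scores, the sample friction points, and the `plan_next_cycle` name are assumptions, and how you score impact is a team judgment, not something code can decide.

```python
from datetime import date, timedelta


def plan_next_cycle(friction_points, tabletop_date, fix_count=3, interval_days=90):
    """Rank friction points by impact and schedule the next tabletop."""
    ranked = sorted(friction_points, key=lambda fp: fp["impact"], reverse=True)
    return {
        "fix_now": [fp["issue"] for fp in ranked[:fix_count]],
        "backlog": [fp["issue"] for fp in ranked[fix_count:]],
        "next_tabletop": tabletop_date + timedelta(days=interval_days),
    }


# Hypothetical findings with made-up impact scores (5 = blocked the response).
friction = [
    {"issue": "Customer notice template uses the old domain", "impact": 2},
    {"issue": "On-call rotation in playbook does not match PagerDuty", "impact": 5},
    {"issue": "Outside counsel relationship never activated", "impact": 3},
    {"issue": "Forensics tool SSO integration expired", "impact": 4},
]

plan = plan_next_cycle(friction, tabletop_date=date(2024, 6, 4))
print("Fix before next tabletop:", plan["fix_now"])
print("Next tabletop:", plan["next_tabletop"])
```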

Most companies test the playbook once per year because that is what the auditor wants. The right cadence is quarterly until the playbook is stable, then twice a year ongoing. A playbook is a living document, not an artifact.
