Incident Eradication and Recovery

Key Takeaways

  • Eradication removes the threat and the underlying vulnerability; recovery restores systems to validated normal operation.
  • Recovery is governed by RTO (acceptable downtime) and RPO (acceptable data loss); these are business decisions, not IT defaults.
  • Restore from known-good, verified backups; never reintroduce the original vulnerability or restore from a compromised image.
  • Enhanced monitoring after recovery confirms the threat is truly gone before declaring the incident closed.

Last updated: June 2026

After containment stabilizes the situation, eradication removes the threat and its root cause, and recovery returns systems to validated normal operation. CISM separates these on purpose: cleaning a symptom without closing the exploited weakness, then restoring straight back into production, is how organizations get reinfected within days.

Eradication

Eradication means eliminating the adversary's presence and the conditions that let them in. Typical actions:

  • Remove malware, web shells, backdoors, and attacker-created accounts.
  • Patch or reconfigure the exploited vulnerability so the same entry point is closed.
  • Reset and rotate all potentially exposed credentials, keys, and certificates (assume any credential touched by the attacker is compromised).
  • Rebuild from clean media when the integrity of a host cannot be assured — full rebuild is often safer than "cleaning" a deeply compromised system.

A frequent CISM trap is the answer option that restores service quickly but skips closing the vulnerability, which guarantees recurrence. The manager's measure of done is that the root-cause control gap is fixed and owned, not merely that the malware file is deleted.

Recovery and its key metrics

Recovery is governed by two business-set objectives:

Metric | Definition | Question it answers
RTO (Recovery Time Objective) | Maximum tolerable downtime before unacceptable harm | How fast must we be back?
RPO (Recovery Point Objective) | Maximum tolerable data loss, measured in time | How much data can we afford to lose?

RTO and RPO are business decisions the manager elicits from asset owners, not IT defaults. A 4-hour RPO means backups must run at least every 4 hours. Recovery steps the exam expects: restore from known-good, integrity-verified backups (confirm the backup predates the compromise and is malware-free), validate system functionality, and return to production in a phased manner — most critical, lowest-risk systems first — rather than flipping everything on at once.
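The RPO arithmetic above can be checked mechanically. The following is a minimal sketch, assuming backup completion times are available as a list of timestamps; the function names and the sample schedules are hypothetical and only illustrate the rule that no gap between backups may exceed the RPO.

```python
from datetime import datetime, timedelta

def max_backup_gap(backup_times):
    """Return the largest interval between consecutive backups."""
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    # With fewer than two backups there is no gap to measure.
    return max(gaps, default=timedelta(0))

def meets_rpo(backup_times, rpo):
    """True when no gap between consecutive backups exceeds the RPO."""
    return max_backup_gap(backup_times) <= rpo

# Hypothetical schedules compared against a 4-hour RPO.
start = datetime(2026, 6, 1, 0, 0)
six_hourly = [start + timedelta(hours=6 * i) for i in range(4)]
four_hourly = [start + timedelta(hours=4 * i) for i in range(6)]
rpo = timedelta(hours=4)

print(meets_rpo(four_hourly, rpo))  # schedule that satisfies the RPO
print(meets_rpo(six_hourly, rpo))   # schedule that violates the RPO
```

The worst-case data loss equals the largest gap, so a 6-hour schedule cannot honor a 4-hour RPO no matter how reliable each individual backup is.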

Post-recovery monitoring

Declaring "recovered" too early is a tested mistake. The CSIRT applies enhanced monitoring on restored systems to confirm the threat does not reappear (attackers often retain a second foothold). Only when monitoring confirms stability does the manager formally close the incident and transition to lessons learned.

Manager-level traps

Watch for an option that restores from "the most recent backup" without checking whether that backup was taken after the intrusion began; restoring a compromised image reintroduces the threat. Another trap pits speed (meeting the RTO) against assurance: the manager balances them but never restores into an unpatched, still-vulnerable state just to hit a clock. The defensible recovery answer is validated, phased restoration from clean backups with monitoring, performed after the exploited weakness has been eradicated.
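The "most recent backup" trap can be made concrete with a small selection routine. This is an illustrative sketch only: the Backup record, its integrity_verified flag, and the sample backup names are assumptions, and in practice the intrusion start time is itself an estimate from the investigation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Backup:
    name: str
    taken_at: datetime
    integrity_verified: bool  # assumed: checksum and malware scan both passed

def select_restore_point(backups, intrusion_start) -> Optional[Backup]:
    """Pick the newest verified backup taken before the intrusion began."""
    candidates = [
        b for b in backups
        if b.taken_at < intrusion_start and b.integrity_verified
    ]
    # Return None when no backup qualifies, rather than falling back
    # to a backup that may contain the attacker's foothold.
    return max(candidates, key=lambda b: b.taken_at, default=None)

# Hypothetical backup catalog.
backups = [
    Backup("nightly-01", datetime(2026, 6, 1, 2, 0), True),
    Backup("nightly-02", datetime(2026, 6, 2, 2, 0), True),
    Backup("nightly-03", datetime(2026, 6, 3, 2, 0), False),  # failed verification
    Backup("nightly-05", datetime(2026, 6, 5, 2, 0), True),   # most recent, post-intrusion
]
intrusion_start = datetime(2026, 6, 4, 0, 0)
chosen = select_restore_point(backups, intrusion_start)
print(chosen.name if chosen else "no safe restore point")
```

Here the newest backup is rejected because it postdates the intrusion, and the next newest is rejected because it failed verification, so the routine falls back to an older clean copy and accepts the larger data loss.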

Backups and the ransomware lesson

CISM treats backups as the linchpin of recovery, and ransomware has sharpened the testable points. Backups must be tested by periodic restore drills — an untested backup is an assumption, not a control. They must follow resilient design such as the 3-2-1 rule (three copies, two media types, one offsite) with at least one immutable or offline copy that attackers cannot encrypt or delete. The exam frames paying a ransom as a business decision of last resort with no guarantee of data return and possible legal/sanctions exposure; reliable, isolated backups are what let an organization refuse to pay.

A recovery plan dependent on backups stored on the same network the attacker compromised is a finding, not a recovery capability.
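The 3-2-1 rule and the immutable-copy requirement lend themselves to a simple design check. The sketch below assumes the list describes every copy of the data, including the production copy; the BackupCopy fields and the example inventories are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    location: str
    media: str                  # e.g. "disk", "tape", "object-storage"
    offsite: bool
    immutable_or_offline: bool  # attackers cannot encrypt or delete it

def check_3_2_1(copies):
    """Return a list of findings; an empty list means the design passes."""
    findings = []
    if len(copies) < 3:
        findings.append("fewer than three copies")
    if len({c.media for c in copies}) < 2:
        findings.append("fewer than two media types")
    if not any(c.offsite for c in copies):
        findings.append("no offsite copy")
    if not any(c.immutable_or_offline for c in copies):
        findings.append("no immutable or offline copy")
    return findings

# Hypothetical inventory: everything on the same network, same media.
weak = [
    BackupCopy("primary-db", "disk", False, False),
    BackupCopy("nas-1", "disk", False, False),
    BackupCopy("nas-2", "disk", False, False),
]
# Hypothetical inventory with an offsite, offline tape copy.
resilient = [
    BackupCopy("primary-db", "disk", False, False),
    BackupCopy("nas-1", "disk", False, False),
    BackupCopy("vault-tape", "tape", True, True),
]
print(check_3_2_1(weak))
print(check_3_2_1(resilient))
```

The weak inventory has three copies yet still fails on media diversity, offsite storage, and immutability, which is the situation described above as a finding rather than a recovery capability.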

Validation and the return-to-operations decision

Before flipping systems back to production, the team validates: confirm the vulnerability is closed, malicious artifacts are gone, data integrity is intact, and credentials are rotated. The decision to return to operations is the manager's call, made with asset owners against RTO and risk tolerance. Phased restoration lets the team watch each tier for signs of reinfection before adding load, and a documented rollback plan covers the case where a restored system misbehaves.
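The four validation items can be treated as a gate that must be fully satisfied before the return-to-operations decision is even put to the manager and asset owners. A minimal sketch, assuming the check results are recorded in a dictionary; the key names are hypothetical labels for the items listed above.

```python
REQUIRED_CHECKS = (
    "vulnerability_closed",
    "artifacts_removed",
    "data_integrity_intact",
    "credentials_rotated",
)

def return_to_operations_blockers(checks):
    """List every required validation that is missing or has failed."""
    # A check that was never recorded counts as failed, not as passed.
    return [name for name in REQUIRED_CHECKS if not checks.get(name, False)]

# Hypothetical status: credentials have not been rotated yet.
status = {
    "vulnerability_closed": True,
    "artifacts_removed": True,
    "data_integrity_intact": True,
}
blockers = return_to_operations_blockers(status)
print(blockers or "ready for the manager's go/no-go decision")
```

An empty list does not approve the return by itself; it only shows that the technical preconditions are met so the decision can be made against RTO and risk tolerance.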

Only after enhanced monitoring confirms stability does the manager declare full recovery — prematurely calling "all clear" is a recurring exam mistake because attackers commonly retain a dormant secondary foothold to re-emerge later.

Coordinating recovery with business continuity

When an incident has escalated into a disruption, eradication and recovery run alongside the business continuity and disaster recovery plans, and the manager must keep them coordinated rather than competing. The IRP focuses on removing the threat and restoring clean systems; the BCP keeps the business operating (manual workarounds, alternate sites, degraded service) while that happens. A common error is restoring the primary environment without confirming it is clean, or invoking a DR failover to infrastructure that shares the same compromised credentials or images — which simply moves the breach.

The manager validates that the recovery site and restored data are not themselves compromised, sequences restoration against business priority and RTO, and formally transfers control back to normal operations only when both the security team and business owners agree the environment is trustworthy. Recovery decisions are thus joint security-and-business decisions, documented and approved, never a unilateral IT switch-flip under time pressure.
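Sequencing restoration against business priority and RTO can be sketched as a sort over an inventory of systems. The System fields, the priority scale (1 = most critical), and the sample systems are assumptions for illustration; systems not yet verified clean are held back rather than queued.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class System:
    name: str
    business_priority: int  # hypothetical scale: 1 = most critical
    rto: timedelta
    verified_clean: bool

def restoration_order(systems):
    """Order verified-clean systems by business priority, then tightest RTO."""
    eligible = [s for s in systems if s.verified_clean]
    ordered = sorted(eligible, key=lambda s: (s.business_priority, s.rto))
    return [s.name for s in ordered]

def held_back(systems):
    """Systems that must not be restored until they are verified clean."""
    return [s.name for s in systems if not s.verified_clean]

# Hypothetical inventory.
systems = [
    System("reporting", 3, timedelta(hours=48), True),
    System("payments", 1, timedelta(hours=4), True),
    System("order-entry", 1, timedelta(hours=8), True),
    System("dr-replica", 2, timedelta(hours=12), False),  # shares compromised image
]
print(restoration_order(systems))
print(held_back(systems))
```

Excluding unverified systems from the queue reflects the point above: failing over to infrastructure that shares compromised credentials or images simply moves the breach.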

The clean exam takeaway is that recovery is complete only when the threat is eradicated, systems are validated against a known-good baseline, monitoring confirms no recurrence, and the business owner formally accepts the return to normal operations.

Test Your Knowledge

After containing an intrusion that exploited an unpatched web server, the team is ready to restore service. What must the security manager ensure is completed during eradication BEFORE recovery?


An RPO of four hours for a critical database means which of the following?


Which recovery practice BEST reduces the risk of reintroducing the threat to production?
