Containment, Eradication, and Recovery Decisions
Key Takeaways
- Containment limits damage, but the method must fit the business impact and evidence needs.
- Short-term containment buys time; long-term containment supports stable operation while root cause is removed.
- Eradication removes malware, persistence, exploited vulnerabilities, exposed credentials, and unsafe configuration.
- Recovery should restore from trusted sources, validate the restored system, and watch for recurrence.
- Responders must balance speed, evidence preservation, safety, service availability, and attacker awareness.
Containment, Eradication, and Recovery Decisions
Containment, eradication, and recovery are closely related, but they are not the same. Containment limits active harm. Eradication removes the cause and attacker foothold. Recovery restores trusted service. A strong incident response answer chooses the action that matches the current phase and the risk.
Containment Choices
| Situation | Possible containment | Tradeoff |
|---|---|---|
| Malware beaconing from one laptop | Network-isolate the host through EDR | Fast and targeted, but may alert the attacker |
| Compromised user account | Disable account, revoke sessions, reset credentials | Stops account use, but may interrupt business work |
| Ransomware on file server | Block SMB access, isolate server, disable suspected service account | Limits spread, but affects shared file access |
| Malicious IP scanning web app | Block source at WAF or firewall | Useful if source is limited, weak if attacker changes infrastructure |
| Suspicious cloud access key | Disable key, review recent API calls | Stops key abuse, but applications using it may fail |
Containment can be short term or long term. Short-term containment may isolate a host immediately. Long-term containment may keep a vulnerable system available behind stricter network controls until a patch or rebuild can be completed.
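The short-term versus long-term distinction can be sketched as a small decision function. This is a simplified illustration, not a prescribed procedure: the two inputs (`harm_is_active`, `fix_is_ready`) and the `ContainmentDecision` type are hypothetical names chosen for this example, and a real decision would also weigh business impact and evidence needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainmentDecision:
    horizon: str  # "short-term", "long-term", or "none"
    action: str


def decide_containment(harm_is_active: bool, fix_is_ready: bool) -> ContainmentDecision:
    """Pick a containment horizon from two simplified, hypothetical inputs."""
    if harm_is_active:
        # Active harm: stop it now, even at some cost to availability.
        return ContainmentDecision("short-term", "isolate the host immediately")
    if not fix_is_ready:
        # No active harm, but the weakness remains until a patch or rebuild.
        return ContainmentDecision(
            "long-term",
            "keep the system available behind stricter network controls",
        )
    # Nothing left to hold back; move on to eradication and recovery.
    return ContainmentDecision("none", "proceed with the patch or rebuild")


print(decide_containment(harm_is_active=True, fix_is_ready=False).horizon)
print(decide_containment(harm_is_active=False, fix_is_ready=False).horizon)
```

The first call models a host that is actively beaconing; the second models a vulnerable system that is quiet but cannot be patched yet.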
Decision Timeline
Scenario: A manufacturing company finds suspicious remote access to a production support server.
| Time | Evidence | Decision |
|---|---|---|
| 14:02 | VPN login from unusual country using engineer account | Increase severity and review identity logs |
| 14:07 | EDR shows remote shell on support server | Declare incident and preserve active session details |
| 14:10 | Server controls production reporting, not machinery | Isolate the server from the internet-facing VPN path; keep internal reporting available |
| 14:18 | Same account used to access password vault | Disable account, revoke sessions, rotate accessed secrets |
| 15:05 | Persistence found as scheduled task | Remove persistence only after collecting required evidence |
| 17:40 | Clean rebuild ready | Restore service from known-good image and monitor |
This timeline shows why response is not always a single action. Pulling the power cord might stop activity, but it may destroy volatile evidence and cause unnecessary business outage. Leaving the system online may preserve evidence, but it can allow continued attacker activity. The incident commander should make risk-based decisions with technical and business input.
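One way to reason about the ordering in this timeline is to check that no evidence-destroying step happens before evidence has been collected. The sketch below is an illustration only: the `Step` type, the two boolean flags, and the way each step is labeled are assumptions made for this example, not part of the scenario.

```python
from datetime import time
from typing import List, NamedTuple


class Step(NamedTuple):
    at: time
    action: str
    collects_evidence: bool = False  # assumed label for illustration
    destroys_evidence: bool = False  # assumed label for illustration


def premature_destructive_steps(steps: List[Step]) -> List[Step]:
    """Return destructive steps taken before any evidence was collected."""
    collected = False
    flagged = []
    for step in sorted(steps, key=lambda s: s.at):
        if step.collects_evidence:
            collected = True
        if step.destroys_evidence and not collected:
            flagged.append(step)
    return flagged


timeline = [
    Step(time(14, 7), "preserve active session details", collects_evidence=True),
    Step(time(15, 5), "remove scheduled task", destroys_evidence=True),
    Step(time(17, 40), "restore from known-good image", destroys_evidence=True),
]

# A hypothetical hasty response that cuts power before anything is preserved.
hasty = [Step(time(14, 3), "pull the power cord", destroys_evidence=True)] + timeline

print([s.action for s in premature_destructive_steps(timeline)])
print([s.action for s in premature_destructive_steps(hasty)])
```

The scenario's own sequence produces no flagged steps, while the hasty variant flags the power pull. A real review would track which specific evidence each step affects rather than a single yes/no flag.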
Eradication Actions
Eradication addresses root cause and footholds. Examples include removing malware, deleting unauthorized scheduled tasks, closing vulnerable remote access, patching exploited software, disabling unauthorized accounts, rotating exposed credentials, and removing malicious inbox rules or OAuth grants.
Do not confuse blocking an indicator with eradication. Blocking one IP address may reduce traffic, but it does not remove stolen credentials, persistence, or a vulnerable application.
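The difference between blocking an indicator and eradicating a foothold can be made concrete by tracking which footholds each action actually removes. The mapping below is a hypothetical illustration built from the examples in this section; the action and foothold names are placeholders, not a standard taxonomy.

```python
from typing import Dict, Iterable, Set

# Which footholds each action removes (illustrative assumption).
ERADICATES: Dict[str, Set[str]] = {
    "rotate credentials": {"stolen credentials"},
    "delete scheduled task": {"persistence"},
    "patch application": {"vulnerable application"},
    "block ip address": set(),  # containment only: removes no foothold
}


def remaining_footholds(footholds: Iterable[str], actions: Iterable[str]) -> Set[str]:
    """Return the footholds still present after the given actions."""
    removed: Set[str] = set()
    for action in actions:
        removed |= ERADICATES.get(action, set())
    return set(footholds) - removed


footholds = {"stolen credentials", "persistence", "vulnerable application"}

# Blocking the IP alone leaves every foothold in place.
print(sorted(remaining_footholds(footholds, ["block ip address"])))
```

Only when the credential rotation, task deletion, and patch are all applied does the function return an empty set.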
Recovery Actions
Recovery should use trusted sources. A system can be restored from a clean backup, rebuilt from a gold image, redeployed from infrastructure as code, or returned to service after verified remediation. Recovery should include validation:
- Confirm patches and configuration fixes are present.
- Confirm malicious accounts, keys, tasks, and services are gone.
- Confirm monitoring is active.
- Confirm business owners can perform required functions.
- Watch for repeat indicators after service is restored.
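The validation list above can be treated as a gate before a system returns to service. The following is a minimal sketch under that assumption; `RecoveryChecks` and its field names are hypothetical, and how each check is actually verified is left to the responder. The last item in the list (watching for repeat indicators) happens after restoration, so it is not modeled as a gate here.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class RecoveryChecks:
    patches_and_config_fixes_present: bool
    malicious_artifacts_removed: bool  # accounts, keys, tasks, services
    monitoring_active: bool
    business_functions_confirmed: bool


def failed_checks(checks: RecoveryChecks) -> List[str]:
    """Return the names of the checks that have not passed yet."""
    return [f.name for f in fields(checks) if not getattr(checks, f.name)]


state = RecoveryChecks(
    patches_and_config_fixes_present=True,
    malicious_artifacts_removed=True,
    monitoring_active=False,
    business_functions_confirmed=True,
)

# Anything listed here should block the return to service.
print(failed_checks(state))
```

Returning the names of the failed checks, rather than a single boolean, makes it clear what still needs attention before service is restored.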
Common Traps
- Reimaging systems before collecting evidence needed to determine scope.
- Restoring from a backup that already contains the persistence mechanism.
- Resetting a password but failing to revoke active sessions and tokens.
- Blocking one domain while ignoring the compromised host that generated the traffic.
- Returning a system to service without monitoring for recurrence.
- Treating containment as proof that the incident is over.
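The backup trap in particular lends itself to a simple check: only backups taken before the earliest known compromise are candidates for restoration. The sketch below assumes backup times and the compromise time are known; the function name and the dates are hypothetical. Predating the compromise is necessary but not sufficient, since the earliest known compromise time can move earlier as scope is investigated.

```python
from datetime import datetime
from typing import Optional, Sequence


def pick_trusted_backup(
    backups: Sequence[datetime], earliest_compromise: datetime
) -> Optional[datetime]:
    """Return the newest backup taken before the earliest known compromise."""
    candidates = [b for b in backups if b < earliest_compromise]
    return max(candidates, default=None)


# Hypothetical weekly backups and a hypothetical compromise time.
backups = [datetime(2024, 5, 1), datetime(2024, 5, 8), datetime(2024, 5, 15)]
compromise = datetime(2024, 5, 10, 14, 2)

print(pick_trusted_backup(backups, compromise))
```

If no backup predates the compromise, the function returns `None`, which points toward a rebuild from a gold image or infrastructure as code instead.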
Practice Questions
A cloud access key is confirmed stolen and currently being used. What is the best immediate containment action?
Which action is eradication rather than containment?
PBQ style: A ransomware process is active on one file server. Order the response actions.