5.4 Common Traps in Information Systems Operations and Business Resilience
Key Takeaways
- Swapping RTO and RPO is the single most common Domain 4 error: RTO = downtime/time-to-restore; RPO = data loss/backup frequency.
- Replication is not a backup: it copies corruption and ransomware to the standby, so it cannot satisfy a recovery-from-corruption requirement.
- An untested or out-of-date plan provides no real assurance, even if the document is detailed and approved.
- Recovery priorities must come from the BIA and business impact, not from technical ease or which system the IT team prefers.
- Don't choose a full-interruption test just because it sounds thorough, or accept an undocumented emergency change just because it sounds urgent.
The RTO / RPO Swap
The most expensive trap in this domain is reversing the two recovery objectives. The exam writes distractors that are correct definitions of the wrong metric. Anchor yourself:
- RTO is about time to restore → it drives recovery speed and therefore site choice (hot/warm/cold) and failover design.
- RPO is about data loss → it drives backup frequency and replication.
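When a stem talks about how long you can be down, every answer that adjusts backup frequency is a trap. When it talks about how much data you can lose, every answer that only speeds up restoration is a trap. Also distinguish both from MTD (maximum tolerable downtime, the total outer limit: RTO + WRT, where WRT is the work recovery time) and SDO (service delivery objective, the reduced service level during recovery). A question asking for the "absolute maximum unavailability" wants MTD, not RTO.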
| Cue in stem | Correct metric | Trap answer |
|---|---|---|
| "...down for no more than X" | RTO | adjusting backup frequency (RPO) |
| "...lose no more than X of data" | RPO | choosing a hotter site (RTO) |
| "...absolute maximum unavailability" | MTD | RTO alone |
| "...reduced service during recovery" | SDO | RTO |
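The relationships in the table can be checked with simple arithmetic. The sketch below is illustrative only: the class name, function names, and the hour figures are assumptions made for this example, not values from any standard or exam source. It shows that a backup schedule is judged against the RPO, a restore estimate against the RTO, and that MTD is the sum of RTO and WRT.

```python
from dataclasses import dataclass


@dataclass
class RecoveryObjectives:
    """Hypothetical container for one system's recovery targets, in hours."""
    rto_hours: float  # target time to restore service
    wrt_hours: float  # work recovery time once systems are back up
    rpo_hours: float  # maximum tolerable data loss, expressed as time


def mtd_hours(obj: RecoveryObjectives) -> float:
    """MTD is the outer limit on unavailability: RTO + WRT."""
    return obj.rto_hours + obj.wrt_hours


def backup_interval_meets_rpo(interval_hours: float, obj: RecoveryObjectives) -> bool:
    """Worst-case data loss is the gap between backups, so compare it to the RPO."""
    return interval_hours <= obj.rpo_hours


def restore_time_meets_rto(restore_hours: float, obj: RecoveryObjectives) -> bool:
    """Restoration speed is compared to the RTO, never to the RPO."""
    return restore_hours <= obj.rto_hours


# Assumed figures for an imaginary payroll system.
payroll = RecoveryObjectives(rto_hours=4.0, wrt_hours=2.0, rpo_hours=1.0)

payroll_mtd = mtd_hours(payroll)
nightly_backup_ok = backup_interval_meets_rpo(24.0, payroll)
three_hour_restore_ok = restore_time_meets_rto(3.0, payroll)
```

With these assumed numbers, a nightly backup fails the one-hour RPO even though a three-hour restore satisfies the RTO. That is the pattern behind the swap trap: fixing one metric does nothing for the other.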
Replication, Untested Plans, and IT-Driven Priorities
Trap 1 — "Replication is a backup." Synchronous replication keeps a standby copy current, which is excellent for hardware failure and low RTO. But it faithfully copies corruption, accidental deletion, and ransomware to the standby in real time. If a stem asks how to recover from data corruption or ransomware, the answer involves point-in-time backups, not replication. Treating replication as the whole strategy is a classic wrong choice.
Trap 2 — "The plan exists, so we're covered." A thick, approved BCP/DRP provides little assurance if it has never been tested or is out of date. The auditor's concern is whether the plan was exercised and the results acted upon. An untested plan, an unmaintained call tree, and a DRP that predates a major infrastructure change are all valid findings, even when the document looks complete.
Trap 3 — IT sets recovery priorities. Recovery sequencing must flow from the BIA and business impact, not from technical convenience or the system the IT team finds easiest to bring up first. An answer that restores the simplest system first, instead of the most business-critical one, is wrong.
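Trap 1 can be made concrete with a toy model. The sketch below is not a real storage API; `ReplicatedStore` and its methods are hypothetical names invented for illustration. It mirrors every write to a standby, as synchronous replication would, and keeps separate point-in-time snapshots.

```python
import copy


class ReplicatedStore:
    """Toy model: each write to the primary is mirrored to the standby at once."""

    def __init__(self):
        self.primary = {}
        self.standby = {}
        self.snapshots = {}  # label -> independent point-in-time copy

    def write(self, key, value):
        self.primary[key] = value
        self.standby[key] = value  # replication copies good and bad writes alike

    def take_snapshot(self, label):
        # A snapshot is frozen at this moment and is unaffected by later writes.
        self.snapshots[label] = copy.deepcopy(self.primary)

    def restore_snapshot(self, label):
        data = self.snapshots[label]
        self.primary = copy.deepcopy(data)
        self.standby = copy.deepcopy(data)


store = ReplicatedStore()
store.write("invoice-1001", "amount=250.00")
store.take_snapshot("nightly")

# Simulated ransomware: the corrupted value is replicated like any other write.
store.write("invoice-1001", "ENCRYPTED")
standby_after_attack = store.standby["invoice-1001"]

# Only the point-in-time copy still holds the clean value.
store.restore_snapshot("nightly")
restored_value = store.primary["invoice-1001"]
```

After the simulated attack the standby holds the same encrypted value as the primary, so failing over recovers nothing. Restoring the earlier snapshot is what brings the clean record back.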
Operations-Side Traps
The operations half of the domain has its own recurring distractors:
- "Urgent means skip the process." An emergency change still requires authorization and documentation, even if both are completed retroactively. The right answer never endorses an undocumented production change.
- "Restore service" vs. "fix the cause." For an active outage, the immediate answer is incident restoration; the long-term answer for recurrence is problem management. Picking root-cause analysis as the immediate action is a timing trap.
- "Bigger test is always better." A full-interruption test is the most thorough but also the riskiest and is inappropriate as a first test or when production cannot be risked. Choose the test that matches the maturity and risk tolerance, not simply the most aggressive one.
- End-user computing (EUC). Business-built spreadsheets and shadow databases often lack change control, backups, and validation. The trap is treating EUC as low-risk; auditors flag ungoverned EUC that feeds financial or operational decisions.
- RAID is not redundancy against everything. RAID and clustering survive component failures but not a site loss, so they never replace off-site backups.
Subtle Wording Traps
Beyond the big conceptual swaps, Domain 4 distractors exploit precise wording. Train your ear for these:
- "Most appropriate" vs. "first." A question asking what to do first often wants a planning or assessment step (run the BIA, define the objective), while most appropriate wants the best end-state control. The same scenario can have different correct answers depending on this single word.
- "Hot site" as a reflex. Candidates over-pick the hot site because it sounds safest. If the application is non-critical or the stem stresses cost, a warm or cold site is the appropriate answer. The exam rewards matching the strategy to the stated RTO and budget, not maximizing capability.
- "Detective vs. preventive." Reviewing scheduler logs after a failed batch job is detective; an automated dependency check that blocks a job from running out of sequence is preventive. When a stem asks to prevent recurrence, a purely detective control is a trap.
- SDO mistaken for RTO. A stem about the reduced level of service during recovery is testing SDO, not RTO. Any answer naming downtime targets misses the point.
| If the stem emphasizes... | Lean toward... | Avoid the distractor that... |
|---|---|---|
| Cost / non-critical app | Cold or warm site | jumps to a hot site |
| Recover from corruption/ransomware | Point-in-time backup | offers replication only |
| Prevent recurrence | Problem management / preventive control | restores service only |
| Reduced service during recovery | SDO | quotes an RTO |
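The detective versus preventive distinction can be sketched in a few lines. This is a minimal illustration under stated assumptions: `BatchScheduler`, `OutOfSequenceError`, and the job names are hypothetical and do not correspond to any real scheduling product. The dependency check refuses to run a job before its prerequisites complete (preventive), while the audit log only records what happened for later review (detective).

```python
class OutOfSequenceError(Exception):
    """Raised when a job is requested before its prerequisites have completed."""


class BatchScheduler:
    def __init__(self, dependencies):
        # dependencies maps a job name to the set of jobs that must finish first.
        self.dependencies = dependencies
        self.completed = set()
        self.audit_log = []  # detective evidence, reviewed after the fact

    def run(self, job, action):
        missing = self.dependencies.get(job, set()) - self.completed
        if missing:
            # Preventive control: the job is blocked before it can do any harm.
            self.audit_log.append(f"BLOCKED {job}: waiting on {sorted(missing)}")
            raise OutOfSequenceError(f"{job} cannot run before {sorted(missing)}")
        action()
        self.completed.add(job)
        self.audit_log.append(f"COMPLETED {job}")


scheduler = BatchScheduler({"post_ledger": {"load_transactions"}})
ran = []

try:
    scheduler.run("post_ledger", lambda: ran.append("post_ledger"))
    blocked = False
except OutOfSequenceError:
    blocked = True

# Running the jobs in the correct order succeeds.
scheduler.run("load_transactions", lambda: ran.append("load_transactions"))
scheduler.run("post_ledger", lambda: ran.append("post_ledger"))
```

Reviewing `audit_log` afterwards would show the blocked attempt, but only the check inside `run` stops the out-of-sequence job from executing. A stem that asks how to prevent recurrence is asking for the second kind of control.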
The through-line of every trap: a distractor that is true but answers a different question. Slow down enough to confirm the option resolves the specific objective, role, and timing the stem set, and the near-miss answers stop costing you points.
A ransomware attack has encrypted production data that is synchronously replicated to a standby site. Why is replication alone insufficient for recovery here?
During an audit, the IS auditor finds a detailed, board-approved disaster recovery plan that has never been tested. What is the most appropriate conclusion?
Which statement reflects the correct distinction between RTO and RPO?