5.4 Specialized Risk Analyses & Reserve Determination

Key Takeaways

  • FMEA evaluates failure modes and their effects; its Risk Priority Number (RPN = Severity × Occurrence × Detection) ranks what to fix first.
  • Fault Tree Analysis (FTA) works top-down from an undesired top event through logic gates to find combinations of causes that produce it.
  • Contingency reserve is sized from quantitative results — typically the P80 cost minus the deterministic estimate, or the summed EMV of identified risks.
  • Contingency reserve covers known risks and sits inside the cost baseline; management reserve covers unknown risks and sits outside the baseline.
  • Escalate to specialized or quantitative analysis when overall exposure is high, qualitative ranking is insufficient, or safety/regulatory stakes demand defensible numbers.
Last updated: June 2026

Failure Mode and Effects Analysis (FMEA)

Failure Mode and Effects Analysis (FMEA) systematically examines a product or process to identify how each component can fail (failure modes), what the effect of each failure would be, and how to prioritize fixes. It is common in engineering, manufacturing, and safety-critical work.

Each failure mode is scored on three 1–10 scales and combined into a Risk Priority Number (RPN):

RPN = Severity × Occurrence × Detection

A higher RPN means a higher priority. Detection is scored inversely: the harder a failure is to detect, the higher its Detection score, because a failure you cannot catch early is more dangerous and should rise up the list.

FMEA is a bottom-up, inductive technique: you start at the component or process-step level and reason upward to effects. It originated in aerospace and automotive engineering and is now used wherever process or product reliability matters. Two variants appear on the exam — Design FMEA examines product design weaknesses, while Process FMEA examines weaknesses in how the work is performed. Both use the same RPN formula and the same prioritize-then-mitigate logic.

Worked FMEA Example

Two failure modes on a pump assembly:

Failure mode    Severity   Occurrence   Detection   RPN
Seal leak       7          4            3           84
Sensor drift    5          5            8           200

Sensor drift scores 200 despite its lower severity, because it is hard to detect (8) and occurs moderately often (5). The team addresses it first. This is where FMEA earns its value: it surfaces dangerous-but-quiet failures that a severity-only ranking would miss. After mitigation, the team rescores. Adding an alarm improves detection on the sensor from 8 to 2, which drops its RPN from 200 to 50 (5 × 5 × 2). Sensor drift now ranks below the seal leak (84), so the priority list reorders and the team can show its progress quantitatively.
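The pump example can be reproduced in a few lines. This is a minimal sketch, not a standard FMEA tool: the `FailureMode` class and `rank_by_rpn` helper are illustrative names, and the scores are the ones from the worked example above.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1-10, higher = worse effect
    occurrence: int  # 1-10, higher = more frequent
    detection: int   # 1-10, higher = HARDER to detect

    @property
    def rpn(self) -> int:
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


seal_leak = FailureMode("Seal leak", severity=7, occurrence=4, detection=3)
sensor_drift = FailureMode("Sensor drift", severity=5, occurrence=5, detection=8)
before = rank_by_rpn([seal_leak, sensor_drift])

# After adding an alarm, detection on the sensor improves from 8 to 2.
sensor_mitigated = FailureMode("Sensor drift", severity=5, occurrence=5, detection=2)
after = rank_by_rpn([seal_leak, sensor_mitigated])

for label, ranking in (("before", before), ("after", after)):
    print(label, [(m.name, m.rpn) for m in ranking])
```

Rescoring is just a second ranking with the updated Detection score, which is why the list reorders once the alarm is in place.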

Fault Tree Analysis (FTA)

Fault Tree Analysis (FTA) is a top-down, deductive technique. You start with a single undesired top event (e.g., "system outage") and work downward through logic gates to map the combinations of lower-level faults that could cause it.

  • AND gate: all inputs must occur for the output to occur (models redundancy; lowers probability)
  • OR gate: any single input is enough to cause the output (raises probability)

FTA is ideal for safety and reliability analysis where you need to understand how a catastrophic event arises and which single points of failure to engineer out. Contrast it with FMEA's bottom-up, component-by-component approach.
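To see why AND gates lower probability and OR gates raise it, the sketch below evaluates a small hypothetical tree. It assumes the input events are statistically independent, which the text does not state, and the event names and probabilities are invented for illustration.

```python
from math import prod


def and_gate(probabilities):
    """Output occurs only if every input occurs (independent inputs assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output occurs if at least one input occurs (independent inputs assumed)."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical top event "system outage":
#   outage     = power lost OR software crash
#   power lost = primary feed fails AND backup generator fails
p_primary, p_backup, p_crash = 0.05, 0.10, 0.02

p_power_lost = and_gate([p_primary, p_backup])
p_outage = or_gate([p_power_lost, p_crash])

print(f"P(power lost) = {p_power_lost:.4f}")
print(f"P(outage)     = {p_outage:.4f}")
```

With these made-up numbers, the redundant power supply contributes far less to the outage probability than the single software fault sitting directly under the OR gate.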

The direction is the key exam discriminator: FTA is top-down (start with the bad outcome, find the causes), while FMEA is bottom-up (start with each component failure, trace the effects). FTA finds the minimal cut sets — the smallest combinations of faults sufficient to trigger the top event — which is exactly what reliability engineers need to design out single points of failure. If a question describes starting from one undesired system-level event and decomposing causes through AND/OR gates, the answer is fault tree analysis.
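Minimal cut sets can be derived mechanically from the gate structure. The following is a simplified sketch for small trees, assuming a hypothetical representation in which a basic event is a string and a gate is a `(gate_type, children)` tuple; dedicated FTA tools use more efficient algorithms.

```python
from itertools import product


def cut_sets(node):
    """Return all cut sets of a fault tree node as a set of frozensets."""
    if isinstance(node, str):  # basic event
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any child's cut set is enough to trigger the output.
        return set().union(*child_sets)
    if gate == "AND":
        # One cut set from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Drop any cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree for the top event "system outage".
tree = ("OR", [
    ("AND", ["primary feed fails", "backup generator fails"]),
    "software crash",
    ("AND", ["software crash", "operator error"]),
])

mcs = minimal_cut_sets(tree)
single_points = sorted(next(iter(s)) for s in mcs if len(s) == 1)
print(mcs)
print("single points of failure:", single_points)
```

Any minimal cut set of size one is a single point of failure. In this invented tree, "software crash" alone triggers the top event, so the pair containing it is redundant and is dropped.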

Determining Contingency Reserve

Quantitative results feed directly into contingency reserve sizing. Two standard approaches:

  • From simulation: reserve = (chosen confidence value, e.g., P80 cost) − (deterministic estimate). The buffer closes the gap to the desired confidence level.
  • From EMV: sum the EMV of identified risks (threats minus opportunities) to get the expected reserve.

The result is a risk-adjusted estimate — the base estimate plus a reserve justified by analysis, not a guess. This is far more defensible to a sponsor than a flat 10% padding.
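Both sizing approaches reduce to short calculations. The sketch below is illustrative only: the work packages, their triangular cost ranges, the risk list, and the fixed random seed are all assumptions, and it treats threats as positive impacts and opportunities as negative ones.

```python
import random
import statistics


def p_value(samples, percentile):
    """Return the given percentile (1-99) of simulated outcomes."""
    return statistics.quantiles(samples, n=100)[percentile - 1]


def reserve_from_simulation(samples, deterministic_estimate, percentile=80):
    """Reserve = chosen confidence value minus the deterministic estimate."""
    return p_value(samples, percentile) - deterministic_estimate


def reserve_from_emv(risks):
    """Sum probability x impact; threats positive, opportunities negative."""
    return sum(p * impact for p, impact in risks)


# Hypothetical work packages: (low, most likely, high) cost in $K.
work_packages = [(90, 100, 140), (180, 200, 260), (45, 50, 80)]
deterministic = sum(most_likely for _, most_likely, _ in work_packages)

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [
    sum(rng.triangular(low, high, mode) for low, mode, high in work_packages)
    for _ in range(10_000)
]
sim_reserve = reserve_from_simulation(samples, deterministic)

# Hypothetical risks: (probability, impact in $K).
risks = [(0.30, 40), (0.10, 120), (0.20, -25)]
emv_reserve = reserve_from_emv(risks)

print(f"deterministic estimate: {deterministic}K")
print(f"P80-based reserve:      {sim_reserve:.1f}K")
print(f"EMV-based reserve:      {emv_reserve:.1f}K")
```

The EMV figure here is 12 + 12 − 5 = 19 ($K). The two methods answer different questions, so they will not generally agree: the simulation buys a confidence level, while EMV gives an expected value.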

Reserve sizing is not a one-time event. As risks close, occur, or change, reserve analysis during monitoring compares the reserve still held against the exposure still open, releasing surplus reserve back to the organization or flagging a shortfall. A reserve that is never revisited drifts out of alignment with the real risk picture, so the PMI-RMP exam treats reserve as a living quantity tied to the current state of the risk register, not a fixed figure set once at planning.
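A reserve check during monitoring can be sketched as a comparison between the reserve still held and the exposure still open. This example assumes remaining exposure is measured as the summed EMV of open risks, which is one common choice rather than the only one; the function name and figures are hypothetical.

```python
def reserve_status(reserve_remaining, open_risks):
    """Compare the reserve still held with the exposure still open.

    open_risks: (probability, impact) pairs for risks that have neither
    closed nor occurred. Exposure is taken as their summed EMV.
    """
    exposure = sum(p * impact for p, impact in open_risks)
    gap = reserve_remaining - exposure
    if gap > 0:
        return "surplus", gap    # candidate for release to the organization
    if gap < 0:
        return "shortfall", -gap  # flag for escalation
    return "balanced", 0


# Hypothetical open risks in $K: exposure = 0.5*20 + 0.25*40 = 20
open_risks = [(0.5, 20), (0.25, 40)]

print(reserve_status(30, open_risks))
print(reserve_status(15, open_risks))
```

Rerunning the check whenever the risk register changes is what keeps the reserve aligned with the current risk picture.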

Reserves, Integration, and When to Escalate

Contingency vs management reserve is heavily tested:

Aspect      Contingency reserve            Management reserve
Covers      Known risks (known-unknowns)   Unknown risks (unknown-unknowns)
Location    Inside the cost baseline       Outside the baseline, in the budget
Authority   PM can spend it                Requires management approval
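The "inside the baseline" versus "outside the baseline" distinction is easiest to see as layered arithmetic. The figures below are hypothetical and the variable names are illustrative.

```python
# Hypothetical figures in dollars.
cost_estimate = 1_000_000       # deterministic estimate of the work
contingency_reserve = 150_000   # known risks; the PM can spend it
management_reserve = 50_000     # unknown risks; needs management approval

# Contingency sits INSIDE the cost baseline.
cost_baseline = cost_estimate + contingency_reserve

# Management reserve sits OUTSIDE the baseline but inside the budget.
project_budget = cost_baseline + management_reserve

print(f"cost baseline:  {cost_baseline:,}")
print(f"project budget: {project_budget:,}")
```

Performance is measured against the cost baseline, so spending contingency does not by itself move the baseline, while drawing on management reserve requires approval and a baseline change.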

Integrated cost-schedule risk analysis models both together, because schedule slips usually drive cost overruns — analyzing them separately understates exposure.

Escalate to specialized or quantitative analysis when: overall exposure is high relative to the budget; qualitative ranking cannot separate the top risks; the sponsor demands confidence figures; or safety/regulatory stakes (where FMEA or FTA apply) require defensible, auditable numbers.

Match the tool to the question. Use EMV or decision trees for discrete choices and single-risk valuation; Monte Carlo for overall cost and schedule ranges; FMEA for ranking process or product failure modes; and FTA for tracing one catastrophic event to its causes. Choosing the wrong technique — for example reaching for Monte Carlo when the question is really a build-vs-buy decision best served by a decision tree — is a classic distractor pattern on the exam, so read the stem for whether it asks about overall exposure, a discrete decision, component failures, or a single top event.

Test Your Knowledge

In FMEA, a failure mode has Severity = 6, Occurrence = 3, and Detection = 9. What is its Risk Priority Number (RPN)?

Test Your Knowledge

A Monte Carlo analysis gives a P80 cost of $4.6M; the deterministic estimate is $4.0M. The team adds $600K as a buffer the project manager can spend on identified risks. What is this $600K?
