Scenario Coverage and Threshold Setting
Key Takeaways
- A monitoring scenario is a rule designed to detect a specific typology; coverage gaps mean a typology has no scenario watching for it.
- Thresholds trade off false positives against false negatives: too tight floods analysts, too loose misses suspicious activity.
- Above-the-line and below-the-line testing validates that thresholds catch true positives without ignoring just-below-threshold risk.
- Tuning and scenario coverage must be documented, periodically reviewed, and approved through model governance, not changed informally.
Scenarios, Coverage, and Typologies
A transaction-monitoring scenario (also called a rule or model) is logic designed to detect one specific money-laundering or sanctions typology, such as structuring (cash deposits just under the USD 10,000 Currency Transaction Report threshold), rapid movement of funds (deposits immediately wired out), or round-dollar wires to high-risk jurisdictions. Scenario coverage is the question of whether the institution's set of scenarios actually addresses the typologies present in its risk assessment.
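To make "scenario logic" concrete, the structuring typology described above could be expressed roughly as follows. This is a minimal sketch, not any vendor's rule: the function name `structuring_alert`, the 24-hour window, the USD 9,000 aggregate floor, and the requirement of at least two deposits are all illustrative assumptions.

```python
from datetime import datetime, timedelta

CTR_LIMIT = 10_000  # USD reporting threshold named in the text


def structuring_alert(deposits, window=timedelta(hours=24), aggregate_floor=9_000):
    """Return True if two or more sub-CTR cash deposits inside a rolling
    window sum to at least aggregate_floor.

    deposits: list of (timestamp, amount) tuples for a single customer.
    The window and floor are hypothetical parameters, not regulatory values.
    """
    below = sorted((ts, amt) for ts, amt in deposits if amt < CTR_LIMIT)
    start = 0
    running = 0
    for end, (ts, amt) in enumerate(below):
        running += amt
        # Drop deposits that have fallen out of the rolling window.
        while ts - below[start][0] > window:
            running -= below[start][1]
            start += 1
        if end - start >= 1 and running >= aggregate_floor:
            return True
    return False


same_day = [(datetime(2024, 1, 2, 9), 4_800), (datetime(2024, 1, 2, 15), 4_900)]
spread_out = [(datetime(2024, 1, 2, 9), 4_800), (datetime(2024, 1, 5, 9), 4_900)]
print(structuring_alert(same_day), structuring_alert(spread_out))
```

The window and the floor are exactly the kind of parameters that threshold setting and tuning act on.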
A coverage gap exists when a recognized risk has no scenario watching for it. For example, if a bank serves money-services businesses but has no scenario for funnel-account activity, that typology can pass undetected regardless of how well other scenarios are tuned. The exam frequently rewards identifying the missing scenario rather than re-tuning an existing one.
| Scenario | Typology detected | Typical parameter |
|---|---|---|
| Structuring | Cash broken below CTR limit | Multiple cash deposits summing near USD 10,000 in a short window |
| Rapid in/out | Layering / pass-through | Funds out within N days of deposit |
| High-risk geography | Placement/integration via risky jurisdictions | Wires to/from FATF-listed countries |
| Dormant-then-active | Account takeover / mule | Long inactivity then sudden high volume |
Mapping each material risk in the enterprise-wide risk assessment to at least one scenario is the core coverage discipline.
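The mapping exercise can be sketched as a simple set comparison. The risk names, scenario IDs, and the `coverage_gaps` helper below are hypothetical and exist only to illustrate the idea; a real inventory would come from the institution's own risk assessment and scenario documentation.

```python
# Hypothetical risk-assessment entries (illustrative only).
RISK_ASSESSMENT = [
    "structuring",
    "rapid_in_out",
    "high_risk_geography",
    "funnel_account",
]

# Hypothetical scenario inventory: scenario ID -> typologies it detects.
SCENARIOS = {
    "SCN-001": {"structuring"},
    "SCN-002": {"rapid_in_out"},
    "SCN-003": {"high_risk_geography"},
}


def coverage_gaps(risks, scenarios):
    """Return the risks that no scenario claims to detect, in input order."""
    covered = set().union(*scenarios.values()) if scenarios else set()
    return [risk for risk in risks if risk not in covered]


print(coverage_gaps(RISK_ASSESSMENT, SCENARIOS))
```

Here the funnel-account risk surfaces as a gap: no amount of tuning on the three existing scenarios would close it.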
Coverage analysis also accounts for segmentation. The same threshold cannot fit every customer: a USD 50,000 wire is unremarkable for a corporate treasury client but extraordinary for a student checking account. Effective monitoring segments customers by type, product, and expected activity, then applies scenario thresholds calibrated to each segment's baseline. A frequent design failure the exam tests is applying a single bank-wide threshold, which buries genuinely anomalous retail activity under the noise of legitimate commercial volume while flagging routine business transactions as suspicious.
Segmentation lets the institution detect deviation from expected behavior, which is more powerful than a flat numeric cutoff. New products, new geographies, and new customer types should each trigger a coverage review to confirm a scenario exists for the risks they introduce.
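One way to picture segment-calibrated thresholds is to derive each segment's cutoff from its own history. The sketch below assumes a "mean plus k standard deviations" baseline, which is only one of many possible calibration methods and is not prescribed by the text; the segment names, amounts, and the value of `k` are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical per-segment wire histories in USD (illustrative values only).
SEGMENT_HISTORY = {
    "corporate_treasury": [40_000, 55_000, 60_000, 45_000, 50_000],
    "student_checking": [200, 350, 150, 400, 300],
}


def segment_threshold(history, k=3.0):
    """Assumed baseline: segment mean plus k population standard deviations."""
    return mean(history) + k * pstdev(history)


def is_anomalous(segment, amount, k=3.0):
    """True if the amount exceeds the threshold calibrated for that segment."""
    return amount > segment_threshold(SEGMENT_HISTORY[segment], k)


for segment in SEGMENT_HISTORY:
    print(segment, is_anomalous(segment, 50_000))
```

The same USD 50,000 wire passes quietly in the corporate segment and alerts in the student segment, which is the behavior a single bank-wide threshold cannot produce.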
Thresholds, the Error Tradeoff, and Tuning
A threshold is the parameter value that triggers an alert (e.g., "flag aggregate cash above USD 9,000 in 24 hours"). Threshold setting is a direct tradeoff:
- A tight (low) threshold catches more activity but generates many false positives (alerts that are not suspicious), overwhelming analysts and raising cost.
- A loose (high) threshold reduces noise but increases false negatives (genuinely suspicious activity that never alerts), which is the more dangerous error for AML compliance.
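The tradeoff can be seen by sweeping a threshold across labeled historical activity. The data below is invented, and the simple "alert if amount is at or above the threshold" rule is an assumption made for illustration.

```python
# Hypothetical labeled history: (aggregate cash in USD, truly suspicious?).
HISTORY = [
    (9_800, True), (9_500, True), (9_200, False), (8_900, True),
    (8_500, False), (7_000, False), (6_500, False), (5_000, False),
]


def error_counts(history, threshold):
    """Count (false positives, false negatives) for an
    'alert if amount >= threshold' rule."""
    false_pos = sum(1 for amt, sus in history if amt >= threshold and not sus)
    false_neg = sum(1 for amt, sus in history if amt < threshold and sus)
    return false_pos, false_neg


for t in (5_000, 8_000, 9_000, 9_600):
    fp, fn = error_counts(HISTORY, t)
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
```

As the threshold rises, false positives fall while false negatives climb; in this toy data the missed cases appear once the threshold passes USD 8,900.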
Tuning adjusts thresholds and logic to optimize this balance. Two validation techniques are tested:
- Above-the-line (ATL) testing samples alerts that did fire to confirm they are productive (true positives) and the threshold is not too tight.
- Below-the-line (BTL) testing samples activity just below the threshold to confirm the institution is not missing suspicious activity by setting the bar too high.
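The two sampling populations can be sketched as follows. The 10% "just below" band, the fixed random seed, and the helper names are assumptions chosen for illustration; how wide the below-the-line band should be is itself a documented tuning decision.

```python
import random


def split_for_testing(activity, threshold, band=0.10):
    """Split activity into above-the-line and just-below-the-line populations.

    band is the fraction under the threshold treated as 'just below'
    (a hypothetical choice, not a standard value).
    """
    floor = threshold * (1 - band)
    atl = [a for a in activity if a["amount"] >= threshold]
    btl = [a for a in activity if floor <= a["amount"] < threshold]
    return atl, btl


def draw_sample(population, size, seed=0):
    """Draw a reproducible random sample for analyst review."""
    rng = random.Random(seed)
    return rng.sample(population, min(size, len(population)))


amounts = [9_500, 9_100, 8_950, 8_200, 7_000, 4_000]
activity = [{"id": i, "amount": amt} for i, amt in enumerate(amounts)]
atl, btl = split_for_testing(activity, threshold=9_000)
print([a["id"] for a in atl], [a["id"] for a in btl])
```

Analysts would then review the sampled items from each population: productive ATL alerts support the current threshold, while suspicious items found in the BTL sample argue against raising it and may argue for lowering it.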
Worked scenario
Analysts complain that a structuring scenario produces 95% false positives. A tempting but wrong move is to raise the threshold quietly to cut volume. The correct, defensible approach is to run BTL testing first to confirm that raising the threshold would not hide real structuring, document the analysis, obtain model-governance approval, and only then change the parameter and retain the rationale. Informal, undocumented tuning is an exam red flag.
It is worth understanding why below-the-line testing carries so much weight with examiners. Above-the-line testing only confirms the alerts you already generate are useful; it cannot reveal what you are missing. Below-the-line testing deliberately looks at the activity sitting just under the trigger, the exact place where a launderer who knows the threshold will operate (structuring is, after all, the act of staying just below a reporting limit).
If a bank raises a threshold without BTL testing and a regulator later finds suspicious activity clustering immediately below the new cutoff, the institution has both a detection failure and a governance failure. The defensible record shows the threshold was set by analysis, tested in both directions, approved, and periodically revisited as the customer base and typologies change.
Exam reminders:
- A high false-positive rate is a tuning problem, not a license to suppress alerts.
- False negatives are the costlier error from a compliance and regulatory-risk standpoint.
- Below-the-line testing is the key safeguard against setting thresholds too high.
- Every scenario and threshold change must be documented, tested (ATL/BTL), and governed.
- Coverage is mapped back to the enterprise-wide risk assessment, with thresholds calibrated by customer segment.
There is one more concept the exam expects you to connect: scenario coverage is a living obligation, not a one-time setup. Money launderers adapt, products change, and customer behavior shifts, so a scenario set that was adequate two years ago may now have gaps. A mature program performs periodic coverage assessments that re-map current risks to current scenarios and identify typologies with no rule watching for them. It also tracks scenario performance over time, retiring rules that fire constantly without ever producing a report (pure noise) and strengthening rules tied to confirmed cases.
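Performance tracking of this kind can be sketched with a simple alert-to-report conversion measure. The scenario IDs, counts, and the `min_alerts` cutoff are hypothetical, and the output is a list of candidates for governance review, not an automatic retirement decision.

```python
# Hypothetical per-scenario counts over one review period (illustrative only).
PERFORMANCE = {
    "SCN-001": {"alerts": 1_200, "reports": 60},
    "SCN-002": {"alerts": 800, "reports": 0},
    "SCN-003": {"alerts": 50, "reports": 10},
}


def conversion_rate(stats):
    """Share of alerts that led to a filed report; 0.0 when nothing fired."""
    return stats["reports"] / stats["alerts"] if stats["alerts"] else 0.0


def review_candidates(performance, min_alerts=100):
    """Scenarios that fire often but never produce a report (pure noise)."""
    return sorted(
        name
        for name, stats in performance.items()
        if stats["alerts"] >= min_alerts and stats["reports"] == 0
    )


print(review_candidates(PERFORMANCE))
```

A scenario flagged this way still goes through the documented tuning and approval process described earlier before it is changed or retired.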
The exam-correct mindset treats the monitoring system the way it treats the risk assessment: something that must be reviewed, tested, tuned, and re-approved on a defined cycle and whenever the business changes materially, with every adjustment documented for the examiner who will eventually ask why the parameters are what they are.
Practice questions:
- Analysts report a structuring scenario yields 95% false positives. What is the most defensible response?
- Which monitoring error is generally the costliest from an AML compliance and regulatory-risk perspective?