Single-Case Design Logic: Prediction, Verification, Replication
Key Takeaways
- Prediction is the forecast of what behavior would do if the current condition continued unchanged; a stable, steady-state baseline makes the forecast credible.
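- Verification confirms that the predicted baseline pattern would have continued without the intervention. It is shown by behavior reverting when treatment is withdrawn (reversal), by untreated tiers staying flat (multiple baseline), or by behavior tracking each criterion step (changing criterion).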
- Replication repeats the effect across phases, tiers, behaviors, settings, or participants, ruling out coincidence as the cause.
- Steady-state responding (stability) before a phase change is what justifies the prediction; baselines vary in length but should support a defensible forecast.
- Visual analysis (level, trend, variability, immediacy, overlap, consistency) is how prediction, verification, and replication are read off the graph.
The Three-Part Engine of Experimental Control
Single-case designs measure behavior repeatedly within and across conditions. The logic that converts those data into a causal claim runs in three steps, repeated as a unit:
- Prediction. Based on a stable baseline, the analyst forecasts what behavior would do if the current condition simply continued. A credible forecast requires steady-state responding — a baseline stable enough in level and trend to project forward.
- Verification. The analyst tests whether the predicted pattern would have held by showing that, absent the independent variable (IV), behavior stays on its baseline path. Each design verifies differently.
- Replication. The analyst reproduces the effect — across phases, tiers, behaviors, settings, or participants — so that coincidence becomes an implausible explanation.
When all three are present and the data line up, experimental control is established: the IV is the most plausible cause, and rival explanations become implausible.
A helpful image is the 'three-part' demonstration (not to be confused with the three-term contingency): every time the IV is turned on or reaches a new tier, you get a fresh chance to predict (forecast the untreated path), verify (confirm the path would have continued without the IV), and replicate (reproduce the effect). Designs differ only in how they arrange those three opportunities (a reversal stacks them in time, a multiple baseline stacks them across tiers), but the underlying logic is identical.
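The three steps can be sketched numerically for a reversal (ABAB) data set. This is an illustration under stated assumptions, not a standard criterion: the function name `abab_logic`, the use of phase means, and the `min_change` and `tolerance` thresholds are all hypothetical, and a check like this supplements visual analysis rather than replacing it.

```python
from statistics import mean


def abab_logic(a1, b1, a2, b2, min_change=0.3, tolerance=0.2):
    """Rough numeric stand-ins for prediction, verification, and replication.

    Each argument is a list of session values for one phase. Thresholds are
    proportions of the first-baseline level and are illustrative assumptions.
    """
    predicted = mean(a1)                    # prediction: A1 level projected forward
    scale = abs(predicted) if predicted else 1.0

    first_change = mean(b1) - predicted     # departure from the predicted path
    second_change = mean(b2) - mean(a2)     # departure from the second baseline

    effect = abs(first_change) / scale >= min_change
    # Verification: with the IV withdrawn, behavior returns near the predicted level.
    verification = abs(mean(a2) - predicted) / scale <= tolerance
    # Replication: the second B phase reproduces a change in the same direction.
    replication = (
        first_change * second_change > 0
        and abs(second_change) / scale >= min_change
    )
    return {"effect": effect, "verification": verification, "replication": replication}


clean = abab_logic([10, 11, 10, 9], [4, 3, 3, 2], [9, 10, 10, 11], [3, 2, 2, 2])
no_reversal = abab_logic([10, 11, 10, 9], [4, 3, 3, 2], [4, 3, 4, 3], [3, 2, 2, 2])
```

In the second call the behavior does not return to baseline when treatment is withdrawn, so both verification and replication fail even though the first phase change looked like an effect.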
How Each Element Shows Up in the Graph
| Logic step | Question to ask | Graph cue |
|---|---|---|
| Prediction | What would happen if nothing changed? | Stable, interpretable baseline (steady state) |
| Verification | Would the old pattern have continued without the IV? | Return-to-baseline reversal, or untreated tiers staying flat |
| Replication | Did the effect happen again? | Repeated, datable change across phases or tiers |
| Experimental control | Are alternatives less plausible? | Consistent behavior change tied to each IV onset |
Prediction requires enough data to reveal a pattern. A baseline has no fixed required number of points, but it must be long enough to support a reasonable forecast, unless ethics or safety demand quicker action. An ascending baseline for severe self-injury may justify rapid treatment, yet so short a baseline weakens a simple treatment-effect claim because the forecast rests on few data points.
Verification differs by design. In a reversal (ABAB), returning to baseline tests whether behavior reverts — confirming the original prediction. In a multiple baseline, untreated tiers verify by remaining unchanged until intervention reaches them. In a changing-criterion design, behavior should track each new criterion step, verifying control at each level.
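For the multiple-baseline case, the "untreated tiers stay flat" check can be sketched as follows. The function name `untreated_tiers_hold`, the dictionary layout, and the `tolerance` threshold are assumptions made for illustration. The sketch also assumes every tier is measured on the same session axis and has at least one baseline session before the first onset.

```python
from statistics import mean


def untreated_tiers_hold(tiers, starts, tolerance=0.2):
    """Check whether still-untreated tiers stay near baseline at each IV onset.

    tiers:  dict mapping tier name -> list of session values
    starts: dict mapping tier name -> session index at which the IV begins
    Returns a dict keyed by (treated_tier, untreated_tier) -> bool.
    The tolerance (proportion of baseline level) is an illustrative assumption.
    """
    results = {}
    for treated, onset in starts.items():
        for other, other_start in starts.items():
            if other_start <= onset:
                continue  # this tier is not still in baseline at the onset
            before = tiers[other][:onset]            # baseline before the onset
            after = tiers[other][onset:other_start]  # baseline after the onset
            base = mean(before)
            scale = abs(base) if base else 1.0
            results[(treated, other)] = abs(mean(after) - base) / scale <= tolerance
    return results


starts = {"greeting": 3, "sharing": 6}
flat = untreated_tiers_hold(
    {"greeting": [2, 2, 3, 8, 9, 9, 9, 9, 9], "sharing": [1, 2, 1, 1, 2, 1, 7, 8, 8]},
    starts,
)
drifted = untreated_tiers_hold(
    {"greeting": [2, 2, 3, 8, 9, 9, 9, 9, 9], "sharing": [1, 2, 1, 5, 6, 6, 7, 8, 8]},
    starts,
)
```

In the second data set the untreated tier changes as soon as the first tier is treated, so verification fails for that pair and the later change in that tier cannot be credited to the IV with confidence.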
Steady State and Baseline Logic
Baseline logic is the foundation of prediction. The purpose of baseline is descriptive (a benchmark) and predictive (a forecast of the future). The cleaner the baseline — low variability, flat or interpretable trend — the more confidently you can predict and later detect a treatment effect. This is why analysts wait for stability (steady-state responding) before changing conditions when it is safe to do so. A baseline that is already improving makes any later 'effect' ambiguous, because behavior was heading there anyway.
Three baseline patterns recur on the exam, and each carries a different verdict:
- Stable (flat) baseline — ideal; supports a clean prediction and makes a treatment effect easy to detect.
- Trending baseline — if it trends toward the goal, a later 'improvement' is confounded; if it trends away from the goal, a treatment that reverses the trend is especially convincing.
- Highly variable baseline — steady state is not established; the analyst should identify and control the source of bounce before changing phases, or the prediction is weak.
A baseline need not hit a fixed number of points, but it should be long enough to forecast credibly — except when ethics or safety justify acting sooner, as with escalating self-injury.
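The baseline patterns above can be approximated with a simple screening function. This is a sketch only: the name `classify_baseline`, the least-squares trend line, and every threshold (`min_points`, `max_rel_spread`, `max_rel_slope`) are hypothetical choices, since the text sets no numeric stability criterion.

```python
from statistics import mean


def fit_line(values):
    """Least-squares slope and intercept against session index 0..n-1."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    den = sum((i - x_bar) ** 2 for i in range(n))
    slope = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values)) / den
    return slope, y_bar - slope * x_bar


def classify_baseline(values, goal="decrease", min_points=3,
                      max_rel_spread=0.25, max_rel_slope=0.05):
    """Label a baseline; all thresholds are illustrative assumptions."""
    if len(values) < max(min_points, 2):
        return "too short"
    level = mean(values)
    scale = abs(level) if level else 1.0
    slope, intercept = fit_line(values)
    # Variability is measured as bounce around the trend line, not around the mean.
    residuals = [y - (intercept + slope * i) for i, y in enumerate(values)]
    spread = (sum(r * r for r in residuals) / len(values)) ** 0.5
    if spread / scale > max_rel_spread:
        return "variable"
    if abs(slope) / scale > max_rel_slope:
        toward = (slope < 0) == (goal == "decrease")
        return "trending toward goal" if toward else "trending away from goal"
    return "stable"


labels = [
    classify_baseline([5, 5, 6, 5, 5]),
    classify_baseline([2, 4, 6, 8, 10], goal="decrease"),
    classify_baseline([2, 4, 6, 8, 10], goal="increase"),
    classify_baseline([2, 9, 3, 10, 2, 9]),
    classify_baseline([4, 4]),
]
```

The `goal` argument matters because the same rising baseline is a confound when the goal is an increase but a strong backdrop when the goal is a decrease.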
Why Replication Wins, and the Six Features That Reveal It
Replication is the core protection against coincidence. A single behavior change after treatment might be explained by history or maturation. But repeated changes that occur only when, and each time, the IV is applied make those rival explanations far-fetched. The more times the effect replicates at planned moments, the stronger the control — three datable changes are far harder to dismiss than one.
Visual analysis reads six features together to evaluate the three logic steps: level (typical value), trend (direction and slope), variability (bounce within a phase), immediacy of effect (speed of change at the phase line), overlap (proportion of points sharing a range across adjacent phases; less overlap is stronger), and consistency (do similar conditions look similar?). No single feature decides the case.
These features map directly onto prediction, verification, and replication. A stable baseline with low variability makes the prediction trustworthy. An immediate, low-overlap change at the phase line, plus a return-to-baseline or untreated tier holding flat, supplies verification. Consistent effects repeated across phases or tiers supply replication. A demonstration with high overlap, delayed change, and an unstable baseline is hard to defend; one with low overlap, immediate change, and stable baselines that repeats across tiers makes a compelling claim of a functional relation.
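Several of these features can be given rough numeric counterparts for two adjacent phases. The sketch below is an assumption-laden aid, not a substitute for inspecting the graph: the function name `compare_phases`, the choice of mean for level, population standard deviation for variability, a three-point window for immediacy, and "treatment points inside the baseline range" for overlap are all illustrative choices. Consistency is omitted because it requires comparing more than two phases.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope against session index 0..n-1 (needs 2+ points)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    den = sum((i - x_bar) ** 2 for i in range(n))
    return sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values)) / den


def compare_phases(baseline, treatment, k=3):
    """Summarize level, trend, variability, immediacy, and overlap."""
    lo, hi = min(baseline), max(baseline)
    inside = sum(1 for y in treatment if lo <= y <= hi)
    return {
        "level_change": mean(treatment) - mean(baseline),
        "baseline_trend": slope(baseline),
        "treatment_trend": slope(treatment),
        "baseline_variability": pstdev(baseline),
        "treatment_variability": pstdev(treatment),
        # Immediacy: first k treatment points versus last k baseline points.
        "immediacy": mean(treatment[:k]) - mean(baseline[-k:]),
        # Overlap: share of treatment points within the baseline range.
        "overlap": inside / len(treatment),
    }


summary = compare_phases([10, 11, 10, 9, 10], [4, 3, 3, 2, 3])
```

For this example the level drops by 7, the change appears in the first treatment sessions, and no treatment point falls inside the baseline range, which is the low-overlap, immediate pattern described above.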
In a multiple-baseline-across-behaviors design, how is VERIFICATION primarily demonstrated?
An analyst collects only two baseline data points, both stable, before introducing treatment for a mild, non-dangerous behavior. Why is this a weakness in the design logic?
A baseline for task completion is already trending upward before any intervention. The analyst introduces a prompting package and completion continues to rise. What is the main interpretive problem?