Reversal, Multiple-Baseline, Multielement, and Changing-Criterion Designs
Key Takeaways
- Reversal (ABAB) designs show control by introducing and withdrawing the IV; powerful only when behavior is reversible and withdrawal is ethical.
- Multiple-baseline designs show control through staggered intervention across behaviors, settings, or participants — ideal when reversal is unethical or behavior is non-reversible (skill acquisition).
- Multielement (alternating-treatments) designs rapidly alternate conditions to compare them; they require that the participant clearly discriminate the conditions, and they are vulnerable to carryover (multiple-treatment interference).
- Changing-criterion designs show control when behavior tracks stepwise criterion changes; best for gradual, progressive goals.
- Design choice hinges on reversibility, ethics, behavior type, treatment interaction, baseline stability, and feasibility — not on memorizing one 'best' design.
Reversal and Multiple-Baseline Designs
A reversal design (ABAB) compares behavior when the IV is absent (A phases) and present (B phases). It demonstrates control because behavior should change with treatment, return toward baseline when treatment is withdrawn, and change again when treatment is reinstated — verification and replication in one structure. It is powerful when behavior is reversible and withdrawal is ethical.
The reversal is weak or inappropriate when withdrawing an effective treatment would be unsafe (e.g., severe self-injury), or when the behavior cannot return to baseline because learning produced a durable change (most skill acquisition). The ABAB deliberately ends on a treatment phase, which is more ethical (the client leaves with the effective condition in place) and also supplies the replication of the effect.
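The ABAB logic can be made concrete with a small sketch. The Python below is illustrative only: the helper name `abab_pattern`, the made-up session counts, and the rule of comparing phase means are all assumptions added here. In practice, control is judged by visual analysis of level, trend, and variability, not by means alone.

```python
from statistics import mean

def abab_pattern(a1, b1, a2, b2):
    """Compare phase means for an A-B-A-B sequence (hypothetical helper)."""
    m_a1, m_b1, m_a2, m_b2 = mean(a1), mean(b1), mean(a2), mean(b2)
    # Direction of the first treatment effect: +1 for an increase, -1 for a decrease.
    direction = 1 if m_b1 > m_a1 else -1
    return {
        "effect": direction * (m_b1 - m_a1) > 0,        # B1 differs from A1
        "verification": direction * (m_b1 - m_a2) > 0,  # A2 moves back toward baseline
        "replication": direction * (m_b2 - m_a2) > 0,   # B2 repeats the change
    }

# Made-up counts of problem behavior per session.
result = abab_pattern([8, 9, 10, 9], [3, 2, 2, 1], [7, 8, 8, 9], [2, 1, 1, 0])
print(result)

# A newly learned skill that does not reverse: A2 stays at the treatment level.
skill = abab_pattern([0, 1, 0], [8, 9, 9], [9, 9, 9], [9, 10, 9])
print(skill["verification"])
```

In the second call the withdrawal phase never returns toward baseline, so the verification flag fails. That is the situation in which a reversal design cannot demonstrate control.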
A multiple-baseline design staggers the introduction of the IV across behaviors, settings, or participants. Control is shown when each tier changes only after intervention reaches it, while the still-untreated tiers stay near their predicted baselines. It is the go-to design when reversal is unethical or impossible, including for irreversible skills. Its key requirement: the tiers must be functionally independent enough that treating one does not change the others. If an untreated tier changes when another tier is treated, the demonstration of control is compromised.
The trade-off is that a multiple baseline never withdraws treatment within a tier, so control must be inferred across tiers: a single tier on its own cannot rule out that behavior would have changed without the intervention. This is why having at least three tiers strengthens the demonstration. With three staggered changes, a coincidental external event would have to strike three times, each at a planned moment, which is highly implausible.
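A minimal sketch of the staggered-tier logic follows. It assumes a hypothetical helper `multiple_baseline_check`, invented data, and an arbitrary `min_change` threshold on phase means; none of these come from the source, and a real analysis would rely on visual inspection of each tier.

```python
from statistics import mean

def multiple_baseline_check(tiers, min_change):
    """tiers maps a tier name to (start_index, data). Hypothetical helper.

    A tier passes when (a) its mean shifts by at least min_change once its own
    intervention starts and (b) it stayed near baseline while earlier tiers
    were already being treated.
    """
    starts = sorted(start for start, _ in tiers.values())
    report = {}
    for name, (start, data) in tiers.items():
        baseline, treatment = data[:start], data[start:]
        changed_after = abs(mean(treatment) - mean(baseline)) >= min_change
        stable_while_waiting = True
        earlier = [s for s in starts if s < start]
        if earlier:
            first = earlier[0]  # session at which the first tier was treated
            before, waiting = data[:first], data[first:start]
            stable_while_waiting = abs(mean(waiting) - mean(before)) < min_change
        report[name] = changed_after and stable_while_waiting
    return report

# Made-up responses per session; intervention starts at sessions 3, 5, and 7.
independent = {
    "help":      (3, [0, 1, 0, 5, 6, 6, 7, 6, 7, 7]),
    "break":     (5, [1, 0, 1, 1, 0, 5, 6, 6, 7, 6]),
    "attention": (7, [0, 0, 1, 0, 1, 1, 0, 6, 6, 7]),
}
print(multiple_baseline_check(independent, min_change=3))

# Same layout, but "attention" rises as soon as the first tier is treated.
covarying = dict(independent, attention=(7, [0, 0, 1, 5, 6, 6, 6, 6, 6, 7]))
print(multiple_baseline_check(covarying, min_change=3))
```

The second data set shows the independence problem: the third tier changes before its own intervention, so it no longer supports the claim that the intervention produced the change.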
Multielement and Changing-Criterion Designs
A multielement design (alternating-treatments design, ATD) rapidly alternates two or more conditions — often within or across sessions — and compares the resulting data paths. Control is shown by clear, consistent separation between conditions. It is efficient for comparing interventions or assessment conditions (and is the engine of a functional analysis). Its main cautions are multiple-treatment interference / carryover (one condition affecting the next) and the need for the participant to discriminate which condition is in effect, often via distinct signals.
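One rough way to picture "clear, consistent separation" is to ask whether the two data paths ever overlap. The sketch below is an assumption-laden illustration: the function name `paths_separate`, the data, and the strict no-overlap rule are hypothetical, and visual analysis remains the standard for judging separation.

```python
def paths_separate(condition_a, condition_b):
    """True when every point in one condition lies below every point in the other."""
    return max(condition_a) < min(condition_b) or max(condition_b) < min(condition_a)

# Made-up problem-behavior counts from alternating sessions of two conditions.
dra = [4, 3, 3, 2, 2]
ncr = [7, 6, 7, 6, 5]
print(paths_separate(dra, ncr))

# Overlapping paths, as might appear with carryover or poor discrimination.
blurred_a = [4, 6, 3, 5]
blurred_b = [5, 4, 6, 3]
print(paths_separate(blurred_a, blurred_b))
```

When the paths overlap, as in the second pair, the comparison says little about which condition is more effective.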
A changing-criterion design evaluates whether behavior changes in a stepwise pattern as a performance criterion is progressively adjusted. Control is shown when behavior closely tracks each new criterion, rising or falling to match each step. It suits gradual, progressive goals, such as increasing minutes of exercise or systematically reducing cigarettes per day. Each criterion step should be large enough to stand out from ordinary session-to-session variability, so that tracking is detectable, and bidirectional or 'mini-reversal' steps can strengthen the demonstration.
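Criterion tracking can be sketched as a per-phase check that behavior stays close to the criterion in force. The helper `tracks_criteria`, the `tolerance` value, and the cigarette counts below are hypothetical and added only for illustration.

```python
def tracks_criteria(phases, tolerance):
    """phases is a list of (criterion, data) pairs in the order they were run.

    Returns one flag per phase: True when every data point falls within
    `tolerance` of that phase's criterion.
    """
    return [
        all(abs(value - criterion) <= tolerance for value in data)
        for criterion, data in phases
    ]

# Made-up cigarettes per day; the fourth phase is a mini-reversal (12 back up to 14).
tracking = [
    (20, [20, 19, 20]),
    (16, [16, 15, 16]),
    (12, [12, 12, 11]),
    (14, [14, 13, 14]),
    (10, [10, 9, 10]),
]
print(tracks_criteria(tracking, tolerance=1))

# Behavior that overshoots the second criterion does not show clean tracking.
overshoot = [(20, [20, 19, 20]), (16, [12, 11, 12])]
print(tracks_criteria(overshoot, tolerance=1))
```

The mini-reversal phase matters here: behavior that moves back up when the criterion is relaxed is harder to explain by a general downward trend.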
| Design | Best fit | Main caution |
|---|---|---|
| Reversal (ABAB) | Reversible behavior and ethical withdrawal | Withdrawal may be harmful or impossible |
| Multiple baseline | Non-reversible skills or unsafe withdrawal | Tiers must be independent |
| Multielement (ATD) | Rapid comparison of two or more conditions | Carryover and poor discrimination |
| Changing criterion | Gradual, stepwise performance goals | Steps must be achievable and detectable |
Variations Within the Major Designs
Each family has variants the exam may name. Reversal variants include the basic A-B (no control — descriptive only), the A-B-A (one reversal), the full A-B-A-B (ends on treatment, which is both more ethical and a stronger replication), and the B-A-B used when starting treatment immediately is necessary. A bare A-B is not an experimental design because it lacks verification and replication.
Multiple-baseline variants are across behaviors, across settings, and across participants (subjects). A related multiple-probe design is used when continuous baseline measurement is impractical or could be reactive (e.g., repeatedly testing an untaught academic skill); it samples baseline intermittently instead.
Multielement designs can include a no-treatment or baseline 'element' to anchor the comparison, and they may use distinct discriminative stimuli (colored cards, different therapists) to help the participant discriminate conditions and to reduce carryover.
Fast Selection Rules and Common Traps
Use these heuristics to pick a design from a scenario:
- Can you safely turn the IV on and off, and will behavior reverse? → Reversal (ABAB).
- Is withdrawal unethical, or is the target a skill that won't un-learn? → Multiple baseline.
- Is the question 'which condition works better'? → Multielement (alternating treatments).
- Is the target a gradual march toward a terminal level? → Changing criterion.
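The heuristics above can be sketched as a small decision function. The name `suggest_design`, its boolean parameters, and especially the order in which the questions are checked are assumptions made for this illustration; real scenarios require judgment about ethics and feasibility that a lookup like this cannot capture.

```python
def suggest_design(reversible, withdrawal_ethical, comparing_conditions, gradual_goal):
    """Map scenario features to a candidate design (hypothetical helper).

    The priority order below is an assumption, not a rule from the source.
    """
    if comparing_conditions:
        return "multielement (alternating treatments)"
    if gradual_goal:
        return "changing criterion"
    if reversible and withdrawal_ethical:
        return "reversal (ABAB)"
    # Withdrawal is unethical, or the behavior will not reverse.
    return "multiple baseline"

# A newly taught skill that will not un-learn.
print(suggest_design(reversible=False, withdrawal_ethical=True,
                     comparing_conditions=False, gradual_goal=False))

# Comparing two interventions for the same client.
print(suggest_design(reversible=True, withdrawal_ethical=True,
                     comparing_conditions=True, gradual_goal=False))

# Gradually increasing minutes of exercise toward a terminal level.
print(suggest_design(reversible=True, withdrawal_ethical=True,
                     comparing_conditions=False, gradual_goal=True))
```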
Now watch the exam's favorite traps. A reversal for a newly taught skill is wrong because the skill won't reverse — pick a multiple baseline instead. A reversal for dangerous behavior is an ethics problem, not merely a design problem; withdrawing an effective treatment to prove control can harm the client.
A multielement design without distinct signals invites discrimination failure and carryover, blurring the comparison. A multiple baseline with non-independent tiers loses control when treating one tier changes another — here, generalization works against the demonstration. And a changing-criterion design with steps too small, or behavior that overshoots the criterion, fails to show clean tracking.
Finally, design selection balances rigor with ethics and feasibility. The 'correct' answer is usually the design that demonstrates prediction, verification, and replication while keeping the client safe and the procedure practical in the real setting. When two designs both work, prefer the one that needs no withdrawal of an effective, safety-relevant treatment.
An analyst teaches three new functional communication responses (asking for help, for a break, and for attention) to a learner who currently has none. The behaviors will not 'un-learn.' Which design best demonstrates experimental control?
A reinforcement procedure has nearly eliminated a child's dangerous head-banging. To 'prove' the treatment caused the change, a supervisee proposes withdrawing it to see if head-banging returns. What is the best response?
An analyst wants to know whether differential reinforcement (DRA) or noncontingent reinforcement (NCR) reduces problem behavior more for a given client. Which design is most efficient?
Which scenario is the BEST fit for a changing-criterion design?