7.6 Effect Size, Power, Clinical Significance, and Program Evaluation
Key Takeaways
- Effect size describes the magnitude of a finding, which can matter more for practice than whether a result merely crosses a decision threshold.
- Clinical significance asks whether change is meaningful for functioning, risk, symptoms, or quality of life.
- Program evaluation applies research logic to real services, using process, outcome, fidelity, and stakeholder data.
- Evidence-based practice integrates research evidence, clinical expertise, and client characteristics rather than following studies mechanically.
Move From Results to Practice Decisions
Research findings become useful for psychologists when they inform decisions responsibly. A result can be statistically detectable and still be too small, too narrow, or too uncertain to guide practice alone. This is why the EPPP research domain includes effect size, power, clinical significance, and program evaluation. These concepts help candidates move from a table of results to a defensible professional conclusion.
Effect size describes the magnitude of an effect. Common examples include Cohen's d for mean differences, r for association, odds ratios for likelihood comparisons, and risk ratios in outcome research. A small p value may simply reflect a large sample; effect size says more about the practical magnitude of the finding. The exam may ask which result is more meaningful for clinical planning, policy, or program decisions.
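The arithmetic behind these indices is simple enough to sketch. The snippet below is a minimal Python illustration with hypothetical scores and counts (none of these numbers come from the text): Cohen's d from two group means and a pooled standard deviation, plus an odds ratio and a risk ratio from a 2x2 outcome table.

```python
# Minimal sketch with hypothetical data: two common effect size calculations.
import math

# Hypothetical post-treatment symptom scores for two groups.
treatment = [12, 9, 14, 10, 8, 11, 13, 9]
control = [16, 15, 13, 17, 14, 18, 15, 16]

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical 2x2 outcome counts: improved vs. not improved, by group.
improved_tx, not_improved_tx = 30, 20
improved_ctl, not_improved_ctl = 18, 32

odds_ratio = (improved_tx / not_improved_tx) / (improved_ctl / not_improved_ctl)
risk_ratio = (improved_tx / (improved_tx + not_improved_tx)) / (
    improved_ctl / (improved_ctl + not_improved_ctl)
)

print(f"Cohen's d:  {cohens_d(treatment, control):.2f}")
print(f"Odds ratio: {odds_ratio:.2f}")
print(f"Risk ratio: {risk_ratio:.2f}")
```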
| Applied concept | What it adds | Practice question it answers |
|---|---|---|
| Effect size | Magnitude of the finding | How large is the observed difference or association? |
| Power | Ability to detect a real effect | Was the study likely to miss a meaningful effect? |
| Clinical significance | Practical impact on client functioning | Did the change matter in daily life or risk reduction? |
| Fidelity | Match between intended and delivered service | Was the intervention implemented as planned? |
| Program evaluation | Evidence about a real service system | Is the program reaching, serving, and helping the intended population? |
Clinical significance is not identical to statistical evidence. A symptom score may change enough to be detectable in a study but not enough for a client to return to work, sleep safely, reduce risk, or meet treatment goals. Conversely, a small study may show change that appears clinically important but needs stronger evidence. Good judgment considers both statistical and clinical meaning.
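One common way to operationalize this idea is to require both a meaningful amount of individual change and an end state outside the clinical range. The sketch below assumes a hypothetical symptom scale, cutoff, and change threshold chosen only for illustration.

```python
# Minimal sketch (hypothetical scale and thresholds): statistical vs. clinical change.
# Assume scores of 20 or above indicate clinical severity, and a drop of at least
# 10 points counts as a meaningful individual change.
CLINICAL_CUTOFF = 20
MEANINGFUL_CHANGE = 10

def clinically_significant(pre_score, post_score):
    """True only if the client changed meaningfully AND ended below the clinical cutoff."""
    changed_enough = (pre_score - post_score) >= MEANINGFUL_CHANGE
    below_cutoff = post_score < CLINICAL_CUTOFF
    return changed_enough and below_cutoff

# A 3-point drop might be statistically detectable in a large trial,
# yet this client's change would not count as clinically significant:
print(clinically_significant(pre_score=28, post_score=25))  # False
# A larger change that ends below the cutoff would:
print(clinically_significant(pre_score=28, post_score=14))  # True
```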
Power matters because a weak study may fail to detect a real effect. Low power can result from a small sample, noisy measurement, a weak manipulation, high attrition, or an inappropriate analysis. A nonsignificant result in an underpowered study does not prove that an intervention is ineffective; it means the evidence is limited. On the exam, the best answer often avoids overstatement in either direction.
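For readers who want to see the numbers, the following is a minimal power-analysis sketch using the statsmodels library, assuming an independent-samples t-test design; the effect size, alpha, and sample sizes are illustrative, not drawn from the text.

```python
# Minimal sketch: a priori and post hoc power for an independent-samples t-test.
# Assumes statsmodels is installed; d = 0.5 and alpha = .05 are illustrative choices.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect a medium effect (d = 0.5) with 80% power.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required n per group: {n_per_group:.0f}")  # roughly 64

# Achieved power of a small study (n = 20 per group) for the same effect.
achieved = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05)
print(f"Power with n = 20 per group: {achieved:.2f}")  # roughly 0.34
```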
Evidence-based practice is an integration task. Research evidence matters, but psychologists must also consider clinical expertise, client culture, preferences, developmental status, comorbidities, setting, risk, and available resources. The strongest EPPP answer usually selects, adapts, monitors, and documents care rather than applying a manual rigidly without regard to the person and context.
Program evaluation brings research methods into agencies and systems. A needs assessment asks what the population requires. A process evaluation asks whether the program is delivered as intended. An outcome evaluation asks whether targeted changes occur. A cost or efficiency evaluation asks whether benefits justify resources. Stakeholder feedback can identify acceptability, access barriers, and unintended effects.
Program evaluation also depends on good measurement. If a clinic claims success because attendance increased, the next question is whether client outcomes improved, services were delivered with fidelity, and access improved for the intended groups. If only satisfied completers respond to a survey, response bias may inflate positive conclusions. If staff change documentation halfway through the year, instrumentation can distort trends.
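The response-bias point can be made concrete with a small simulation. The sketch below uses entirely hypothetical satisfaction ratings and response probabilities; it only shows the direction of the distortion, not its real-world size.

```python
# Minimal sketch (simulated, hypothetical numbers): how surveying mostly satisfied
# completers can inflate a program's apparent success.
import random

random.seed(0)

# Hypothetical true satisfaction ratings (1-5) for all 200 clients served.
all_clients = [random.choice([1, 2, 3, 4, 5]) for _ in range(200)]

# Suppose satisfied clients (4 or 5) are three times as likely to return the survey.
respondents = [r for r in all_clients if random.random() < (0.6 if r >= 4 else 0.2)]

print(f"True mean satisfaction:       {sum(all_clients) / len(all_clients):.2f}")
print(f"Respondent mean satisfaction: {sum(respondents) / len(respondents):.2f}")
```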
For Part 1 preparation, treat every applied research question as a chain. The design creates the evidence, the statistics summarize uncertainty and magnitude, clinical significance asks whether the change matters, and evidence-based practice asks how the finding should be used with a specific client or program.
Review Questions
- What does effect size add to interpretation?
- Which question best captures clinical significance?
- A clinic wants to know whether its new intake program is delivered as designed. What evaluation focus is most relevant?