9.1 Research and Program Evaluation Overview

Key Takeaways

  • Research and Program Evaluation is one of eight CACREP core areas on the CPCE, contributing 20 of the 160 items (12.5% of the test).
  • The domain covers research designs, sampling, validity, statistics, evidence-based practice, needs assessment, and program/outcome evaluation.
  • Counselors are consumers of research first: you must read, critique, and apply findings ethically, not just calculate.
  • Master the four scales of measurement and the matching statistics early; they anchor a large share of the items.
Last updated: June 2026

Research and Program Evaluation is one of the eight CACREP (Council for Accreditation of Counseling and Related Educational Programs) core curricular areas tested on the CPCE (Counselor Preparation Comprehensive Examination). The CPCE delivers 160 multiple-choice items, of which 136 are scored and 24 are unscored pretest items, in a 3-hour-45-minute window. Each of the eight areas contributes 20 questions, so this domain makes up 12.5% of the test (20 of 160 items). There is no national cut score: each program sets its own pass mark, often near the national mean minus one standard deviation.
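The arithmetic behind those figures can be sketched in a few lines. The item counts come from the paragraph above. The even split of scored items across areas, the helper name program_cut_score, and the mean and SD passed to it are assumptions made up for illustration, not published CPCE statistics.

```python
TOTAL_ITEMS = 160   # all multiple-choice items
SCORED_ITEMS = 136  # items that count toward the score
CORE_AREAS = 8      # CACREP core curricular areas

items_per_area = TOTAL_ITEMS // CORE_AREAS    # 20 questions per area
domain_share = items_per_area / TOTAL_ITEMS   # 0.125, i.e. 12.5%

# Assumption: scored items are spread evenly across the eight areas.
scored_per_area = SCORED_ITEMS // CORE_AREAS  # 17 under that assumption


def program_cut_score(national_mean, national_sd):
    """Hypothetical helper for one common convention: mean minus one SD."""
    return national_mean - national_sd


# Made-up national statistics, for illustration only.
example_cut = program_cut_score(national_mean=85.0, national_sd=15.0)
print(items_per_area, domain_share, scored_per_area, example_cut)
```

Because each program sets its own pass mark, check your program's actual rule rather than relying on this convention.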

What the domain actually covers

The CACREP standard frames this area as research literacy for a practicing counselor, not a biostatistics course. Expect items on:

Topic cluster | Representative content
Research designs | Experimental, quasi-experimental, correlational, descriptive, qualitative, single-subject
Sampling | Random, stratified, cluster, convenience; sampling error and generalizability
Validity | Internal vs. external validity; threats (history, maturation, regression to the mean)
Statistics | Descriptive (mean, median, mode, SD) and inferential (t-test, ANOVA, correlation, chi-square)
Measurement | Nominal, ordinal, interval, ratio scales; reliability and validity of instruments
Evidence-based practice | Hierarchy of evidence, treatment-outcome research, empirically supported treatments
Program evaluation | Needs assessment, formative vs. summative evaluation, outcome measurement

Counselor as research consumer

The ACA Code of Ethics (Section G) frames counselors first as consumers and ethical conductors of research. You are expected to read a study, judge whether its design supports its claims, and decide whether to apply it. So a stem may describe a published finding and ask whether a counselor can ethically generalize it, or which design flaw undermines the conclusion. Pure computation is rare; reasoning about what a number means is common.

A high-yield anchor: scales of measurement

The four scales of measurement drive both instrument and statistic choices, and they recur across the domain:

  • Nominal — categories with no order (gender, diagnosis). Use frequencies, mode, and chi-square.
  • Ordinal — ranked but unequal intervals (Likert ratings, class rank). Use median and nonparametric tests.
  • Interval — equal intervals, no true zero (IQ, temperature in Fahrenheit). Means and SDs are valid.
  • Ratio — equal intervals with an absolute zero (age, reaction time, number of sessions). All math is valid.

A classic trap: a stem reports a Likert satisfaction scale and asks for the best central-tendency measure. Because Likert data are ordinal, the median is technically the most defensible answer, even though counselors routinely report means.
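To see why the choice matters, here is a small sketch that pairs each scale with the central-tendency measure described above and applies it to a set of Likert ratings. The ratings are made-up data, and the names CENTRAL_TENDENCY and summarize are hypothetical, chosen only for this example.

```python
from statistics import mean, median, mode

# Most defensible central-tendency measure for each scale of measurement.
CENTRAL_TENDENCY = {
    "nominal": "mode",
    "ordinal": "median",
    "interval": "mean",
    "ratio": "mean",
}

_FUNCTIONS = {"mode": mode, "median": median, "mean": mean}


def summarize(scale, values):
    """Apply the measure that matches the given scale of measurement."""
    return _FUNCTIONS[CENTRAL_TENDENCY[scale]](values)


# Made-up 5-point Likert satisfaction ratings from seven clients.
ratings = [5, 4, 4, 2, 5, 1, 4]

print(summarize("ordinal", ratings))  # median: the defensible choice
print(round(mean(ratings), 2))        # mean: often reported, harder to defend
```

With these ratings the median is 4 while the mean is about 3.57, so the two measures can point to different summaries of the same data.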

How items are written

Most stems are short application vignettes: a counselor designs a study, reads a journal article, or evaluates an agency program. The strongest distractors are true statements that simply do not answer the specific question asked. Read for the exact verb — describe, compare, establish cause, predict — because that verb points to the correct design or statistic. Build a one-page sheet pairing each design and statistic with the scale of measurement it requires, and rehearse it until recall is automatic.

Why this domain is worth deliberate study

Many counseling students enter the CPCE strong in clinical content but weak in statistics, so this area is where scores most often sag. Because it is only 20 items, a focused two-week review of designs, validity threats, and the descriptive/inferential split usually moves the most points per hour studied. Treat it as a vocabulary-plus-logic area: learn the term, then learn the one decision it controls.

Independent and dependent variables

Nearly every research item turns on two terms. The independent variable (IV) is the factor the researcher manipulates or groups by — the presumed cause, such as the type of therapy a client receives. The dependent variable (DV) is the measured outcome that may change, such as a depression score. A reliable habit is to restate any study as "the effect of [IV] on [DV]." If a stem reads "the effect of mindfulness training on test anxiety," mindfulness training is the IV and test anxiety is the DV.
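The restating habit is mechanical enough to sketch in code. The pattern below is a toy, assuming the stem already uses the exact wording "the effect of ... on ..." and that the IV itself does not contain the word "on"; the function name split_iv_dv is hypothetical.

```python
import re

# Matches "the effect of <IV> on <DV>"; the IV is matched lazily so it
# stops at the first " on ".
PATTERN = re.compile(r"the effect of (?P<iv>.+?) on (?P<dv>.+)", re.IGNORECASE)


def split_iv_dv(stem):
    """Return (IV, DV) from a stem phrased as 'the effect of X on Y', else None."""
    match = PATTERN.search(stem)
    if match is None:
        return None
    iv = match.group("iv").strip()
    dv = match.group("dv").strip().rstrip(".")
    return iv, dv


print(split_iv_dv("the effect of mindfulness training on test anxiety"))
```

Real stems rarely arrive in this form, which is the point of the habit: you do the rewording yourself, and the IV and DV then fall out of the sentence.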

Extraneous or confounding variables are uncontrolled factors that could offer a rival explanation; good design holds them constant or randomizes them away. Mislabeling the IV and DV is one of the most common ways candidates lose otherwise easy points.

Hypotheses and the logic of testing

Research is structured around a testable prediction. The alternative hypothesis (H1) states the expected difference or relationship; the null hypothesis (H0) states no difference or relationship. Researchers never "prove" H1 directly. They gather evidence against H0, and rejecting H0 counts as support for H1; failing to reject H0 does not show that the null is true. A directional (one-tailed) hypothesis predicts which way the effect goes (treatment scores will be lower), while a non-directional (two-tailed) hypothesis predicts only that a difference exists. Recognizing this logic prevents the frequent error of treating the null as the researcher's actual prediction.
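The decision logic can be sketched as follows. This assumes a z statistic has already been computed and uses the conventional alpha of .05; neither the test statistic nor the alpha level comes from the text above, and the function names p_value and decide are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def p_value(z, tails="two", direction="lower"):
    """p-value for a z statistic under H0.

    tails="two": non-directional hypothesis (a difference exists).
    tails="one": directional hypothesis; direction says which tail H1 predicts.
    """
    if tails == "two":
        return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    if direction == "lower":
        return STANDARD_NORMAL.cdf(z)
    return 1 - STANDARD_NORMAL.cdf(z)


def decide(p, alpha=0.05):
    """The only two outcomes: reject H0 or fail to reject it."""
    return "reject H0" if p < alpha else "fail to reject H0"


z = -1.80  # made-up result: treatment scores came out lower

print(decide(p_value(z, tails="two")))                     # non-directional
print(decide(p_value(z, tails="one", direction="lower")))  # directional
```

With this made-up z of -1.80, the two-tailed p-value is about .07 and the one-tailed p-value is about .04, so the same data reject H0 only under the directional hypothesis. Note that neither outcome is phrased as "prove H1" or "accept H0."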

A study-planning checklist

When a stem describes designing a study, the defensible order is: (1) state the research question and hypotheses; (2) define and operationalize the IV and DV; (3) choose a design that matches the claim; (4) select a sampling method and identify the population; (5) pick the statistic that fits the design and scale; and (6) plan ethical protections. Items often present a study that skipped one step and ask what is wrong — usually a design that cannot support the causal claim being made.
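One way to drill this is to treat the six steps as an ordered list and check a described study against it. The sketch below does only that; the function name skipped_steps and the sample study are hypothetical.

```python
# The six planning steps, in the order given above.
CHECKLIST = [
    "state the research question and hypotheses",
    "define and operationalize the IV and DV",
    "choose a design that matches the claim",
    "select a sampling method and identify the population",
    "pick the statistic that fits the design and scale",
    "plan ethical protections",
]


def skipped_steps(completed):
    """Return the checklist steps, in order, that a described study left out."""
    done = set(completed)
    return [step for step in CHECKLIST if step not in done]


# Hypothetical stem: everything is in place except a design that fits the claim.
study = [step for step in CHECKLIST if not step.startswith("choose a design")]
print(skipped_steps(study))
```

When you practice, name the skipped step before you look at the answer choices.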

Test Your Knowledge

A counselor surveys clients using a 5-point Likert satisfaction scale (1 = very dissatisfied to 5 = very satisfied). Which scale of measurement do these data represent?

Test Your Knowledge

In research, the null hypothesis states:
