9.5 Statistics Quick-Reference and Practice Drills

Key Takeaways

In a normal distribution, about 68% of scores fall within one standard deviation of the mean, 95% within two, and 99.7% within three.
Mean, median, and mode coincide in a symmetric distribution; skew pulls the mean toward the tail.
Correlation never proves causation; a third variable or reverse causation may explain an association.
Reliability is consistency of measurement; validity is whether an instrument measures what it claims—an instrument can be reliable without being valid.

Last updated: June 2026

The final cluster of Research items rewards quick, accurate recall of descriptive statistics, the normal curve, and the reliability/validity distinction. Build a one-page sheet from the tables below and drill it until each fact is automatic.

The normal distribution

Many CPCE items reference the normal (bell) curve, which is symmetric with mean = median = mode at the center. The empirical (68-95-99.7) rule is high-yield:

Range around the mean	Percent of scores	Approximate use
±1 standard deviation	~68%	Common range of typical scores
±2 standard deviations	~95%	Boundary for many cut scores
±3 standard deviations	~99.7%	Nearly all observations

A z-score expresses how many standard deviations a raw score lies from the mean (z = (X − M) / SD). A z of +1.0 sits at about the 84th percentile; a z of 0 is exactly the mean and the 50th percentile.
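The z-score formula and the empirical rule can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics.NormalDist`; the function names and the example test (mean 70, SD 8, raw score 78) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

def percentile_from_z(z):
    """Percent of a normal distribution falling at or below z."""
    return NormalDist().cdf(z) * 100

def percent_within(k):
    """Percent of a normal distribution within +/- k SD of the mean."""
    std = NormalDist()
    return (std.cdf(k) - std.cdf(-k)) * 100

# Hypothetical test: mean 70, SD 8, raw score 78.
z = z_score(78, 70, 8)
print(z, round(percentile_from_z(z), 1))   # 1.0 84.1

for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))  # 68.3, then 95.4, then 99.7
```

The exact figures (68.3, 95.4, 99.7) are why the rule is stated with "about"; for exam purposes the rounded 68-95-99.7 values are what you need.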

Central tendency and skew

Mean — arithmetic average; sensitive to outliers.
Median — middle value; resistant to outliers; best for skewed data or ordinal scales.
Mode — most frequent value; the only option for nominal data.

In a positively (right) skewed distribution, the long tail of high scores pulls the mean above the median. In a negatively (left) skewed distribution, the mean falls below the median. Remember: the mean chases the tail. So when a stem reports a few extreme high incomes among clients, the median better represents the typical client.
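To see the mean chase the tail, here is a small sketch with made-up client incomes (the figures and the `skew_direction` helper are hypothetical). Comparing mean and median is only the rule of thumb described above, not a formal skewness statistic.

```python
from statistics import mean, median

# Hypothetical client incomes: most are modest, two are extreme.
incomes = [28_000, 31_000, 33_000, 35_000, 36_000,
           38_000, 40_000, 250_000, 400_000]

def skew_direction(values):
    """Rule-of-thumb skew check: compare the mean with the median."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive (right)"
    if m < md:
        return "negative (left)"
    return "roughly symmetric"

print(mean(incomes))            # 99000
print(median(incomes))          # 36000
print(skew_direction(incomes))  # positive (right)
```

Two extreme incomes nearly triple the mean while the median stays with the typical client, which is why the median is the better summary here.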

Reliability vs. validity

Concept	Question it answers	Examples
Reliability	Is the measure consistent?	Test-retest, internal consistency (Cronbach's alpha), inter-rater
Validity	Does it measure what it claims?	Content, criterion (concurrent/predictive), construct

The key relationship: an instrument can be reliable without being valid (a miscalibrated scale that is consistently five pounds off), but it cannot be valid without being reliable. Reliability is necessary but not sufficient for validity.
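The miscalibrated-scale example can be put in numbers. The sketch below is a simplification: it uses the spread of repeated readings as a stand-in for reliability and the average error as a stand-in for validity, and all values are hypothetical.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 150.0  # hypothetical true weight in pounds

# Hypothetical repeated readings from a scale that runs about five pounds heavy.
readings = [154.8, 155.2, 155.0, 155.0]

spread = pstdev(readings)            # small spread: readings agree with each other
bias = mean(readings) - TRUE_WEIGHT  # systematic error: readings miss the true value

print(f"spread = {spread:.2f} lb, bias = {bias:+.2f} lb")
# spread = 0.14 lb, bias = +5.00 lb
```

The readings agree closely with one another (consistent) yet all miss the true weight by about five pounds (inaccurate), which is the reliable-but-not-valid pattern.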

Correlation cautions

Correlation coefficients range from −1.0 to +1.0; the sign shows direction and the absolute value shows strength. A coefficient near 0 means little linear association. The cardinal rule: correlation does not equal causation, because a third (confounding) variable or reverse causation may drive the link. Squaring r gives the coefficient of determination (r²), the proportion of variance shared—an r of .60 means about 36% of variance is shared.
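For a concrete feel of r and r², the following sketch computes Pearson's r by hand and squares it. The paired scores and function names are hypothetical, and nothing in the calculation says anything about which variable, if either, causes the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariation divided by the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def r_squared(r):
    """Coefficient of determination: proportion of shared variance."""
    return r ** 2

# Hypothetical paired scores for five clients.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
print(round(r, 2), round(r_squared(r), 2))  # 0.8 0.64
print(round(r_squared(0.60), 2))            # 0.36
```

Note how quickly shared variance drops: an r of .80 shares 64% of variance, while an r of .60 shares only 36%.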

Drill protocol

Run mixed sets, not topic-blocked sets, so you must first identify what kind of item it is:

Design ID drill — read a vignette and name the design and what it can claim (cause, association, description).
Statistic-match drill — given a question and scale of measurement, name the correct test.
Validity-threat drill — spot the threat to internal or external validity in a one-line scenario.
Curve drill — convert between z-scores, percentiles, and the 68-95-99.7 bands.
Ethics drill — flag the missing consent, IRB, or honesty step in a research vignette.

Standard scores you should recognize

Beyond z-scores, the CPCE references several standard score systems built on the normal curve, because counselors interpret test results constantly. T-scores have a mean of 50 and SD of 10 (used on many personality inventories such as the MMPI). Standard scores on cognitive tests often use a mean of 100 and SD of 15. Stanines divide the distribution into nine bands (mean 5, SD ~2). Percentile ranks report the percentage of the norm group scoring at or below a value and are not equal-interval — the gap between the 50th and 55th percentile is far smaller in raw points than the gap between the 90th and 95th.

A common item gives a client's z-score or T-score and asks for the approximate percentile. Anchoring on z = 0 (T = 50, 50th percentile), z = +1 (T = 60, about the 84th), and z = −1 (T = 40, about the 16th) handles most of these quickly.
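The conversions above all start from z, so one small sketch covers them. The function names are hypothetical, and the stanine formula (round 2z + 5, clipped to 1 through 9) is a common approximation rather than an official scoring rule.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal curve: mean 0, SD 1

def t_score(z):
    return 50 + 10 * z            # mean 50, SD 10

def deviation_score(z):
    return 100 + 15 * z           # mean 100, SD 15

def stanine(z):
    # Approximation: mean 5, SD about 2, clipped to the 1-9 range.
    return max(1, min(9, round(2 * z + 5)))

def percentile(z):
    return STD.cdf(z) * 100

def percentile_from_t(t):
    return percentile((t - 50) / 10)

for z in (-1.0, 0.0, 1.0):
    print(z, t_score(z), deviation_score(z), stanine(z), round(percentile(z), 1))
# -1.0 40.0 85.0 3 15.9
# 0.0 50.0 100.0 5 50.0
# 1.0 60.0 115.0 7 84.1

# Percentile ranks are not equal-interval: compare gaps in z units.
gap_mid = STD.inv_cdf(0.55) - STD.inv_cdf(0.50)
gap_high = STD.inv_cdf(0.95) - STD.inv_cdf(0.90)
print(round(gap_mid, 2), round(gap_high, 2))  # 0.13 0.36
```

The last line shows the unequal-interval point in numbers: five percentile points near the middle span about 0.13 SD, while five points near the top span about 0.36 SD.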

Types of reliability and validity, drilled

It pays to distinguish the sub-types because items name them specifically. For reliability: test-retest (stability over time), internal consistency (items measure the same construct, indexed by Cronbach's alpha), alternate-forms (two equivalent versions agree), and inter-rater (two scorers agree). For validity: content (items cover the domain), criterion-related — split into concurrent (correlates with a current measure) and predictive (forecasts a future outcome) — and construct (measures the abstract trait, supported by convergent and discriminant evidence).

A vignette describing whether an admissions test forecasts later GPA is testing predictive validity; one asking whether a depression scale's items hang together is testing internal consistency reliability.
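Cronbach's alpha, the index of internal consistency named above, can be computed from item and total-score variances. This is a minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the response data and the `cronbach_alpha` name are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # regroup scores by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [5, 4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.95
```

Respondents who score high on one item tend to score high on the others, so the items "hang together" and alpha comes out high. A high alpha speaks only to reliability; it says nothing about whether the scale measures the intended construct.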

Readiness markers

You are ready when you can, after a one-day break, look at an unlabeled stem and immediately route it: is this asking about design, statistic choice, validity, the normal curve, evidence-based practice (EBP), evaluation, or ethics? If you can name the concept and the single decision it controls, the domain is exam-ready. If you can only recognize vocabulary but freeze on application, return to the drills until routing is automatic. A practical benchmark: on a 20-item mixed set drawn from all five sections, aim for at least 80% correct (16 of 20), with a one-sentence rationale for each answer and a one-sentence reason each distractor fails.

If your accuracy holds but your rationales are vague, you are recognizing patterns rather than understanding them—rebuild the weak concept from its definition before moving on.

Test Your Knowledge

On a normally distributed assessment with a mean of 100 and a standard deviation of 15, approximately what percentage of scores fall between 85 and 115?

A. About 50%
B. About 68%
C. About 95%
D. About 99.7%

Test Your Knowledge

A bathroom scale consistently reads five pounds heavier than a person's true weight every time. This instrument is:

A. Both reliable and valid
B. Reliable but not valid
C. Valid but not reliable
D. Neither reliable nor valid
