7.5 Descriptive, Inferential, and Predictive Statistics

Key Takeaways

  • Descriptive statistics summarize data, while inferential statistics estimate what results suggest beyond the observed sample.
  • The appropriate statistic depends on variable scale, design, distribution, and research question.
  • Correlation describes association, regression predicts outcomes, and group-comparison tests evaluate mean differences under specified assumptions.
  • Confidence intervals, Type I error, Type II error, and power describe uncertainty around statistical decisions.
Last updated: May 2026

Choose the Statistic That Fits the Question

Statistics give structure to evidence. Descriptive statistics summarize what was observed. Inferential statistics estimate what the observed data suggest about a broader population or process. Predictive statistics estimate outcomes from one or more variables. EPPP questions usually do not require long calculations, but they often require knowing which statistic fits the design and variables.

Start with measurement level. Nominal variables are categories without rank, such as treatment group or diagnosis category. Ordinal variables have order, but the distances between ranks are not necessarily equal, as with rankings. Interval variables have equal intervals but no true zero, while ratio variables have equal intervals and a meaningful zero. The statistic must fit the scale and the question.

Question type | Common statistic | Key cue
Summarize central tendency | Mean, median, or mode | Use median when skew or outliers distort the mean.
Summarize variability | Standard deviation, range, or interquartile range | Standard deviation describes spread around the mean.
Test association between two continuous variables | Pearson correlation | Linear relationship with appropriate assumptions.
Test association between categorical variables | Chi-square test | Counts or frequencies in categories.
Compare two means | t test | Independent or paired version depends on design.
Compare more than two means | Analysis of variance | Follow-up tests may be needed after an omnibus result.
Predict an outcome | Regression | One or more predictors estimate a criterion variable.
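
The chi-square entry above can be illustrated with a small sketch. The function name `chi_square_statistic` and the counts are hypothetical, chosen only for illustration. The code compares observed counts with the counts expected if the two categorical variables were unrelated.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (count - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi_square, df


# Hypothetical counts: rows are two treatment groups,
# columns are "improved" and "not improved".
counts = [[30, 20], [20, 30]]
chi_square, df = chi_square_statistic(counts)
print(chi_square, df)
```

The inputs are frequencies in categories, not scores, which is the cue that points toward chi-square rather than a t test or a correlation.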

The mean is sensitive to outliers. The median can better represent skewed distributions. The mode is the most frequent value and can be useful for nominal data. Standard deviation describes variability around the mean, while variance is the squared standard deviation. A z score expresses how far a score is from the mean in standard-deviation units.
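
A brief sketch of these summaries, using Python's standard `statistics` module. The scores are hypothetical and include one extreme value to show how the mean and median diverge; `z_score` is an illustrative helper, not a library function.

```python
import statistics

# Hypothetical scores with one extreme outlier (40)
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)          # pulled upward by the outlier
median = statistics.median(scores)      # middle value, resistant to the outlier
mode = statistics.mode(scores)          # most frequent value
sd = statistics.stdev(scores)           # sample standard deviation
variance = statistics.variance(scores)  # sample variance (the SD squared)


def z_score(x, mean, sd):
    """Distance of x from the mean, in standard-deviation units."""
    return (x - mean) / sd


print(mean, median, mode)
print(sd, variance)
print(z_score(40, mean, sd))
```

Here the mean is 9 while the median is 4, so the median is the better description of a typical score in this skewed set.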

Correlation coefficients range from -1.00 to +1.00, with the sign showing direction and the absolute magnitude showing strength. A negative correlation means higher values on one variable tend to go with lower values on the other. Correlation does not prove causation. Regression can use one or more predictors to estimate an outcome, but the conclusions a regression model supports remain limited by design, measurement, and omitted variables.
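
The following sketch computes a Pearson correlation and a one-predictor regression line by hand. The variable names and data (`hours_of_sleep`, `error_count`) are hypothetical, and the helper functions are written for illustration rather than taken from any library.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: more sleep tends to go with fewer errors
hours_of_sleep = [4, 5, 6, 7, 8]
error_count = [9, 8, 7, 4, 2]

r = pearson_r(hours_of_sleep, error_count)
slope, intercept = simple_regression(hours_of_sleep, error_count)
predicted_errors = intercept + slope * 6.5  # prediction for 6.5 hours of sleep
print(r, slope, intercept, predicted_errors)
```

The negative sign of `r` and of the slope both reflect the same direction of association. Neither number shows that sleep causes fewer errors.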

Group comparison statistics depend on design. An independent-samples t test compares two separate groups. A paired-samples t test compares related scores, such as pretest and posttest in the same people. Analysis of variance compares means across more than two groups or across factors. A significant omnibus analysis of variance indicates that not all group means are equal, but it does not by itself identify every pairwise difference.
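
To make the design distinction concrete, the sketch below computes both t statistics by hand. The function names and scores are hypothetical. The independent version assumes equal variances (a pooled estimate), and neither function computes a p value.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two separate groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2


def paired_t(first, second):
    """t statistic and df for related scores, based on difference scores."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical pretest and posttest scores from the same five people
pretest = [10, 12, 14, 16, 18]
posttest = [12, 15, 15, 19, 20]

t_paired, df_paired = paired_t(pretest, posttest)

# For contrast only: the same numbers treated as two unrelated groups
t_indep, df_indep = independent_t(posttest, pretest)
print(t_paired, df_paired, t_indep, df_indep)
```

With these numbers the paired statistic is much larger than the independent one, because the paired analysis removes stable differences between people. Treating related scores as independent groups would be the wrong analysis for this design.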

Inferential statistics involve error. A Type I error occurs when a researcher rejects a true null hypothesis (a false positive); its probability is set by the alpha level. A Type II error occurs when a researcher fails to reject a false null hypothesis (a false negative). Power is the probability of detecting an effect when it is truly present, which equals 1 minus the probability of a Type II error. Larger samples, larger effects, reliable measurement, and appropriate design can improve power.
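
The relationship among alpha, sample size, effect size, and power can be sketched with a normal approximation. This is a simplification under stated assumptions: a two-sided comparison of two equal-sized groups, a standardized mean difference as the effect size, and a z test in place of a t test. The function name `approximate_power` is hypothetical.

```python
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a two-sided, two-group comparison of means.

    effect_size is a standardized mean difference (an assumption of this
    sketch); the calculation uses a normal (z) approximation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the effect is real
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power_small_n = approximate_power(0.5, 20)
power_large_n = approximate_power(0.5, 64)
type_ii_rate = 1 - power_large_n

# With no true effect, the rejection rate is the Type I error rate (alpha)
false_positive_rate = approximate_power(0.0, 64)
print(power_small_n, power_large_n, type_ii_rate, false_positive_rate)
```

Under these assumptions, raising the sample size raises power and lowers the Type II error rate, while the Type I error rate stays fixed at alpha.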

Confidence intervals communicate uncertainty around an estimate. A narrow interval suggests more precision than a wide interval, assuming the method is appropriate. On the exam, confidence intervals are often useful because they shift attention from whether a result crosses a decision threshold to the range of plausible values for the estimate.
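
A minimal sketch of an interval around a sample mean follows. It assumes a normal approximation (a t critical value would give a somewhat wider interval for small samples), and the function name and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_crit * stdev(sample) / math.sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin


# Hypothetical scores; the larger sample repeats the same values
# only to show the effect of sample size on interval width.
small_sample = [48, 52, 50, 47, 53]
large_sample = small_sample * 8

small_low, small_high = mean_confidence_interval(small_sample)
large_low, large_high = mean_confidence_interval(large_sample)
print(small_low, small_high)
print(large_low, large_high)
```

Both intervals are centered on the same mean, but the interval from the larger sample is narrower, which is what greater precision looks like.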

When statistics questions feel dense, translate the stem into plain language. Is the researcher summarizing, comparing, associating, predicting, classifying, or estimating precision? Once the task is clear, the correct statistic is usually the one that respects the variable type and study design.

Test Your Knowledge

  • Which statistic is most appropriate for testing an association between two categorical variables represented as counts?
  • What does a negative correlation indicate?
  • What is a Type II error?