5.1 Data Displays and Summary Statistics

Key Takeaways

  • ACT Statistics & Probability questions often reward reading the display, units, and sample size before doing any arithmetic.
  • Mean uses every value and is pulled by outliers; median is the middle after sorting and is usually more resistant.
  • Spread can be tested with range, interquartile range, standard deviation language, or visual shape in a histogram, box plot, or dot plot.
  • For scatterplots and best-fit lines, interpret slope and intercept in the units of the situation, then decide whether prediction is interpolation or extrapolation.
Last updated: June 2026

Why data questions are fast if you read first

ACT's official Math Test Description places Statistics & Probability at 12-15% of the Math section and describes it as covering center, spread, data collection, bivariate relationships, probability, and sample spaces. ACT data questions are therefore usually not about advanced statistics. They are about reading a display accurately, choosing the right statistic, and avoiding a small wording trap under a 50-minute clock.

Start every data item with three checks: what is being counted, what the units are, and whether the question asks for a number, a percent, a comparison, or a model. The calculator is allowed on ACT Math, but it cannot decide which column matters or whether an outlier should change your interpretation.

Core statistics vocabulary

Tool | What it tells you | ACT Math trap
Mean | Sum divided by count | One extreme value can pull it strongly
Median | Middle value after sorting | You must sort first; for an even count, average the two middle values
Mode | Most frequent value | A data set can have no mode or more than one mode
Range | Maximum minus minimum | Uses only two values, so it misses most of the distribution
Interquartile range | Q3 minus Q1 | Focuses on the middle half and ignores extremes
Standard deviation | Typical spread from the mean | Larger means more dispersed, not necessarily larger values

ACT's practice directions state that, unless otherwise stated, the word average means arithmetic mean. So if a question says average score, average speed, or average number of tickets, assume sum divided by count unless the item explicitly says median, mode, weighted average, or another measure.
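As a rough illustration of the vocabulary above, the sketch below computes each measure with Python's standard statistics module. The data set is hypothetical, and the quartile rule (median of each half, leaving out the middle value when the count is odd) is one common classroom convention assumed here; other tools may use a different rule and give slightly different quartiles.

```python
import statistics

# Hypothetical data set, not taken from any ACT item.
data = [3, 5, 5, 6, 8, 9, 13]

mean = statistics.mean(data)          # sum divided by count
median = statistics.median(data)      # middle value after sorting
modes = statistics.multimode(data)    # a list, since there can be several modes
data_range = max(data) - min(data)    # maximum minus minimum

# Assumed quartile convention: median of the lower half and of the upper half,
# excluding the overall middle value when the count is odd.
ordered = sorted(data)
half = len(ordered) // 2
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[-half:])
iqr = q3 - q1

# Population standard deviation: typical distance of values from the mean.
spread = statistics.pstdev(data)

print(mean, median, modes, data_range, q1, q3, iqr, round(spread, 2))
```

On the test itself you would rarely compute a standard deviation by hand; the point is to see which values each measure actually uses.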

Worked example: outlier and center

Suppose the donations at a small fundraiser are 12, 14, 15, 16, 18, and 55 dollars. The mean is (12 + 14 + 15 + 16 + 18 + 55) / 6 = 130 / 6, about 21.7 dollars. The median is the average of the third and fourth sorted values, (15 + 16) / 2 = 15.5 dollars.

Both numbers are valid, but they answer different questions. The mean tells the total divided evenly across all donors, so the 55-dollar donation matters a lot. The median tells the typical middle donation, so it is more resistant to the extreme donation. If an ACT question asks which statistic better represents a typical donor when one value is unusually high, the median is usually the stronger choice.
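The donation example can be checked with a short sketch. The donation values come from the example above; dropping the 55-dollar donation is an extra comparison added here only to show how far each statistic moves.

```python
import statistics

donations = [12, 14, 15, 16, 18, 55]

mean_all = statistics.mean(donations)       # 130 / 6, about 21.7
median_all = statistics.median(donations)   # average of 15 and 16

# Illustrative comparison: the same data without the extreme donation.
without_outlier = [d for d in donations if d != 55]
mean_trimmed = statistics.mean(without_outlier)
median_trimmed = statistics.median(without_outlier)

print(round(mean_all, 1), median_all)   # with the 55-dollar donation
print(mean_trimmed, median_trimmed)     # without it
```

Removing one value shifts the mean by more than 6 dollars but the median by only half a dollar, which is what "resistant" means in practice.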

Reading displays without inventing patterns

A histogram groups values into intervals; do not treat each bar as one exact value. A dot plot or stem-and-leaf plot usually lets you reconstruct every value. A box plot shows median, quartiles, and extremes, but not the exact shape inside each quarter. A two-way table sorts observations into categories, so row totals, column totals, and conditional percentages must stay separate.
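To see why row totals, column totals, and conditional percentages must stay separate, here is a minimal sketch using a made-up two-way table. The categories and counts are hypothetical. The same cell of 30 students produces three different percentages depending on which total goes in the denominator.

```python
# Hypothetical two-way table: students by class year and commute type.
counts = {
    "junior": {"bus": 30, "car": 20},
    "senior": {"bus": 10, "car": 40},
}

row_total_junior = sum(counts["junior"].values())
col_total_bus = sum(row["bus"] for row in counts.values())
grand_total = sum(sum(row.values()) for row in counts.values())

cell = counts["junior"]["bus"]

pct_bus_given_junior = 100 * cell / row_total_junior   # of juniors, percent who ride the bus
pct_junior_given_bus = 100 * cell / col_total_bus      # of bus riders, percent who are juniors
pct_junior_and_bus = 100 * cell / grand_total          # of all students, percent in this cell

print(pct_bus_given_junior, pct_junior_given_bus, pct_junior_and_bus)
```

When a question says "of the juniors" or "of the students who ride the bus," that phrase names the denominator.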

Scatterplots add a different skill: describe the relationship between two variables. A positive association rises from left to right, a negative association falls, and a weak association looks widely scattered. A best-fit line summarizes the trend, but it does not prove that one variable causes the other.

If a line of best fit is y = 2.1x + 14 for study hours x and Math score y, the slope means the predicted score rises by about 2.1 points for each additional study hour in that data set. The intercept 14 is the predicted score at 0 study hours, but it may or may not be meaningful in context. Predicting at x = 6 is interpolation if 6 falls within the range of study hours in the original data; predicting at x = 40 from a data set that only ran from 0 to 10 hours is risky extrapolation.
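A small sketch of the line y = 2.1x + 14 makes the interpolation check concrete. The function name predict is hypothetical, and the observed range of 0 to 10 hours is taken from the extrapolation example in the text rather than from real data.

```python
SLOPE = 2.1        # predicted score points per additional study hour
INTERCEPT = 14     # predicted score at 0 study hours
OBSERVED_MIN, OBSERVED_MAX = 0, 10   # assumed range of study hours in the data

def predict(hours):
    """Return the predicted score and whether the input lies inside the observed range."""
    score = SLOPE * hours + INTERCEPT
    inside = OBSERVED_MIN <= hours <= OBSERVED_MAX
    kind = "interpolation" if inside else "extrapolation"
    return score, kind

for hours in (6, 40):
    score, kind = predict(hours)
    print(hours, round(score, 1), kind)
```

The prediction at 40 hours comes out near 98, far beyond the 36-point ACT scale, which is a good reminder that a line fit on a narrow range can give nonsense outside it.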

Common ACT data traps

  • Percent bars and frequency bars are not the same; convert only when you know the total.
  • A larger standard deviation means more spread around the mean, not a higher mean.
  • The median of a combined group is not found by averaging the two group medians.
  • Correlation describes association; it does not establish cause.
  • Weighted averages must account for group sizes before dividing.
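Two of the traps above are easy to verify numerically. The groups, sizes, and means in this sketch are hypothetical and chosen only to make the gap visible.

```python
import statistics

# Trap: the median of a combined group is not the average of the group medians.
group_a = [1, 2, 3]
group_b = [10, 20, 30, 40, 50]
average_of_medians = (statistics.median(group_a) + statistics.median(group_b)) / 2
combined_median = statistics.median(group_a + group_b)

# Trap: a combined average must weight each group mean by its group size.
sizes = [12, 8]     # hypothetical group sizes
means = [90, 70]    # hypothetical group means
unweighted = sum(means) / len(means)
weighted = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

print(average_of_medians, combined_median)
print(unweighted, weighted)
```

In both cases the shortcut lands close to the correct value, which is exactly why it shows up as a tempting wrong answer choice.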

A clean data workflow is: read the title, read both axes or labels, mark the total sample size, choose the statistic, then calculate. Most missed ACT data questions come from skipping one of those five steps, not from lacking a formula.

Data collection and display choice

ACT questions can also ask whether a conclusion is supported by the way data were collected. A random sample is usually stronger than a volunteer sample because every member of the target population has an equal chance of being selected. A survey of students who chose to answer an optional cafeteria poll may overrepresent students with strong opinions. A survey of every fifth student entering the cafeteria is a systematic sample: not perfectly random, but much less exposed to self-selection than a volunteer poll.
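The difference between the two sounder sampling methods can be sketched in a few lines. The population of 100 numbered students and the fixed seed are assumptions for illustration. A volunteer sample is not simulated, since that would require inventing who chooses to respond.

```python
import random

# Hypothetical population: students numbered 1 to 100 in arrival order.
population = list(range(1, 101))

# Systematic sample: every fifth student entering (5th, 10th, 15th, ...).
systematic = population[4::5]

# Simple random sample of the same size; the seed only makes the run repeatable.
rng = random.Random(0)
simple_random = rng.sample(population, k=len(systematic))

print(systematic[:5], len(systematic))
print(sorted(simple_random)[:5], len(simple_random))
```

Neither method lets students select themselves, which is the weakness of the optional poll.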

When choosing a display, match the display to the claim. Use a bar graph for categories, a histogram or dot plot for one quantitative variable, a box plot for comparing medians and spread across groups, and a scatterplot for two quantitative variables. If the question asks which display best shows association between study time and score, a pie chart is weak because it hides paired x-y data.

One final check: the scale on an axis can exaggerate differences. If two bars differ by only 3 points but the vertical axis starts at 90 instead of 0, the visual gap may look larger than the numerical gap. Trust the labeled values, not the drama of the picture.
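The axis effect can be quantified with a quick sketch. The bar values 93 and 96 are hypothetical, chosen to match the 3-point gap and the axis starting at 90 described above.

```python
# Hypothetical bar values that differ by 3 points.
bar_a, bar_b = 93, 96
axis_start = 90   # the vertical axis begins here instead of at 0

difference = bar_b - bar_a

# Ratio of the actual values.
true_ratio = bar_b / bar_a

# Ratio of the drawn bar heights, measured from where the axis starts.
drawn_ratio = (bar_b - axis_start) / (bar_a - axis_start)

print(difference, round(true_ratio, 3), drawn_ratio)
```

The second bar is drawn twice as tall even though its value is only about 3% larger.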

Test Your Knowledge

The data set 5, 6, 7, 8, 9 is changed by adding 40. Which statement best describes the effect of adding 40?

Test Your Knowledge

One class of 20 students has an average score of 84. Another class of 10 students has an average score of 72. What is the combined average score for all 30 students?

Test Your Knowledge

A best-fit line for a table is y = 3x + 12, where x is the number of weeks and y is the total number of practice problems completed. What does the slope represent?
