3.2 Statistics, Distributions, and Risk Measures

Key Takeaways

  • Descriptive statistics summarize central tendency, dispersion, shape, and relative position of return data.
  • Standard deviation measures total variability; downside measures (semideviation, shortfall, VaR) isolate adverse outcomes.
  • Skewness and kurtosis describe shape and matter because asset returns are typically nonnormal with fat tails.
  • The normal distribution lets analysts compute probabilities via z-scores, using the 68-95-99.7 rule and the 1.65/1.96/2.58 two-tailed critical values.

Last updated: June 2026

Statistics turns raw observations (returns, yields, spreads, valuation multiples) into decision-useful information. A disciplined first pass asks four questions: What is typical? How much does it vary? Is the distribution symmetric? How fat are the tails?

Central tendency and position

Measures of central tendency locate the center. The arithmetic mean is the sum of the observations divided by their count. The median is the middle value of the sorted data (the average of the two middle values when N is even) and is robust to outliers. The mode is the most frequent value. When extreme values skew the data, the median often describes the typical outcome better than the mean. Quantiles divide sorted data into equal parts: quartiles into four, quintiles into five, deciles into ten, and percentiles into a hundred. The interquartile range, Q3 minus Q1, underlies box plots.
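As a minimal sketch of these measures, the snippet below uses Python's standard `statistics` module on a hypothetical return series (the data are invented for illustration). Quartile values depend on the interpolation convention; the "inclusive" method is an assumption here, and other tools may give slightly different cut points.

```python
import statistics

# Hypothetical annual returns in percent (illustrative data, not from the text).
# The single large gain of 40% pulls the mean above the median.
returns = [-12.0, 3.0, 5.0, 5.0, 7.0, 9.0, 11.0, 40.0]

mean_return = statistics.mean(returns)      # sum of observations / count
median_return = statistics.median(returns)  # average of the two middle values (N is even)
mode_return = statistics.mode(returns)      # most frequent value

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
# The "inclusive" interpolation method is an assumed convention.
q1, q2, q3 = statistics.quantiles(returns, n=4, method="inclusive")
interquartile_range = q3 - q1

print(f"mean={mean_return}, median={median_return}, mode={mode_return}")
print(f"Q1={q1}, Q3={q3}, IQR={interquartile_range}")
```

In this sample the mean (8.5) sits above the median (6.0) because of the one extreme gain, which is the situation where the median is the better description of a typical year.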

Dispersion

Range is maximum minus minimum but uses only two points. Mean absolute deviation (MAD) averages absolute deviations from the mean. Variance averages squared deviations; standard deviation is its square root and is reported in return units, making it the workhorse risk measure. Critically, sample variance divides by n - 1 (a degrees-of-freedom adjustment because the sample mean was estimated from the same data), while population variance divides by N. Exam stems explicitly state whether the data are a sample or a population, and the denominator choice changes the answer.

The coefficient of variation (CV) is standard deviation / mean. It measures risk per unit of expected return, so a lower CV is preferred among investments with positive expected returns. CV becomes misleading when the mean is zero or negative.
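The sketch below computes range, MAD, and CV for two hypothetical funds (the names `fund_a` and `fund_b` and their returns are made up for illustration). It assumes the sample standard deviation is the numerator of CV and refuses to compute CV when the mean is not positive.

```python
import statistics

def value_range(values):
    """Maximum minus minimum; uses only the two extreme observations."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Average absolute deviation from the arithmetic mean."""
    center = statistics.mean(values)
    return sum(abs(v - center) for v in values) / len(values)

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed convention)."""
    center = statistics.mean(values)
    if center <= 0:
        raise ValueError("CV is misleading when the mean is zero or negative")
    return statistics.stdev(values) / center

# Hypothetical annual returns in percent
fund_a = [2.0, 6.0, 10.0, 14.0]
fund_b = [8.0, 10.0, 12.0, 14.0]

for name, fund in (("A", fund_a), ("B", fund_b)):
    print(
        f"Fund {name}: range={value_range(fund):.1f}, "
        f"MAD={mean_absolute_deviation(fund):.1f}, "
        f"CV={coefficient_of_variation(fund):.3f}"
    )
```

Fund B has both the higher mean and the lower dispersion, so its CV is lower and it carries less risk per unit of return under this measure.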

Shape: skewness and kurtosis

Returns are frequently asymmetric. Positive (right) skew has a long right tail and pulls the mean above the median; negative (left) skew has a long left tail (more dangerous for investors) and pulls the mean below the median. For a symmetric, unimodal distribution, mean = median = mode. Kurtosis describes tail thickness; the normal distribution has kurtosis of 3, so excess kurtosis is kurtosis minus 3. A leptokurtic distribution (positive excess kurtosis) has fatter tails and a sharper peak, producing more extreme outcomes than the normal model predicts.

Most equity-return series are negatively skewed and leptokurtic, which is why pure normal probabilities understate crash risk.
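One way to see these shape measures at work is to compute them from central moments. The sketch below uses the population (moment-based) formulas, which is an assumption: exam formulas and statistical packages often apply small-sample adjustments that give somewhat different values. The crash-year series is hypothetical.

```python
def central_moment(values, order):
    """Average of (value - mean) raised to the given power."""
    center = sum(values) / len(values)
    return sum((v - center) ** order for v in values) / len(values)

def skewness(values):
    """Third central moment scaled by the standard deviation cubed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth central moment scaled by variance squared, minus 3 (the normal benchmark)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical returns in percent with one crash year (long left tail)
crash_series = [-30.0, 2.0, 4.0, 5.0, 6.0, 7.0, 8.0]
# Evenly spaced, symmetric series for comparison
symmetric_series = [4.0, 8.0, 12.0, 16.0]

print(f"crash series: skew={skewness(crash_series):.2f}, "
      f"excess kurtosis={excess_kurtosis(crash_series):.2f}")
print(f"symmetric series: skew={skewness(symmetric_series):.2f}, "
      f"excess kurtosis={excess_kurtosis(symmetric_series):.2f}")
```

The crash series comes out negatively skewed with positive excess kurtosis, while the symmetric series has zero skew and negative excess kurtosis (thinner tails than the normal benchmark).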

The normal distribution and z-scores

The normal distribution is symmetric, fully described by its mean and variance, has skewness of zero and excess kurtosis of zero, and is the basis of mean-variance analysis. A z-score standardizes any observation: z = (x - mean)/standard deviation. Memorize these landmarks:

  • About 68% of observations lie within +/- 1 standard deviation, 95% within +/- 2, and 99.7% within +/- 3.
  • One-tailed critical z-values: 1.65 for 5%, 2.33 for 1%.
  • Two-tailed critical z-values: 1.65 for 10%, 1.96 for 5%, 2.58 for 1% (so a 90% interval uses 1.65, a 95% interval uses 1.96, and a 99% interval uses 2.58).

A return of 14% drawn from a normal distribution with mean 8% and standard deviation 3% has z = (14 - 8)/3 = 2.0, placing it two standard deviations above the mean.
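The landmarks above can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper names `z_score` and `prob_within` are illustrative, not part of any exam formula sheet.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

def prob_within(k):
    """Probability that a normal observation lies within +/- k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# The example from the text: a 14% return, mean 8%, standard deviation 3%
z = z_score(14.0, 8.0, 3.0)
print(f"z = {z:.1f}")

# Empirical-rule landmarks
for k in (1, 2, 3):
    print(f"within +/- {k} sd: {prob_within(k):.4f}")

# Critical values recovered from tail probabilities
one_tail_5pct = standard_normal.inv_cdf(0.95)    # about 1.645
one_tail_1pct = standard_normal.inv_cdf(0.99)    # about 2.326
two_tail_5pct = standard_normal.inv_cdf(0.975)   # about 1.960
two_tail_1pct = standard_normal.inv_cdf(0.995)   # about 2.576
print(one_tail_5pct, one_tail_1pct, two_tail_5pct, two_tail_1pct)
```

Note that +/- 2 standard deviations covers slightly more than 95% (about 95.45%); the exact 95% two-tailed value is 1.96, which is why both numbers appear in the landmarks.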

| Concept | Formula or meaning | Candidate use |
| --- | --- | --- |
| Arithmetic mean | sum x / n | Average return |
| Median | Middle sorted value | Outlier-robust center |
| Sample variance | sum(x - xbar)^2 / (n - 1) | Total variability |
| Standard deviation | sqrt(variance) | Risk in return units |
| Coefficient of variation | s / mean | Risk per unit of return |
| Skewness | Direction of long tail | Asymmetry of outcomes |
| Excess kurtosis | Kurtosis minus 3 | Fat-tail / extreme-loss risk |
| z-score | (x - mean) / s | Standardized position |

Downside risk measures

Standard deviation penalizes upside and downside equally, but investors are mainly concerned with losses. Semivariance uses only returns below the mean; target semideviation measures dispersion below a stated target. Shortfall risk is the probability that the return falls below a threshold. Value at risk (VaR) estimates the minimum loss expected with a stated probability over a stated horizon (for example, a 5% one-day VaR of 1 million means losses should exceed 1 million on only about 5% of days).

Match the measure to the decision: a pension with a required return cares about shortfall probability, while an option book with rare large losses needs skewness and kurtosis, not just standard deviation.
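The following sketch illustrates three of these downside measures on hypothetical numbers. Several conventions are assumptions rather than facts from the text: the target semideviation divides by n - 1, shortfall risk is estimated as a simple historical frequency, and VaR is computed parametrically under a normal-return assumption (historical and simulation-based VaR are common alternatives).

```python
import math
from statistics import NormalDist

def target_semideviation(returns, target):
    """Square root of summed squared shortfalls below the target, divided by n - 1 (assumed convention)."""
    downside = sum((r - target) ** 2 for r in returns if r < target)
    return math.sqrt(downside / (len(returns) - 1))

def shortfall_frequency(returns, threshold):
    """Share of observed returns that fell below the threshold."""
    return sum(1 for r in returns if r < threshold) / len(returns)

def parametric_var(mean, std_dev, tail_prob, portfolio_value):
    """Loss exceeded with probability tail_prob, assuming normally distributed returns."""
    cutoff_return = NormalDist(mean, std_dev).inv_cdf(tail_prob)
    return -cutoff_return * portfolio_value

# Hypothetical annual returns in percent
returns = [-8.0, -2.0, 1.0, 3.0, 4.0, 6.0, 7.0, 9.0, 10.0, 12.0]

semidev = target_semideviation(returns, target=3.0)
shortfall = shortfall_frequency(returns, threshold=0.0)

# Hypothetical daily mean 0.05%, daily standard deviation 1%, portfolio of 10 million
one_day_var = parametric_var(0.0005, 0.01, 0.05, 10_000_000)

print(f"target semideviation (target 3%): {semidev:.2f}")
print(f"shortfall frequency below 0%: {shortfall:.0%}")
print(f"5% one-day VaR: {one_day_var:,.0f}")
```

Because the VaR figure rests on the normal assumption, it will tend to understate losses for return series with negative skew and fat tails.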

Worked dispersion example

Consider four annual returns: 4%, 8%, 12%, and 16%. The arithmetic mean is (4 + 8 + 12 + 16)/4 = 10%. The deviations from the mean are -6, -2, +2, and +6, whose squares sum to 36 + 4 + 4 + 36 = 80. As a sample, divide by n - 1 = 3 to get a variance of 80/3 = 26.67 (in percent squared), so the sample standard deviation is sqrt(26.67) = 5.16%. As a population, divide by N = 4 to get a variance of 20 and a standard deviation of sqrt(20) = 4.47%. The coefficient of variation for the sample case is 5.16/10 = 0.52, meaning roughly half a percentage point of standard deviation per percentage point of mean return.

This single example shows why reading sample versus population in the stem is not optional: the same data yield two different standard deviations.
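The arithmetic above can be reproduced with Python's standard `statistics` module, which exposes separate functions for the two denominators. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

# The four annual returns from the worked example, in percent
returns = [4.0, 8.0, 12.0, 16.0]

mean_return = statistics.mean(returns)

# Sample statistics divide the sum of squared deviations by n - 1
sample_variance = statistics.variance(returns)
sample_std_dev = statistics.stdev(returns)

# Population statistics divide by N
population_variance = statistics.pvariance(returns)
population_std_dev = statistics.pstdev(returns)

# Coefficient of variation for the sample case
cv = sample_std_dev / mean_return

print(f"mean = {mean_return:.2f}")
print(f"sample variance = {sample_variance:.2f}, sample sd = {sample_std_dev:.2f}")
print(f"population variance = {population_variance:.2f}, population sd = {population_std_dev:.2f}")
print(f"CV (sample) = {cv:.2f}")
```

The gap between the two standard deviations shrinks as n grows, since dividing by n - 1 and dividing by N converge for large samples.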

Exam tactics

On the exam, keep these tactics in mind:

  • If the mean exceeds the median, think positive skew.
  • If the tails are fat (positive excess kurtosis), normal probabilities understate extreme events such as crashes.
  • Between two funds with equal means, the one with the lower standard deviation has lower total volatility.
  • When a question gives a target return and asks for the chance of underperforming, compute a z-score relative to that target and read the normal table.
  • When it asks which of two assets is riskier per unit of return, compare coefficients of variation rather than raw standard deviations, because scale differs across the two return series.
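The target-return tactic can be sketched as follows, with the standard normal CDF standing in for the printed normal table. The inputs (a 5% target, 8% mean, 6% standard deviation) are hypothetical, and the calculation assumes returns are normally distributed.

```python
from statistics import NormalDist

def shortfall_probability(target, mean, std_dev):
    """Return the z-score of the target and the normal probability of falling below it."""
    z = (target - mean) / std_dev
    return z, NormalDist().cdf(z)

# Hypothetical inputs in percent: target 5%, mean 8%, standard deviation 6%
z, prob = shortfall_probability(target=5.0, mean=8.0, std_dev=6.0)
print(f"z = {z:.2f}, P(return < target) = {prob:.4f}")
```

A negative z-score means the target lies below the mean, so the shortfall probability is under 50%; here z = -0.5 corresponds to a probability of roughly 31%.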

Test Your Knowledge

A return distribution has a mean greater than its median. The distribution is best described as:

Test Your Knowledge

A sample of five annual returns has a sample mean of 6%. When calculating the sample variance, the sum of squared deviations is divided by:

Test Your Knowledge

Returns are normally distributed with a mean of 8% and standard deviation of 3%. The probability of a return below 2% is closest to:
