3.5 Sampling, Estimation, and Hypothesis Testing

Key Takeaways

A statistic describes a sample, while a parameter describes the population.
Sampling distributions allow analysts to estimate uncertainty around sample statistics.
Confidence intervals combine a point estimate, reliability factor, and standard error.
Hypothesis tests evaluate evidence against a null hypothesis using a test statistic, rejection region, or p-value.

Last updated: May 2026

Investment analysts rarely observe a full population. They work with samples of returns, fund performance, defaults, transactions, or survey responses. Inference is the process of using sample evidence to make statements about a population while acknowledging sampling error.

A parameter is a numerical feature of a population, such as the true mean return of a strategy. A statistic is a numerical feature of a sample, such as the average return observed over 60 months. Statistics vary from sample to sample, so estimates should be paired with uncertainty measures.

Sampling methods affect reliability. Simple random sampling gives each population member an equal chance of selection. Stratified random sampling divides the population into defined groups (strata) and draws a random sample from each, which can improve representation. Time-series financial data require extra care because observations can be correlated across time.
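The difference between the two sampling methods can be sketched in a few lines of Python. This is an illustration only: the fund population, the `stratum_of` function, and the 10% sampling fraction are hypothetical, and proportional allocation is just one way to size the strata.

```python
import random

def simple_random_sample(population, k, seed=0):
    """Every member has the same chance of selection."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(population, k)

def stratified_random_sample(population, stratum_of, fraction, seed=0):
    """Group members into strata, then sample the same fraction of each stratum."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 80 equity funds and 20 bond funds
funds = [("equity", i) for i in range(80)] + [("bond", i) for i in range(20)]

srs = simple_random_sample(funds, 10)
strat = stratified_random_sample(funds, lambda fund: fund[0], 0.10)

equity_in_strat = sum(1 for fund in strat if fund[0] == "equity")
bond_in_strat = sum(1 for fund in strat if fund[0] == "bond")
print(len(srs), equity_in_strat, bond_in_strat)
```

The simple random sample may by chance contain no bond funds at all, while the stratified sample always mirrors the 80/20 split of the population.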

The sampling distribution of the sample mean is the distribution of sample means across repeated samples. Its standard deviation is the standard error: sigma / sqrt(n) when the population standard deviation sigma is known, or s / sqrt(n) when it is unknown and the sample standard deviation s is used. Larger samples reduce standard error, but only with the square root of n: quadrupling the sample size halves the standard error.
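As a quick numerical check of s / sqrt(n), the sketch below computes the standard error from raw data and from summary figures. The return series and the summary inputs (s = 10, n = 25 and n = 100) are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean from raw observations: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))

def standard_error_from_summary(s, n):
    """Standard error of the sample mean from a given s and sample size n."""
    return s / math.sqrt(n)

# Hypothetical monthly returns, in percent
returns = [1.2, -0.8, 2.5, 0.3, -1.1, 1.9, 0.7, -0.4]
se_returns = standard_error(returns)

# Same sample standard deviation, different sample sizes
se_25 = standard_error_from_summary(10.0, 25)
se_100 = standard_error_from_summary(10.0, 100)
print(se_returns, se_25, se_100)
```

With s held at 10, moving from 25 to 100 observations cuts the standard error from 2.0 to 1.0.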

The central limit theorem says that, for a sufficiently large sample, the sampling distribution of the sample mean is approximately normal, even when the population is not normal, if observations are independent and identically distributed with finite variance. This is why normal and t tools appear so often in inference.
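A small simulation can make the theorem concrete. The sketch below assumes an exponential population (strongly skewed, with mean 1 and standard deviation 1), a sample size of 50, and 2,000 repeated samples; all three choices are arbitrary and only serve the illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials, seed=42):
    """Draw `trials` independent samples of size n and return their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # skewed population
        means.append(statistics.fmean(sample))
    return means

n, trials = 50, 2000
means = simulate_sample_means(n, trials)

center = statistics.fmean(means)     # should sit near the population mean, 1.0
spread = statistics.stdev(means)     # should sit near sigma / sqrt(n)
theoretical_se = 1.0 / math.sqrt(n)

# Share of sample means within 1.96 standard errors of the population mean
within = sum(abs(m - 1.0) <= 1.96 * theoretical_se for m in means) / trials
print(center, spread, theoretical_se, within)
```

If the normal approximation is reasonable, roughly 95% of the simulated means should fall within 1.96 standard errors of the population mean, even though individual observations are far from normal.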

A confidence interval has three parts: point estimate, reliability factor, and standard error. The basic form is estimate +/- reliability factor x standard error. A wider interval reflects a higher confidence level, greater variability, or a smaller sample. A narrower interval reflects a lower confidence level, lower variability, or a larger sample.
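The three-part form translates directly into code. In this sketch the inputs (mean return 0.8%, s = 4.0%, n = 64) are hypothetical, and the reliability factor comes from the standard normal distribution, which is a large-sample assumption; with a small sample and unknown variance a t-based factor would be slightly larger.

```python
from math import sqrt
from statistics import NormalDist

def z_reliability_factor(confidence_level):
    """Two-sided standard normal reliability factor, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def confidence_interval(estimate, standard_error, reliability_factor):
    """Point estimate +/- reliability factor x standard error."""
    margin = reliability_factor * standard_error
    return estimate - margin, estimate + margin

# Hypothetical inputs, in percent
mean_return, s, n = 0.8, 4.0, 64
se = s / sqrt(n)  # 4.0 / 8 = 0.5

ci_95 = confidence_interval(mean_return, se, z_reliability_factor(0.95))
ci_99 = confidence_interval(mean_return, se, z_reliability_factor(0.99))
print(ci_95, ci_99)
```

The 95% interval is roughly -0.18% to 1.78%. The 99% interval uses a larger reliability factor and is therefore wider around the same point estimate.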

Inference tool	Core idea	Common Level I use
Standard error	`s / sqrt(n)`	Precision of sample mean
z-test	Known variance or large-sample setting	Mean or proportion test
t-test	Unknown variance with sample s	Mean test
Chi-square test	Count or variance setting	Independence or variance
F-test	Ratio of variances	Comparing variances
p-value	Smallest significance level for rejection	Evidence strength

Hypothesis testing begins with a null hypothesis, H0, and an alternative hypothesis, Ha. The null is the statement tested directly. A two-tailed test looks for a difference in either direction. A one-tailed test looks for evidence in a specified direction. The alternative determines the rejection region.
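To show how the alternative hypothesis shapes the rejection region, the sketch below computes critical values for a z-test. The function name and the `tail` labels are invented for this example, and the standard normal distribution is assumed to apply.

```python
from statistics import NormalDist

def z_critical_values(alpha, tail):
    """Return (lower, upper) critical values; None means no cutoff on that side."""
    std_normal = NormalDist()
    if tail == "two":
        cutoff = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
        return -cutoff, cutoff
    if tail == "upper":
        return None, std_normal.inv_cdf(1 - alpha)  # all of alpha in the right tail
    if tail == "lower":
        return std_normal.inv_cdf(alpha), None      # all of alpha in the left tail
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

two_tailed = z_critical_values(0.05, "two")
upper_tailed = z_critical_values(0.05, "upper")
lower_tailed = z_critical_values(0.05, "lower")
print(two_tailed, upper_tailed, lower_tailed)
```

At a 5% significance level the two-tailed cutoffs are about +/-1.96, while a one-tailed test places the whole 5% in one tail and uses a cutoff of about 1.645 in absolute value.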

A test statistic compares the sample result with the hypothesized value in standard-error units. For a mean, t = (sample mean - hypothesized mean) / standard error when sample standard deviation is used. Reject the null if the test statistic falls in the rejection region or if the p-value is less than the significance level.
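A worked version of the t statistic, as a minimal sketch. The inputs (sample mean 1.5%, hypothesized mean 1.0%, s = 2.0%, n = 36) are hypothetical, and the two-tailed 5% critical value of roughly 2.030 for 35 degrees of freedom is taken from a standard t-table rather than computed.

```python
from math import sqrt

def t_statistic(sample_mean, hypothesized_mean, s, n):
    """(sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = s / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical inputs, in percent
t_stat = t_statistic(sample_mean=1.5, hypothesized_mean=1.0, s=2.0, n=36)

# Assumed table value: two-tailed test, 5% significance, n - 1 = 35 degrees of freedom
T_CRITICAL = 2.030

reject_null = abs(t_stat) > T_CRITICAL
print(t_stat, reject_null)
```

Here the standard error is 2.0 / 6, so the sample mean lies 1.5 standard errors above the hypothesized value. That is inside the +/-2.030 bounds, so the null is not rejected at the 5% level.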

A Type I error is rejecting a correct null hypothesis. Its probability is the significance level, alpha. A Type II error is failing to reject a false null hypothesis. The power of a test is 1 minus the probability of a Type II error, that is, the probability of correctly rejecting a false null. Lowering alpha reduces Type I risk but can increase Type II risk if sample size stays the same.
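The trade-off between alpha and Type II risk can be illustrated with a power calculation. This sketch assumes a simplified setting that the text does not specify: an upper-tailed z-test with a known population standard deviation and a specific true mean. All numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail_z(true_mean, hypothesized_mean, sigma, n, alpha):
    """Probability of rejecting H0 in an upper-tailed z-test when the mean is true_mean."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z_critical = std_normal.inv_cdf(1 - alpha)
    shift = (true_mean - hypothesized_mean) / standard_error
    return 1 - std_normal.cdf(z_critical - shift)

# Hypothetical setting: H0 mean 1.0, true mean 1.5, sigma 2.0
power_05 = power_upper_tail_z(1.5, 1.0, 2.0, 36, alpha=0.05)
power_01 = power_upper_tail_z(1.5, 1.0, 2.0, 36, alpha=0.01)
power_05_big_n = power_upper_tail_z(1.5, 1.0, 2.0, 144, alpha=0.05)

type_ii_05 = 1 - power_05
type_ii_01 = 1 - power_01
print(power_05, power_01, power_05_big_n)
```

With n fixed at 36, tightening alpha from 5% to 1% lowers power from about 0.44 to about 0.20, so the Type II probability rises. Increasing the sample to 144 at the 5% level raises power to about 0.91.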

Tests of independence often use a chi-square statistic on a contingency table. The question is whether classification in one category is independent of classification in another. Expected counts are calculated from row totals, column totals, and grand total. Large differences between observed and expected counts support rejection.
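The expected-count calculation can be sketched as follows. The 2 x 2 table of observed counts is invented for illustration, and the 5% critical value of 3.841 for one degree of freedom is taken from a standard chi-square table.

```python
def expected_counts(observed):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

def degrees_of_freedom(observed):
    """(rows - 1) x (columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)

# Hypothetical contingency table: rows are two sectors, columns are two rating groups
observed = [[30, 20],
            [20, 30]]

expected = expected_counts(observed)
chi_sq = chi_square_statistic(observed)
df = degrees_of_freedom(observed)

CHI_SQ_CRITICAL = 3.841  # assumed table value: 5% significance, 1 degree of freedom
reject_independence = chi_sq > CHI_SQ_CRITICAL
print(expected, chi_sq, df, reject_independence)
```

Every expected count is 25, each cell contributes 25 / 25 = 1, and the statistic of 4.0 exceeds 3.841, so independence is rejected at the 5% level in this made-up table.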

Exam questions usually provide enough data to identify the test. If the question is about a mean with unknown variance, think t. If it is about independence in a table, think chi-square. If it asks whether evidence is statistically significant, compare p-value with alpha or compare the statistic with the critical value.
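The p-value comparison can also be sketched in code. The example assumes a two-tailed z-test and a hypothetical test statistic of 2.1; neither comes from the text.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat):
    """Probability of a standard normal value at least as extreme as z_stat in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

def is_significant(p_value, alpha):
    """Reject the null when the p-value is below the significance level."""
    return p_value < alpha

p = two_tailed_p_value(2.1)  # hypothetical test statistic
significant_at_5pct = is_significant(p, 0.05)
significant_at_1pct = is_significant(p, 0.01)
print(p, significant_at_5pct, significant_at_1pct)
```

The p-value is about 0.036, so the result is significant at the 5% level but not at the 1% level. Comparing 2.1 with the critical values 1.96 and 2.576 gives the same two decisions.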

Test Your Knowledge

A sample has standard deviation 18% and 36 observations. The standard error of the sample mean is:

A. 3.0%
B. 0.5%
C. 6.0%

Test Your Knowledge

Rejecting a correct null hypothesis is best described as:

A. Type I error
B. Type II error
C. Loss of test power

Test Your Knowledge

A researcher tests whether sector classification and credit rating category are independent in a contingency table. The most appropriate test is a:

A. Chi-square test
B. Paired t-test
C. One-sample z-test
