3.5 Sampling, Estimation, and Hypothesis Testing

Key Takeaways

  • A statistic describes a sample, while a parameter describes the population.
  • Sampling distributions allow analysts to estimate uncertainty around sample statistics.
  • Confidence intervals combine a point estimate, reliability factor, and standard error.
  • Hypothesis tests evaluate evidence against a null hypothesis using a test statistic, rejection region, or p-value.
Last updated: May 2026

Investment analysts rarely observe a full population. They work with samples of returns, fund performance, defaults, transactions, or survey responses. Inference is the process of using sample evidence to make statements about a population while acknowledging sampling error.

A parameter is a numerical feature of a population, such as the true mean return of a strategy. A statistic is a numerical feature of a sample, such as the average return observed over 60 months. Statistics vary from sample to sample, so estimates should be paired with uncertainty measures.

Sampling methods affect reliability. Simple random sampling gives each population member an equal chance of selection. Stratified random sampling samples within defined groups, which can improve representation. Time-series financial data require extra care because observations can be correlated across time.
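The two sampling schemes can be sketched in a few lines of Python. This is only an illustration: the fund universe, the function names, and the proportional-allocation rule are assumptions made for the example, not a prescribed procedure.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, k, seed=None):
    """Sample each stratum roughly in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding means the total may differ from k.
        take = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, take))
    return sample

# Hypothetical universe: 80 large-cap funds and 20 small-cap funds.
funds = [("large", i) for i in range(80)] + [("small", i) for i in range(20)]

srs = simple_random_sample(funds, 10, seed=1)
strat = stratified_sample(funds, lambda fund: fund[0], 10, seed=1)
```

With simple random sampling the number of small-cap funds in the sample varies from draw to draw. The stratified version here always returns 8 large-cap and 2 small-cap funds, which is the sense in which stratification can improve representation.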

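The sampling distribution of the sample mean is the distribution of sample means across repeated samples of the same size. Its standard deviation is the standard error: sigma / sqrt(n) when the population standard deviation sigma is known, estimated by s / sqrt(n) when sigma is unknown and the sample standard deviation s is used in its place. Larger samples reduce the standard error because more observations provide more information, but only at the rate of sqrt(n): quadrupling the sample size halves the standard error.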
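As a quick check on the formula, the following sketch computes the estimated standard error in Python. The function names and the input figures are illustrative assumptions.

```python
import math
import statistics

def standard_error_from_s(s, n):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def standard_error(sample):
    """Same quantity computed directly from raw observations."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical inputs: sample standard deviation of 20%.
se_100 = standard_error_from_s(0.20, 100)   # 0.02
se_400 = standard_error_from_s(0.20, 400)   # 0.01

# Hypothetical monthly returns used to exercise the raw-data version.
monthly_returns = [0.012, -0.004, 0.021, 0.007, -0.015, 0.010, 0.003, 0.018]
se_raw = standard_error(monthly_returns)
```

Going from 100 to 400 observations cuts the standard error from 2% to 1%.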

The central limit theorem says that, for a sufficiently large sample, the sampling distribution of the sample mean is approximately normal, even when the population is not normal, if observations are independent and identically distributed with finite variance. This is why normal and t tools appear so often in inference.
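A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (strongly skewed, so clearly not normal). The sample size, number of trials, and seed are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_sample_means(draw, n, trials, seed=0):
    """Return the means of `trials` independent samples of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(trials)
    ]

# Skewed population: exponential with rate 1 (mean 1, standard deviation 1).
means = simulate_sample_means(lambda rng: rng.expovariate(1.0), n=50, trials=2000)

center = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to 1 / sqrt(50), about 0.14
```

A histogram of `means` would look roughly bell-shaped even though individual draws are heavily skewed, and its spread is close to the standard error implied by sigma / sqrt(n).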

A confidence interval has three parts: point estimate, reliability factor, and standard error. The basic form is estimate +/- reliability factor x standard error. A wider interval reflects a higher confidence level, greater variability, or a smaller sample. A narrower interval reflects a lower confidence level, less variability, or a larger sample.
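The three-part form translates directly into code. This sketch assumes a large sample, so the reliability factor is taken from the standard normal distribution; with a small sample and unknown variance a t-based factor would be used instead. The inputs are hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(point_estimate, s, n, confidence=0.95):
    """Interval = point estimate +/- reliability factor x standard error."""
    # Two-sided z reliability factor, e.g. about 1.96 for 95% confidence.
    reliability = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = s / math.sqrt(n)
    margin = reliability * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical sample: mean return 8%, standard deviation 20%, 100 observations.
low_95, high_95 = mean_confidence_interval(0.08, 0.20, 100, 0.95)  # about (0.0408, 0.1192)
low_99, high_99 = mean_confidence_interval(0.08, 0.20, 100, 0.99)  # wider interval
```

Raising the confidence level from 95% to 99% increases the reliability factor and therefore widens the interval, with the point estimate and standard error unchanged.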

Inference tool     Core idea                                    Common Level I use
Standard error     s / sqrt(n)                                  Precision of sample mean
z-test             Known variance or large-sample setting       Mean or proportion test
t-test             Unknown variance with sample s               Mean test
Chi-square test    Count or variance setting                    Independence or variance
F-test             Ratio of variances                           Comparing variances
p-value            Smallest significance level for rejection    Evidence strength

Hypothesis testing begins with a null hypothesis, H0, and an alternative hypothesis, Ha. The null is the statement tested directly. A two-tailed test looks for a difference in either direction. A one-tailed test looks for evidence in a specified direction. The alternative determines the rejection region.
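To see how the alternative hypothesis shapes the rejection region, the sketch below computes critical values for a z-test. The function name and the "two" / "right" / "left" labels are assumptions for the example.

```python
from statistics import NormalDist

def z_critical_values(alpha, tail):
    """Return (lower, upper) critical values; None means no cutoff on that side."""
    z = NormalDist()
    if tail == "two":      # Ha: parameter differs from the hypothesized value
        cutoff = z.inv_cdf(1 - alpha / 2)
        return -cutoff, cutoff
    if tail == "right":    # Ha: parameter is greater than the hypothesized value
        return None, z.inv_cdf(1 - alpha)
    if tail == "left":     # Ha: parameter is less than the hypothesized value
        return z.inv_cdf(alpha), None
    raise ValueError("tail must be 'two', 'right', or 'left'")

two_tailed = z_critical_values(0.05, "two")      # about (-1.96, 1.96)
right_tailed = z_critical_values(0.05, "right")  # about (None, 1.645)
```

At the same significance level, a two-tailed test splits alpha across both tails, so its cutoffs sit farther from zero than the single cutoff of a one-tailed test.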

A test statistic compares the sample result with the hypothesized value in standard-error units. For a mean, t = (sample mean - hypothesized mean) / standard error when sample standard deviation is used. Reject the null if the test statistic falls in the rejection region or if the p-value is less than the significance level.
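A minimal sketch of the calculation and the decision rule follows. The sample figures are hypothetical, and the critical value is supplied by hand as it would be read from a t-table (roughly 2.00 for a two-tailed 5% test with 59 degrees of freedom).

```python
import math

def t_statistic(sample_mean, hypothesized_mean, s, n):
    """Distance between sample and hypothesized mean, in standard-error units."""
    standard_error = s / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

def reject_two_tailed(t, critical_value):
    """Reject H0 when the statistic falls in either tail's rejection region."""
    return abs(t) > critical_value

# Hypothetical data: 60 monthly returns, mean 0.9%, standard deviation 3%.
# H0: true mean return = 0; Ha: true mean return != 0.
t = t_statistic(0.009, 0.0, 0.03, 60)       # about 2.32
reject = reject_two_tailed(t, 2.001)        # critical value taken from a t-table
```

Here the statistic of about 2.32 exceeds the critical value, so the null hypothesis of a zero mean return would be rejected at the 5% level.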

A Type I error is rejecting a true null hypothesis. Its probability is the significance level, alpha. A Type II error is failing to reject a false null hypothesis; its probability is often written beta. The power of a test is 1 - beta, the probability of correctly rejecting a false null. Lowering alpha reduces Type I risk but increases Type II risk, and so lowers power, if sample size stays the same.
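The trade-off between the two error types can be illustrated numerically. The sketch below assumes the simplest case, a right-tailed z-test with known population standard deviation; the shift, sigma, and sample size are hypothetical.

```python
import math
from statistics import NormalDist

def power_right_tailed_z(true_shift, sigma, n, alpha):
    """Probability of rejecting H0 when the true mean exceeds the hypothesized
    mean by `true_shift` (right-tailed z-test, sigma assumed known)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)
    shift_in_se = true_shift / (sigma / math.sqrt(n))
    return 1 - z.cdf(critical - shift_in_se)

# Hypothetical setting: true shift 0.5, sigma 2, n = 64 (standard error 0.25).
power_at_5pct = power_right_tailed_z(0.5, 2.0, 64, alpha=0.05)   # roughly 0.64
power_at_1pct = power_right_tailed_z(0.5, 2.0, 64, alpha=0.01)   # roughly 0.37

# With no true shift, the rejection probability is just alpha (the Type I rate).
type_one_rate = power_right_tailed_z(0.0, 2.0, 64, alpha=0.05)
```

Tightening alpha from 5% to 1% with the same sample drops power from roughly 0.64 to roughly 0.37, so the probability of a Type II error rises from about 0.36 to about 0.63.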

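Tests of independence often use a chi-square statistic on a contingency table. The question is whether classification in one category is independent of classification in another. The expected count for each cell is (row total x column total) / grand total. The statistic sums (observed - expected)^2 / expected across all cells, with (rows - 1) x (columns - 1) degrees of freedom. Large differences between observed and expected counts produce a large statistic and support rejection of independence.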
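The expected-count calculation is easy to get wrong by hand, so a short sketch may help. The 2 x 2 table is hypothetical, and the critical value quoted in the comment (3.841 for 1 degree of freedom at the 5% level) comes from a standard chi-square table.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are two sectors, columns are two rating categories.
table = [[30, 20],
         [20, 30]]
statistic, df = chi_square_independence(table)   # statistic 4.0, df 1

# 3.841 is the 5% critical value for 1 degree of freedom.
reject_independence = statistic > 3.841
```

Every expected count in this table is 25, each cell contributes 1.0 to the statistic, and the total of 4.0 exceeds 3.841, so independence would be rejected at the 5% level.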

Exam questions usually provide enough data to identify the test. If the question is about a mean with unknown variance, think t. If it is about independence in a table, think chi-square. If it asks whether evidence is statistically significant, compare p-value with alpha or compare the statistic with the critical value.

Test Your Knowledge

A sample has standard deviation 18% and 36 observations. The standard error of the sample mean is:


Rejecting a correct null hypothesis is best described as:


A researcher tests whether sector classification and credit rating category are independent in a contingency table. The most appropriate test is a:
