9.5 Probability, Statistics, and the Normal Distribution
Key Takeaways
- The sample standard deviation uses n−1 in the denominator: s = √[Σ(x−x̄)²/(n−1)], where Σ(x−x̄)² is the sum of squared residuals.
- Random error follows the normal distribution; under the empirical 68-95-99.7 rule, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.
- The 50% (probable) error in surveying equals 0.6745·σ, the value historically reported as the 'probable error' of an observation.
- A confidence interval for the mean is x̄ ± (t or z)·(s/√n); the s/√n term is the standard error of the mean, which shrinks as more observations are taken.
Describing a Set of Observations
When a quantity is measured repeatedly, statistics summarize the scatter. Three measures of central tendency appear on the FS exam:
- Mean (arithmetic average): x̄ = Σx / n. For equally weighted observations, the mean is the most probable value.
- Median: the middle value when observations are sorted (or the average of the two middle values for an even count). It resists outliers.
- Mode: the most frequently occurring value.
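The three measures can be checked with Python's standard `statistics` module. This is a minimal sketch: the readings are hypothetical values chosen so that one value repeats, and the added 101.05 ft reading is an invented blunder used only to show how the median resists an outlier while the mean does not.

```python
import statistics

# Hypothetical repeated readings in feet (illustrative, not from the text).
readings = [100.03, 100.05, 100.05, 100.04, 100.08]

mean_value = statistics.mean(readings)      # arithmetic average, sum(x) / n
median_value = statistics.median(readings)  # middle value after sorting
modes = statistics.multimode(readings)      # list of the most frequent value(s)

print(f"mean   = {mean_value:.3f} ft")
print(f"median = {median_value:.3f} ft")
print(f"mode   = {modes}")

# Add one invented blunder: the mean shifts noticeably, the median does not.
with_blunder = readings + [101.05]
mean_blunder = statistics.mean(with_blunder)
median_blunder = statistics.median(with_blunder)
print(f"with blunder: mean = {mean_blunder:.3f} ft, median = {median_blunder:.3f} ft")
```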
Spread is captured by variance and standard deviation. The key FS detail is the n−1 denominator for a sample (Bessel's correction), which corrects the bias of estimating a population spread from a finite sample:
s = √[ Σ(xᵢ − x̄)² / (n − 1) ] and variance s² = Σ(xᵢ − x̄)² / (n − 1)
Worked standard deviation. Five distance measurements: 100.04, 100.02, 100.06, 100.01, 100.07 ft.
- Mean x̄ = (100.04+100.02+100.06+100.01+100.07)/5 = 500.20/5 = 100.040 ft
- Residuals v: 0.00, −0.02, +0.02, −0.03, +0.03; squared: 0, 0.0004, 0.0004, 0.0009, 0.0009; Σv² = 0.0026
- s = √(0.0026/(5−1)) = √(0.00065) = 0.0255 ft
Using n instead of n−1 would understate the spread (here √(0.0026/5) = 0.0228 ft instead of 0.0255 ft), which is why FS problems specify the sample formula.
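The worked standard deviation can be reproduced in a few lines. The sketch below uses the five distances from the example and computes the spread both ways, so the effect of the n−1 denominator is visible. The variable names are illustrative, and `statistics.stdev` is used only as a cross-check of the hand formula.

```python
import math
import statistics

distances = [100.04, 100.02, 100.06, 100.01, 100.07]  # ft, from the worked example

n = len(distances)
mean = sum(distances) / n
sum_sq = sum((x - mean) ** 2 for x in distances)  # sum of squared residuals

s_sample = math.sqrt(sum_sq / (n - 1))  # sample formula (Bessel's correction)
s_population = math.sqrt(sum_sq / n)    # n in the denominator understates the spread

print(f"mean         = {mean:.3f} ft")
print(f"sum of v^2   = {sum_sq:.4f}")
print(f"s (n-1)      = {s_sample:.4f} ft")
print(f"s (n)        = {s_population:.4f} ft")
print(f"library s    = {statistics.stdev(distances):.4f} ft")  # also uses n-1
```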
The Normal Distribution and the 68-95-99.7 Rule
Random errors of measurement cluster symmetrically about the true value, producing the bell-shaped normal (Gaussian) distribution. Its shape lets you state how often an observation should fall within a given band of the mean. The empirical (68-95-99.7) rule gives the standard benchmarks:
| Band around mean | Percent of observations |
|---|---|
| x̄ ± 1σ | ≈ 68.27% |
| x̄ ± 2σ | ≈ 95.45% |
| x̄ ± 3σ | ≈ 99.73% |
Two surveying-specific multipliers are worth memorizing. The probable error (50%) is E₅₀ = 0.6745·σ: half of all observations fall within ±0.6745σ. The commonly cited 95% error uses 1.96σ, often rounded to 2σ and loosely called the two-sigma error, although exactly ±2σ captures about 95.45%.
Worked normal reasoning. If a distance has σ = 0.030 ft, then about 68% of repeated readings lie within ±0.030 ft of the mean, about 95% within ±0.060 ft, and about 99.7% within ±0.090 ft. A reading 0.10 ft from the mean lies beyond 3σ and is statistically suspect — a likely blunder rather than ordinary scatter. This is exactly how the normal model supports quality control: it sets a defensible threshold for rejecting an observation instead of relying on a hunch.
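The percentages in the table follow from the normal distribution's cumulative function. As a minimal sketch, the fraction of a normal population within ±kσ equals erf(k/√2), which Python exposes as `math.erf`. The helper name `fraction_within` is an illustrative choice, and the 3σ rejection test at the end simply restates the worked reasoning above rather than prescribing a formal outlier procedure.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal population within ±k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (0.6745, 1.0, 1.96, 2.0, 3.0):
    print(f"±{k}σ captures {100 * fraction_within(k):.2f}%")

# Worked example: sigma = 0.030 ft, a reading 0.10 ft from the mean.
sigma = 0.030
deviation = 0.10
is_suspect = deviation > 3 * sigma  # beyond the 3-sigma band
print(f"3σ band = ±{3 * sigma:.3f} ft, reading suspect: {is_suspect}")
```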
Standard Error of the Mean and Confidence Intervals
A single mean computed from n observations is itself uncertain, but less uncertain than any one reading. The standard error of the mean (SEM) quantifies this:
σ_x̄ = σ / √n (estimated as s / √n)
Because of the √n, averaging more observations improves the mean's precision — but with diminishing returns. To halve the standard error you must take four times as many readings (√4 = 2).
Worked SEM. From the five distances above, s = 0.0255 ft, so the standard error of the mean is 0.0255/√5 = 0.0255/2.236 = 0.0114 ft. The mean (100.040 ft) is therefore far better determined than any individual reading.
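A short sketch makes the √n behaviour concrete. It reuses s = 0.0255 ft from the worked example. The function name `standard_error` is illustrative, and the 20-reading case is hypothetical, included only to show that quadrupling the number of readings halves the standard error when s stays the same.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Standard error of the mean, s / sqrt(n)."""
    return s / math.sqrt(n)

s = 0.0255  # ft, sample standard deviation from the worked example

sem_5 = standard_error(s, 5)    # the five readings in the example
sem_20 = standard_error(s, 20)  # hypothetical: four times as many readings, same s

print(f"SEM with n = 5  : {sem_5:.4f} ft")
print(f"SEM with n = 20 : {sem_20:.4f} ft")
print(f"ratio           : {sem_5 / sem_20:.1f}")
```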
A confidence interval brackets the true mean:
x̄ ± k · (s/√n)
For large samples or known σ, k is a normal multiplier (z = 1.96 for 95%); for small samples k is a Student's-t value with n−1 degrees of freedom. Using the example with a 95% z-multiplier: 100.040 ± 1.96·0.0114 = 100.040 ± 0.0223 ft, i.e., (100.018, 100.062). Because n = 5 is a small sample, the stricter choice is the t-value for 4 degrees of freedom (2.776), which widens the interval to 100.040 ± 0.0316 ft. The interpretation: we are 95% confident the true distance lies in this band. Note that the interval uses the standard error of the mean (s/√n), not the raw standard deviation s; confusing the two is a frequent FS trap.
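The interval arithmetic can be sketched as follows. The z-multiplier comes from `statistics.NormalDist` in the standard library. The t-multiplier 2.776 (95%, two-sided, 4 degrees of freedom) is typed in by hand from a standard t-table, since the standard library has no Student's-t function. The function and variable names are illustrative.

```python
from statistics import NormalDist

def confidence_interval(mean: float, sem: float, k: float) -> tuple[float, float]:
    """Return (low, high) for mean ± k * sem."""
    half_width = k * sem
    return mean - half_width, mean + half_width

mean = 100.040  # ft, from the worked example
sem = 0.0114    # ft, standard error of the mean from the worked example

z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% normal multiplier, about 1.96
T95_DF4 = 2.776                    # tabulated t-value, 4 degrees of freedom (entered by hand)

z_low, z_high = confidence_interval(mean, sem, z95)
t_low, t_high = confidence_interval(mean, sem, T95_DF4)

print(f"z interval: ({z_low:.3f}, {z_high:.3f}) ft")
print(f"t interval: ({t_low:.3f}, {t_high:.3f}) ft")
```

With only five readings the t interval is the wider of the two, which reflects the extra uncertainty in estimating σ from a small sample.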
Probability Basics and Surveying Error Multipliers
FS probability questions stay elementary. For independent events the multiplication rule gives P(A and B) = P(A)·P(B); for mutually exclusive events the addition rule gives P(A or B) = P(A) + P(B). A single measurement blunder that occurs with probability 0.02 independently on each of three sights gives a probability of at least one blunder of 1 − (1 − 0.02)³ = 1 − 0.9412 = 0.0588, about 5.9%. Recognizing 'at least one' as the complement of 'none' is the usual shortcut.
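The complement shortcut is easy to verify numerically. This is a minimal sketch that assumes, as the example does, that blunders on separate sights are independent. The function name `prob_at_least_one` is illustrative.

```python
def prob_at_least_one(p: float, trials: int) -> float:
    """P(at least one occurrence) = 1 - P(none), for independent trials."""
    return 1 - (1 - p) ** trials

p_blunder = 0.02  # probability of a blunder on any one sight
sights = 3

p_none = (1 - p_blunder) ** sights  # multiplication rule for independent events
p_any = prob_at_least_one(p_blunder, sights)

print(f"P(no blunder)           = {p_none:.4f}")
print(f"P(at least one blunder) = {p_any:.4f}")
```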
Surveying inherits a family of error multipliers built on the standard deviation, and the FS exam expects you to recognize them by name:
| Term | Multiplier on σ | Confidence captured |
|---|---|---|
| Standard error (one-sigma) | 1.000 | ≈ 68.3% |
| Probable error (50%) | 0.6745 | 50% |
| 90% error | 1.6449 | 90% |
| 95% error (approx. two-sigma) | 1.9600 | 95% |
| 99.7% (three-sigma) | 3.0000 | 99.7% |
Worked probable error. With σ = 0.030 ft, the probable error is 0.6745·0.030 = 0.0202 ft, meaning half of the observations should fall within ±0.020 ft of the mean. The same σ gives a 95% error of 1.96·0.030 = 0.059 ft. These multipliers let a surveyor translate a single computed standard deviation into whatever confidence level a specification demands. The deeper point reinforced throughout statistics on the FS exam: every stated tolerance is meaningless without its confidence level, because '±0.02 ft' at 50% and at 95% describe very different qualities of work.
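The tabulated multipliers can be regenerated from the inverse normal cumulative function rather than memorized blindly. The sketch below uses `statistics.NormalDist.inv_cdf` from the standard library. The helper name `multiplier` is illustrative, and σ = 0.030 ft is taken from the worked example.

```python
from statistics import NormalDist

def multiplier(confidence: float) -> float:
    """Two-sided normal multiplier k such that ±k·σ captures `confidence`."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

sigma = 0.030  # ft, from the worked example

for label, conf in [("probable (50%)", 0.50), ("90%", 0.90), ("95%", 0.95)]:
    k = multiplier(conf)
    print(f"{label:>15}: k = {k:.4f}, error = ±{k * sigma:.4f} ft")

probable_error = multiplier(0.50) * sigma
error_95 = multiplier(0.95) * sigma
```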
For a normally distributed set of observations with standard deviation σ, approximately what percentage falls within ±2σ of the mean?
Four measurements of an angle have a sample standard deviation of 8.0 arc-seconds. What is the standard error of the mean?
Which formula correctly gives the sample standard deviation of a set of n repeated measurements?