2.3 Forecasting
Key Takeaways
- Qualitative forecasting (Delphi, market research, executive opinion, sales-force composite) is used for new products or no/poor data; quantitative methods need historical data.
- Time-series methods (moving average, weighted moving average, exponential smoothing) project history forward; causal methods (regression) link demand to driver variables.
- Exponential smoothing forecast: F(t+1) = alpha × A(t) + (1 − alpha) × F(t); a higher alpha makes the forecast react faster to recent demand but makes it less stable.
- Forecast accuracy is measured with MAD and MAPE (magnitude of error); bias is detected with a tracking signal (cumulative error / MAD), and a value beyond about ±4 indicates an out-of-control forecast.
- Forecasts are more accurate when aggregated (higher product level, wider geography, longer time bucket) and for shorter horizons; they are less accurate at the SKU, single-location, or long-horizon level. Always pair a forecast with a measure of its error.
Forecasting is the analytical heart of demand management, and the CSCP exam includes quantitative items here — you must be able to compute a moving average, apply exponential smoothing, and evaluate forecast error. Memorize the formulas and, more importantly, understand what each result tells a planner.
Principles of Forecasting
Four principles the exam repeatedly tests:
- Forecasts are almost always wrong — plan for error with safety stock and flexibility.
- Aggregated forecasts are more accurate than disaggregated ones — a product family forecasts better than a single SKU.
- Longer horizons are less accurate than shorter ones — accuracy decays with time.
- Every forecast should include an estimate of error — a number without an error band is incomplete.
Qualitative vs. Quantitative Methods
| Type | When to Use | Examples |
|---|---|---|
| Qualitative (judgmental) | New product, no/poor history, long-range, disruptive change | Delphi method, market research, executive opinion, sales-force composite |
| Quantitative | Sufficient, relevant historical data exists | Time-series (moving average, exponential smoothing), causal regression |
Qualitative methods rely on informed judgment. The Delphi method uses iterative, anonymous rounds of expert input to converge on a consensus without dominant-personality bias, which is why it is favored for long-range and new-technology forecasts. Quantitative methods assume the future is statistically related to the past (time-series) or to measurable drivers (causal).
A company is launching a completely new product category with no sales history and wants to combine expert opinions while avoiding the bias of dominant personalities in a group meeting. Which forecasting method is the BEST fit?
Time-Series Methods
Time-series methods assume demand is a function of its own history, which can be decomposed into level, trend, seasonality, cycle, and random noise.
Simple Moving Average
The forecast equals the average of the last n periods. It smooths noise but lags trend and reacts slowly.
Example: Demand for the last 3 months = 100, 120, 110. A 3-month simple moving average forecast for next month = (100 + 120 + 110) / 3 = 110 units.
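The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a planning-system implementation; the function name `simple_moving_average` is chosen here for the example.

```python
def simple_moving_average(demand, n):
    """Forecast the next period as the mean of the last n observations."""
    if n <= 0 or n > len(demand):
        raise ValueError("n must be between 1 and len(demand)")
    window = demand[-n:]  # the n most recent periods
    return sum(window) / n


history = [100, 120, 110]  # demand for the last 3 months, oldest first
sma_forecast = simple_moving_average(history, 3)
print(sma_forecast)  # 110.0
```

A larger n smooths more noise but makes the forecast lag further behind any trend in the data.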
Weighted Moving Average
Weights (summing to 1) give more importance to recent periods. Using the same data with weights 0.5 (most recent), 0.3, 0.2:
Forecast = (0.5 × 110) + (0.3 × 120) + (0.2 × 100) = 55 + 36 + 20 = 111 units. It responds faster to recent demand than the simple average.
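As a rough sketch, the same weighted calculation in Python looks like this. The function name and the convention that the first weight applies to the most recent period are assumptions made for this example.

```python
def weighted_moving_average(demand, weights):
    """Weighted forecast; weights[0] applies to the most recent period."""
    if len(weights) > len(demand):
        raise ValueError("need at least as many demand points as weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    recent_first = demand[::-1][:len(weights)]  # newest period first
    return sum(w * d for w, d in zip(weights, recent_first))


history = [100, 120, 110]  # oldest first
wma_forecast = weighted_moving_average(history, [0.5, 0.3, 0.2])
print(round(wma_forecast, 2))  # 111.0
```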
Exponential Smoothing
Exponential smoothing is the most exam-relevant time-series method. The next forecast blends the last actual and the last forecast using a smoothing constant alpha (0–1):
F(t+1) = alpha × A(t) + (1 − alpha) × F(t)
where A(t) is actual demand in period t and F(t) is the forecast for period t.
Worked example: F(t) = 100, A(t) = 120, alpha = 0.2. F(t+1) = 0.2 × 120 + 0.8 × 100 = 24 + 80 = 104 units.
If alpha were 0.5: F(t+1) = 0.5 × 120 + 0.5 × 100 = 110 units — a larger alpha reacts faster to recent demand but produces a less stable, noisier forecast. Simple exponential smoothing handles level only; trend (Holt) or seasonality (Holt-Winters) extensions are needed when those patterns exist.
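The update rule can be sketched in Python as follows. The function names are illustrative, and seeding the first forecast with a supplied value is an assumption of this sketch (other seeding conventions exist).

```python
def next_forecast(actual, forecast, alpha):
    """One step of simple exponential smoothing: F(t+1) = alpha*A(t) + (1-alpha)*F(t)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * actual + (1 - alpha) * forecast


def smooth_series(actuals, initial_forecast, alpha):
    """Roll the update through a demand history.

    Returns the seed forecast followed by one new forecast per actual,
    so the last element is the forecast for the next, unseen period.
    """
    forecasts = [initial_forecast]
    for a in actuals:
        forecasts.append(next_forecast(a, forecasts[-1], alpha))
    return forecasts


print(round(next_forecast(120, 100, 0.2), 2))  # 104.0
print(round(next_forecast(120, 100, 0.5), 2))  # 110.0
```

Running both alphas side by side shows the trade-off directly: the same 20-unit surprise moves the forecast by 4 units at alpha = 0.2 and by 10 units at alpha = 0.5.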
Causal (Associative) Models
Causal models forecast demand from one or more independent driver variables rather than from time alone. Linear regression fits demand (Y) to a predictor (X):
Y = a + bX
where a is the intercept and b is the slope (change in demand per unit change in the driver). Example drivers: advertising spend, price, housing starts, GDP, or weather. The strength of the relationship is summarized by the coefficient of determination (R-squared) — closer to 1 means the driver explains more of the demand variation.
Use causal models when a measurable, leading variable reliably drives demand and that driver can itself be predicted (otherwise you have just shifted the forecasting problem). Causal models can capture turning points that pure time-series extrapolation misses, which is why they complement time-series methods rather than always replacing them.
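To make Y = a + bX concrete, here is a minimal least-squares sketch in plain Python. The advertising-spend and demand figures are hypothetical values invented for illustration, and the function name `fit_line` is likewise an assumption of this example.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = a + bX; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope: change in demand per unit of driver
    a = mean_y - b * mean_x    # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return a, b, r_squared


# Hypothetical data: advertising spend (thousands) and unit demand
ad_spend = [10, 20, 30, 40, 50]
demand = [120, 150, 170, 210, 230]

a, b, r2 = fit_line(ad_spend, demand)
print(round(a, 2), round(b, 2), round(r2, 3))  # 92.0 2.8 0.99
print(round(a + b * 60, 1))                    # 260.0, forecast if spend is 60
```

The forecast at a spend of 60 is only as good as the estimate of next period's spend, which is the "shifted forecasting problem" noted above.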
Using single exponential smoothing with alpha = 0.3, last period's forecast was 200 units and actual demand was 250 units. What is the forecast for the next period?
Measuring Forecast Error
A forecast is only useful with an error measure. For each period, error e = Actual − Forecast (a positive error means demand was under-forecast).
MAD — Mean Absolute Deviation
MAD = Σ |Actual − Forecast| / n — the average size of the error, ignoring sign. Lower is better; MAD is in the same units as demand and is easy to compute.
MAPE — Mean Absolute Percent Error
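MAPE = (100% / n) × Σ (|Actual − Forecast| / Actual): the average of each period's absolute error expressed as a percentage of that period's actual demand. MAPE is unit-free, so it allows comparison across products of very different volumes, but it is undefined for any period in which actual demand is zero.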
Bias and the Tracking Signal
Bias is a persistent tendency to over- or under-forecast (errors do not cancel out). It is detected with the tracking signal:
Tracking Signal = Running Sum of Forecast Errors (RSFE) / MAD
A tracking signal near 0 means errors are balanced. A signal that drifts well beyond roughly ±4 indicates the forecast is out of control / biased and the model should be reviewed. A consistently large positive signal means chronic under-forecasting; a large negative signal means chronic over-forecasting.
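A planner would typically watch the tracking signal period by period rather than once at the end. The sketch below recomputes it after each period using a cumulative MAD; some texts use a smoothed MAD instead, so treat this as one possible convention. The demand data is hypothetical and chosen to show chronic under-forecasting.

```python
def tracking_signals(actuals, forecasts):
    """Return the tracking signal (RSFE / MAD to date) after each period."""
    rsfe = 0.0       # running sum of forecast errors
    abs_sum = 0.0    # running sum of absolute errors
    signals = []
    for period, (a, f) in enumerate(zip(actuals, forecasts), start=1):
        error = a - f
        rsfe += error
        abs_sum += abs(error)
        mad = abs_sum / period
        signals.append(rsfe / mad if mad else 0.0)
    return signals


# Hypothetical data: demand keeps rising while the forecast stays flat
actuals = [105, 112, 118, 125, 131, 140]
forecasts = [100, 100, 100, 100, 100, 100]

signals = tracking_signals(actuals, forecasts)
flagged = [p for p, ts in enumerate(signals, start=1) if abs(ts) > 4]
print([round(ts, 2) for ts in signals])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(flagged)                           # [5, 6]
```

Because every error here has the same sign, the signal climbs by one each period and crosses the ±4 limit in period 5, which is the pattern of a biased forecast rather than random noise.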
Worked Forecast-Error Example
Four periods of actual demand and forecast:
| Period | Actual | Forecast | Error (A−F) | Abs. Error | Abs. Error / Actual |
|---|---|---|---|---|---|
| 1 | 100 | 90 | +10 | 10 | 10.0% |
| 2 | 120 | 110 | +10 | 10 | 8.3% |
| 3 | 90 | 100 | −10 | 10 | 11.1% |
| 4 | 110 | 105 | +5 | 5 | 4.5% |
| Σ | | | +15 | 35 | 33.9% |
MAD = Σ|Error| / n = 35 / 4 = 8.75 units.
MAPE = (Σ |Error|/Actual) / n = 33.9% / 4 ≈ 8.5%.
RSFE (running sum of errors) = +10 + 10 − 10 + 5 = +15.
Tracking signal = RSFE / MAD = 15 / 8.75 ≈ +1.7 — within ±4, so the forecast is acceptable and not significantly biased, though the positive sign hints at mild under-forecasting to watch.
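The four-period example can be reproduced with a short Python sketch. The function name `forecast_error_summary` and the dictionary keys are illustrative choices, not a standard API.

```python
def forecast_error_summary(actual, forecast):
    """Compute MAD, MAPE (%), RSFE, and tracking signal for paired series."""
    errors = [a - f for a, f in zip(actual, forecast)]  # error = Actual - Forecast
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mape = sum(abs(e) / a for e, a in zip(errors, actual)) / n * 100
    rsfe = sum(errors)
    return {"MAD": mad, "MAPE": mape, "RSFE": rsfe, "TS": rsfe / mad}


actual = [100, 120, 90, 110]
forecast = [90, 110, 100, 105]

summary = forecast_error_summary(actual, forecast)
print(summary["MAD"])              # 8.75
print(round(summary["MAPE"], 1))   # 8.5
print(summary["RSFE"])             # 15
print(round(summary["TS"], 2))     # 1.71
```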
Over six periods, a product's running sum of forecast errors (RSFE) is +52 units and its MAD is 10 units. What does the tracking signal indicate?
Aggregation and the Practical Forecasting Hierarchy
Forecast accuracy improves with aggregation because individual-item random variations partially cancel out (the risk-pooling / portfolio effect). Practical implications the CSCP exam tests:
- A product-family forecast is more accurate (lower relative error) than any single SKU within it.
- A regional or total forecast is more accurate than a single location's forecast.
- A monthly or quarterly forecast is more accurate than a daily forecast.
This is exactly why S&OP forecasts at the aggregate volume level and disaggregates into SKUs/locations only for execution. A well-designed planning approach forecasts at the level where accuracy is highest, then disaggregates to the detail needed, and always carries the error measure forward to size safety stock and buffers.
| Forecasting Lever | Effect on Accuracy |
|---|---|
| Higher product level (family vs. SKU) | More accurate |
| Wider geography (national vs. store) | More accurate |
| Longer time bucket (month vs. day) | More accurate |
| Shorter horizon (near vs. far future) | More accurate |
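The risk-pooling effect behind the first three rows can be sketched numerically. The example below assumes SKU demands are statistically independent, so their variances add; the means and standard deviations are hypothetical, and the coefficient of variation (standard deviation / mean) stands in for relative forecast error.

```python
import math


def coefficient_of_variation(mean, std):
    """Relative variability: standard deviation as a fraction of the mean."""
    return std / mean


def family_cv(sku_means, sku_stds):
    """Relative variability of the family total, assuming independent SKUs."""
    total_mean = sum(sku_means)
    total_std = math.sqrt(sum(s ** 2 for s in sku_stds))  # variances add
    return total_std / total_mean


# Hypothetical family of four SKUs, each with mean 100 and std 20
sku_means = [100, 100, 100, 100]
sku_stds = [20, 20, 20, 20]

sku_level = coefficient_of_variation(sku_means[0], sku_stds[0])
family_level = family_cv(sku_means, sku_stds)
print(sku_level)     # 0.2
print(family_level)  # 0.1
```

With four identical, independent SKUs the relative variability halves (it falls by the square root of the number of items). If SKU demands are positively correlated, the benefit is smaller than this sketch suggests.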
A planner observes that the forecast for a product family is consistently more accurate than the forecasts for the individual SKUs within that family. What principle BEST explains this?