5.2 Test Estimation and Prioritization
Key Takeaways
- Test estimation predicts the work needed to meet test objectives and should state its assumptions, exclusions, and uncertainty.
- Large testing tasks are usually estimated better after decomposition into smaller tasks.
- Metrics-based techniques (ratios, extrapolation, burndown) use historical or current data; expert-based techniques (Wideband Delphi, Planning Poker, three-point estimation) use judgment.
- Three-point estimation uses E = (a + 4m + b) / 6 and standard deviation SD = (b - a) / 6.
- Test case prioritization may be risk-based, coverage-based, additional-coverage-based, or requirements-based, but dependencies and resources can override the order.
Estimating Test Effort
Test effort estimation predicts how much work is needed to meet the test objectives. The estimate may be expressed in hours, person-days, story points, team capacity, calendar duration, or cost. A good estimate is not just a number: it also states assumptions, constraints, confidence, and what is excluded. Estimates are predictions, so they always carry uncertainty.
Estimation is easier for small tasks than large ones. "Test the checkout service" is too broad to estimate well. It is better to break it into tasks such as: review acceptance criteria, design boundary tests for discounts, prepare payment test data, execute happy-path and negative tests, run regression, and retest fixed defects. Decomposition makes hidden work visible and shrinks estimation error.
The CTFL v4.0 syllabus groups techniques into two families: metrics-based (using historical or current data) and expert-based (using the judgment of the task owners or experts).
Estimation Techniques
| Technique | Family | Main idea |
|---|---|---|
| Estimation based on ratios | Metrics-based | Apply historical ratios, e.g. development effort to test effort |
| Extrapolation | Metrics-based | Use early current-project measurements to forecast remaining effort |
| Burndown / velocity | Metrics-based | Use remaining work and team velocity to forecast |
| Wideband Delphi | Expert-based | Experts estimate independently, discuss outliers, repeat to consensus |
| Planning Poker | Expert-based | Delphi variant using story-point cards in Agile teams |
| Three-point estimation | Expert-based | Combine optimistic, most likely, and pessimistic values |
Ratio-based estimation works best when the organization has reliable history from similar projects. If past teams spent 2 test days per 3 development days, a project planned at 600 development days suggests about 600 × 2/3 = 400 test days. The trap is assuming the ratio still holds when the product, team, risk, tools, or quality expectations have changed.
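The ratio arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_based_estimate` and its parameters are hypothetical, and it assumes the historical ratio still applies to the new project.

```python
def ratio_based_estimate(planned_dev_days, hist_test_days, hist_dev_days):
    """Scale planned development effort by a historical test-to-dev ratio.

    All names here are illustrative; the ratio is assumed to come from
    comparable past projects.
    """
    if hist_dev_days <= 0:
        raise ValueError("historical development effort must be positive")
    return planned_dev_days * hist_test_days / hist_dev_days


# Figures from the text: 2 test days per 3 development days, 600 dev days planned.
test_days = ratio_based_estimate(600, hist_test_days=2, hist_dev_days=3)
print(test_days)  # 400.0
```

Keeping the historical numerator and denominator as separate inputs makes the assumption behind the ratio visible, which is the part of the estimate most likely to be challenged.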
Extrapolation suits iterative development. If the last three iterations needed 45, 50, and 55 hours of test work for a comparable volume, the team may forecast about 50 hours (the average) for the next iteration. The forecast weakens when the next iteration has unusual risks, new technology, or new dependencies.
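A minimal sketch of that forecast, assuming the simplest possible extrapolation (the plain average of recent iterations). The function name `extrapolate_next_iteration` is hypothetical, and a real team might prefer a weighted average or a trend line, especially when the figures are rising as they are here.

```python
from statistics import mean


def extrapolate_next_iteration(recent_hours):
    """Forecast the next iteration as the average of recent iterations.

    Assumption: upcoming work is comparable in volume and risk to the
    iterations being averaged.
    """
    if not recent_hours:
        raise ValueError("need at least one past iteration")
    return mean(recent_hours)


# Figures from the text: the last three iterations took 45, 50, and 55 hours.
forecast = extrapolate_next_iteration([45, 50, 55])
print(f"Forecast for next iteration: about {forecast} hours")
```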
Worked Three-Point Example
Three-point estimation uses an optimistic value (a), a most likely value (m), and a pessimistic value (b). The common weighted formula is:
E = (a + 4m + b) / 6 and SD = (b - a) / 6.
Worked example: a task is estimated a = 6, m = 9, b = 18 hours.
- E = (6 + 4*9 + 18) / 6 = (6 + 36 + 18) / 6 = 60 / 6 = 10 hours.
- SD = (18 - 6) / 6 = 12 / 6 = 2 hours.
So the estimate is 10 hours with a standard deviation of 2 hours, often written as 10 ± 2 hours. The estimate sits above the most likely value of 9 because the pessimistic value lies much further from m than the optimistic one. The 4m weighting means the most likely value dominates, while the gap between a and b sets the width of the uncertainty band. Reporting both E and SD gives stakeholders a number and a sense of the risk around it.
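The worked example can be checked with a short Python sketch. The function name `three_point_estimate` is illustrative, and the check that a ≤ m ≤ b is an added sanity assumption rather than part of the formula.

```python
def three_point_estimate(a, m, b):
    """Return (E, SD) using E = (a + 4m + b) / 6 and SD = (b - a) / 6."""
    if not a <= m <= b:
        raise ValueError("expected optimistic <= most likely <= pessimistic")
    expected = (a + 4 * m + b) / 6
    std_dev = (b - a) / 6
    return expected, std_dev


# Values from the worked example: a = 6, m = 9, b = 18 hours.
expected, std_dev = three_point_estimate(6, 9, 18)
print(f"{expected} +/- {std_dev} hours")  # 10.0 +/- 2.0 hours
```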
Wideband Delphi is structured expert estimation: experts estimate independently first so one loud voice does not anchor the group, then discuss large differences, clarify assumptions, and re-estimate. Planning Poker is the common Agile variant. The value is not only the final number but the shared understanding produced by debating uncertainty.
Prioritizing Test Cases
Once test cases and procedures are ready, they are ordered for execution. Prioritization helps find important failures earlier and gives stakeholders useful information sooner. This matters when time is limited, builds are unstable, environments are shared, or some defects block later testing.
| Prioritization strategy | Orders tests by |
|---|---|
| Risk-based | Tests covering the most important product risks first |
| Coverage-based | Tests achieving the highest coverage first |
| Additional-coverage | First the highest-coverage test, then each next test adding the most new coverage |
| Requirements-based | Stakeholder-assigned requirement priority |
Risk-based prioritization runs tests for the highest-impact, most-likely-to-fail risks first; a payment authorization path runs before a cosmetic preference. Coverage-based prioritization runs the highest-coverage tests first (statement, branch, requirements, or interface coverage). Additional-coverage prioritization picks the highest-coverage test, then repeatedly the test adding the most coverage not yet achieved. Requirements-based prioritization follows business priority, e.g. legal consent capture before profile image cropping.
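The difference between coverage-based and additional-coverage ordering is easiest to see side by side. The sketch below is illustrative only: the test names, the coverage items (`s1` to `s7`), and both function names are made up, and ties are broken alphabetically as an arbitrary assumption.

```python
def coverage_order(coverage):
    """Order tests by total coverage, highest first (ties by name)."""
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))


def additional_coverage_order(coverage):
    """Greedy order: repeatedly pick the test adding the most new coverage."""
    remaining = dict(coverage)
    covered = set()
    order = []
    while remaining:
        # sorted() makes tie-breaking deterministic: max() keeps the first maximum.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical data: each test maps to the set of items it covers.
tests = {
    "T1": {"s1", "s2", "s3", "s4"},
    "T2": {"s1", "s2", "s3"},
    "T3": {"s5", "s6"},
    "T4": {"s4", "s7"},
}
print(coverage_order(tests))             # ['T1', 'T2', 'T3', 'T4']
print(additional_coverage_order(tests))  # ['T1', 'T3', 'T4', 'T2']
```

T2 has the second-highest total coverage, but everything it covers is already covered by T1, so the additional-coverage order pushes it to the end.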
Dependencies and Resource Limits
Priority is not the only input to execution order. A high-priority refund test may need a completed purchase test first. A performance suite may require a load environment available only overnight. A security specialist may be available for just one day. In those cases the realistic schedule must balance priority, dependency, and resources.
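One way to picture how dependencies reshape a priority order is a priority-aware topological sort. The sketch below is an assumption-laden illustration, not a prescribed CTFL method: the function name `schedule`, the test names, and the convention that a lower number means higher priority are all hypothetical, and resource limits such as environment or specialist availability are not modelled.

```python
import heapq


def schedule(priorities, depends_on):
    """Order tests by priority while respecting prerequisites.

    priorities: test name -> number (lower = more important, by assumption).
    depends_on: test name -> set of tests that must run first.
    """
    pending = {t: set(depends_on.get(t, ())) for t in priorities}
    dependents = {t: [] for t in priorities}
    for test, prereqs in pending.items():
        for prereq in prereqs:
            if prereq not in priorities:
                raise ValueError(f"unknown prerequisite: {prereq}")
            dependents[prereq].append(test)

    # Tests with no outstanding prerequisites are ready to run.
    ready = [(priorities[t], t) for t, prereqs in pending.items() if not prereqs]
    heapq.heapify(ready)

    order = []
    while ready:
        _, test = heapq.heappop(ready)  # most important ready test
        order.append(test)
        for dependent in dependents[test]:
            pending[dependent].discard(test)
            if not pending[dependent]:
                heapq.heappush(ready, (priorities[dependent], dependent))

    if len(order) != len(priorities):
        raise ValueError("cyclic dependency between tests")
    return order


# Hypothetical example: the refund test is top priority but needs a purchase first.
priorities = {"refund": 1, "purchase": 2, "profile_crop": 3}
depends_on = {"refund": {"purchase"}}
print(schedule(priorities, depends_on))  # ['purchase', 'refund', 'profile_crop']
```

The refund test still runs as early as it can, just not first, which mirrors the point that strict priority order is sometimes impossible.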
The exam often presents attractive but incomplete choices. "Always execute highest priority first" sounds right, but dependencies can make it impossible. "Always automate first" is too broad. The best CTFL answer weighs risk, coverage, requirements, dependencies, resources, and the objective of obtaining useful feedback early.
Exam Traps
- An estimate without stated assumptions and uncertainty is incomplete; estimates are predictions, not commitments to an exact number.
- Ratio-based estimation and extrapolation are metrics-based; Wideband Delphi, Planning Poker, and three-point estimation are expert-based. Do not mix the families.
- Remember E = (a + 4m + b) / 6: the most likely value is weighted by four, not equally with a and b.
- Decomposition improves estimates; very large tasks are estimated poorly.
- Prioritization order can be overridden by dependencies and resource availability, so the "best" answer is rarely "strictly highest priority first."
Experts estimate a testing task as 4 hours optimistic, 7 hours most likely, and 16 hours pessimistic. Using E = (a + 4m + b) / 6, what is the estimate?
An organization forecasts test effort by applying the historical ratio of test days to development days from past similar projects. Which estimation technique is this?
Which factors can legitimately affect test execution order? (Select all that apply.)