Facilitators deliver the same curriculum very differently across three sites. What should be reviewed first?

Implementation fidelity and facilitator training. Variation in delivery is a fidelity problem. Reviewing fidelity procedures and facilitator training protects the ability to attribute any later change to the program, which inconsistent delivery would otherwise undermine.

Which question is a process (implementation) evaluation question rather than an outcome question?

Was each scheduled session delivered to the intended audience? Process evaluation examines whether the program was delivered as planned, to whom, and how well. The other options measure changes in health indicators or behavior, which are impact or outcome questions.

A team discovers no baseline data were collected before the program began. What is the best next step?

Choose an evaluation design that answers a realistic question and state the limitation. Without baseline data a before-and-after impact claim is not defensible. The ethical, professional move is to select a feasible design, such as a post-only knowledge assessment, and report the limitation honestly rather than overstating effects.

Implementation-to-Evaluation Chain — Free Study Guide 2026

10.3 Implementation-to-Evaluation Chain

When a program is running, CHES scenarios shift to Area III, Implementation, and Area IV, Evaluation and Research. Two questions recur: was the program delivered as intended, and can the evaluation actually answer the question being asked?

Implementation monitoring vocabulary

Process monitoring uses precise terms that the exam tests directly.

Term	Definition
Fidelity	Degree to which the program is delivered as designed
Reach	Proportion of the intended population that participated
Dose delivered	Amount of the program that staff actually provided
Dose received	Extent participants engaged with the program
Adaptation	Planned, documented changes that preserve core components

When staff deliver a curriculum differently at each site, the first review target is fidelity and facilitator training, not new outcome claims. Variation undermines the ability to attribute any later change to the program.
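
The monitoring terms above reduce to simple proportions. The sketch below is one possible way to tally them, assuming a hypothetical per-site monitoring log. The SiteLog fields, the sample numbers, and the choice to express dose received as average attendance divided by sessions delivered are illustrative assumptions, not an official CHES formula.

```python
from dataclasses import dataclass


@dataclass
class SiteLog:
    """Hypothetical monitoring record for one site (field names are illustrative)."""
    intended_population: int        # people the program aimed to serve
    participants: int               # people who actually took part
    sessions_planned: int           # sessions in the curriculum as designed
    sessions_delivered: int         # sessions staff actually held
    core_components_planned: int    # core components in the design
    core_components_delivered: int  # core components actually delivered
    mean_sessions_attended: float   # average sessions attended per participant


def process_measures(log: SiteLog) -> dict:
    """Return each process measure as a proportion between 0 and 1.

    Assumes every denominator is greater than zero.
    """
    return {
        "reach": log.participants / log.intended_population,
        "dose_delivered": log.sessions_delivered / log.sessions_planned,
        "dose_received": log.mean_sessions_attended / log.sessions_delivered,
        "fidelity": log.core_components_delivered / log.core_components_planned,
    }


# Made-up numbers for a single site.
site_a = SiteLog(
    intended_population=200,
    participants=50,
    sessions_planned=8,
    sessions_delivered=8,
    core_components_planned=5,
    core_components_delivered=5,
    mean_sessions_attended=6.0,
)
measures = process_measures(site_a)
print(measures)
```

In this made-up example every session and component was delivered, so fidelity and dose delivered are both 1.0, yet reach is only 0.25. That pattern points to a recruitment or access problem rather than a delivery problem.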

Process versus impact versus outcome evaluation

The exam expects three layers, not two:

Process (formative/implementation) evaluation answers "Was the program delivered as planned, to whom, and how well?" It uses the fidelity, reach, and dose measures above.
Impact evaluation answers "Did knowledge, attitudes, skills, or behavior change in the short term?"
Outcome evaluation answers "Did the long-term health indicator—morbidity, mortality, quality of life—change?"

A scenario asking "Did blood pressure improve?" is an outcome question; "Did participants' nutrition knowledge rise immediately after the class?" is impact; "Was every session delivered?" is process. Participant satisfaction is a process measure and does not prove any behavioral result, even when ratings are glowing. Mislabeling these layers is one of the most common ways candidates lose evaluation items.

Evaluation designs and their evidence strength

Design	Structure	Causal strength
Post-only	Measure after the program	Weakest
Pre-post (one group)	Measure before and after	Moderate; no control
Quasi-experimental	Comparison group, no randomization	Stronger
Experimental (RCT)	Randomized control group	Strongest

The stronger the design, the more confidently change can be attributed to the program rather than to history, maturation, or selection. Real community programs often cannot randomize, so the exam rewards selecting the strongest feasible design and stating its limits.
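
To see why a comparison group strengthens attribution, it can help to work the arithmetic. The sketch below uses invented systolic blood-pressure readings and a difference-in-differences calculation, which is one common way to use a comparison group and is named here as an illustration rather than as a method the exam requires. All numbers and variable names are hypothetical.

```python
from statistics import mean


def mean_change(pre, post):
    """Change in the group average from before to after the program."""
    return mean(post) - mean(pre)


# Invented systolic blood-pressure readings (mmHg).
program_pre = [140, 150, 145, 155]
program_post = [130, 140, 138, 144]
comparison_pre = [142, 148, 150, 152]
comparison_post = [139, 146, 146, 149]

# One-group pre-post: all of the change is credited to the program.
one_group_change = mean_change(program_pre, program_post)

# Comparison group: change that happened without the program
# (history, maturation, and similar influences).
comparison_change = mean_change(comparison_pre, comparison_post)

# Difference-in-differences: program change minus background change.
difference_in_differences = one_group_change - comparison_change

print(one_group_change, comparison_change, difference_in_differences)
```

With these invented readings the program group improves by 9.5 mmHg, but the comparison group also improves by 3 mmHg, so only 6.5 mmHg is plausibly tied to the program. A one-group pre-post design would have reported the full 9.5. Without randomization, selection differences between the groups can still bias even the adjusted figure.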

Data collection methods and quality

Area IV scenarios also test how data are collected and judged. Quantitative methods (surveys, pre/post tests, biometric screenings) answer "how much" and "how many"; qualitative methods (interviews, focus groups, observation) answer "why" and "how." Strong evaluation mixes both. Two quality concepts recur: validity (the instrument measures what it claims to measure) and reliability (it measures consistently across raters and time).

A scenario describing a survey that participants interpret inconsistently is flagging a reliability problem; one measuring satisfaction but claiming it captures behavior change is flagging a validity problem. The exam expects you to name the right method for the question—qualitative interviews to explain low attendance, a comparison-group design to test effectiveness—rather than defaulting to whatever data are easiest to gather.
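
Reliability across raters can be checked with a short calculation. The sketch below assumes two hypothetical observers who each coded the same eight sessions as "yes" or "no" for whether a core activity was delivered. It computes raw percent agreement and Cohen's kappa, a standard statistic that adjusts agreement for chance. The ratings are invented.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)


# Invented codes from two observers rating the same eight sessions.
observer_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
observer_2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(agreement, kappa)
```

Here the observers agree on 6 of 8 sessions (0.75), but because half of that agreement could arise by chance, kappa is 0.5. A low value would suggest the observation form or rater training needs work before the data are used.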

Why indicators and baselines matter

An outcome claim is only defensible if indicators and baseline data were defined during planning. If a team realizes no baseline was collected, the honest next step is to choose an evaluation design that can still answer a realistic question (for example, a post-only design comparing knowledge against a validated standard) and to state the limitation, not to assert impact anyway. This is where Area IV and Area VIII overlap, because overclaiming impact is both a methods error and an ethics violation.

Worked scenario

Attendance at a nutrition series is far below target. The weak answer blames participants. The strong answer treats low reach as an implementation and access issue: examine recruitment channels, scheduling, transportation, childcare, language, and cultural fit. Fixing access often recovers dose received; blaming participants ends inquiry and ignores equity. In a second item, two sites show different blood-pressure results, but a fidelity check reveals one site dropped the home-monitoring component. The correct interpretation ties the outcome difference to an implementation gap, not to a flawed curriculum.
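
The fidelity check in the second item amounts to comparing what each site delivered against the list of core components. A minimal sketch follows, assuming hypothetical site names and component labels. Only the dropped home-monitoring component comes from the scenario above.

```python
# Core components of the curriculum as designed (labels are illustrative).
CORE_COMPONENTS = {"nutrition class", "home monitoring", "follow-up call"}

# What each site reports having delivered (hypothetical records).
delivered_by_site = {
    "Site A": {"nutrition class", "home monitoring", "follow-up call"},
    "Site B": {"nutrition class", "follow-up call"},
}


def missing_components(delivered, core):
    """Return, for each site, the core components it did not deliver."""
    return {site: sorted(core - done) for site, done in delivered.items()}


gaps = missing_components(delivered_by_site, CORE_COMPONENTS)
print(gaps)
```

A site with an empty list delivered every core component. A site with a gap, like Site B here, should have its outcome results read in light of that gap before anyone concludes the curriculum itself failed.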

Common traps

Satisfaction-as-outcome trap: using "participants liked it" to claim behavior change.
Claiming impact with no baseline or undefined indicators.
Ignoring fidelity when delivery varies across staff or sites.
Mislabeling a process question as outcome (or the reverse) on the exam.
Attributing weak results to participants when access barriers reduced reach.

Use the cycle note here too: if the missing evidence is baseline data or fidelity records, the best next step lies in evaluation design and monitoring, not in new program activities.

Finally, remember that evaluation feeds back into the cycle. Findings should inform the next round of planning, justify continued funding, and be shared with stakeholders in plain language. A program that evaluates but never uses the results has wasted the effort. On the exam, the strongest evaluation answer measures the right thing with a feasible design, reports the results honestly, and recommends concrete improvements grounded in what the data showed.
