A test produces very similar scores across two administrations but has little evidence for predicting the referral outcome. Which statement is best?

The test may be reliable but not valid for that use. Reliability is consistency, but validity requires evidence that the score supports the intended interpretation and decision.

A screening tool has strong sensitivity but the condition is uncommon in the referral population. What should the psychologist remember?

Base rates can reduce the likelihood that a positive screen is a true case. Predictive values depend on base rates, so positive screens for uncommon conditions require careful follow-up before diagnosis.

Which factor most directly weakens the interpretation of a norm-referenced score?

The norm group does not match the examinee's language and background. Norm-referenced scores depend on appropriate comparison groups, so mismatched norms can make conclusions unfair or inaccurate.


Psychometrics as Clinical Risk Management

Psychometrics is the science that keeps assessment from becoming impressionistic. In practice, every score is a sample of behavior obtained under specific conditions. A psychologist must ask whether the score is consistent enough to matter, whether the interpretation is supported, and whether the score should influence the decision at hand.

Reliability refers to consistency. Test-retest reliability concerns stability over time. Interrater reliability concerns agreement among raters. Internal consistency concerns whether items on a scale measure a related construct. Alternate-form reliability concerns consistency across equivalent versions. Low reliability increases error and weakens confidence in individual decisions.
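
Internal consistency is often summarized with a coefficient such as Cronbach's alpha, which the text above does not name; it is used here only as one common illustration. The sketch below assumes a small, made-up set of item responses (rows are respondents, columns are items) and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate Cronbach's alpha.

    responses: list of respondents, each a list of item scores
    (hypothetical data layout chosen for this sketch).
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Variance of each item across respondents.
    item_columns = list(zip(*responses))
    item_variances = [variance(col) for col in item_columns]

    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have no variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses: five respondents, three items on a 1-5 scale.
example_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(example_responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

A high coefficient on data like these says only that the items vary together. It does not show that the scale is valid for any particular decision.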

Validity refers to evidence for interpretation and use. Content evidence asks whether the test samples the relevant domain. Criterion-related evidence asks whether scores relate to an outcome or external standard. Construct evidence asks whether the test behaves as theory predicts. Consequential considerations ask whether use of the test creates predictable harms or benefits in a setting.

A test can be highly reliable and still invalid for a particular purpose. A depression scale may consistently measure current distress, but it may not validly determine parenting capacity, malingering, neurocognitive impairment, or workplace safety by itself. The EPPP answer is often the one that limits conclusions to the available evidence.

Concept	Meaning	Practice implication
Reliability	Consistency of measurement	Lower reliability means wider uncertainty
Validity	Support for interpretation and use	Evidence must match the referral question
Norms	Comparison group for scores	Norm group must fit age, language, culture, and context
Standard error	Expected score imprecision	Interpret ranges and confidence, not only point scores
Base rate	How common a condition is in the population	When a condition is rare, many positive screens are false positives, even on an accurate screen

Norms are not decoration. A score is meaningful only against an appropriate reference group or criterion. Age, education, language proficiency, disability, acculturation, medical status, and setting can change the meaning of performance. When norms are mismatched, the psychologist should qualify the interpretation, seek better instruments, consult, or use converging evidence.

The standard error of measurement reminds candidates that observed scores are imperfect estimates. A single score should not be treated as exact, especially near a cut point or when the decision has high stakes. Confidence intervals, behavioral observations, history, and collateral data help keep the conclusion proportional to the evidence.
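
A minimal sketch of that idea, using the classical test theory formula SEM = SD × sqrt(1 − reliability). The score, standard deviation, reliability coefficient, and cut point below are hypothetical values chosen for illustration. The band is built around the observed score, which is a common simplification.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.96):
    """Approximate confidence band around an observed score.

    z=1.96 gives a roughly 95% band under a normal error assumption.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical case: standard score of 72 on a scale with SD 15,
# reliability .90, and a decision cut point of 70.
low, high = score_band(observed=72, sd=15, reliability=0.90)
cut_point = 70
print(f"Band: {low:.1f} to {high:.1f}")
print("Cut point falls inside the band:", low < cut_point < high)
```

When the cut point sits inside the band, the score alone cannot settle which side of the cut the examinee's true score falls on, which is why converging evidence matters.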

Sensitivity and specificity are common in screening and diagnostic assessment. Sensitivity concerns the ability to identify people who have the condition. Specificity concerns the ability to identify people who do not have it. Positive predictive value and negative predictive value depend on base rates, so the same test can perform differently in specialty clinics, community samples, and forensic settings.
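
The dependence on base rates can be checked with simple arithmetic. The sketch below applies the standard definitions of positive and negative predictive value. The sensitivity, specificity, and base-rate figures are hypothetical and do not describe any real instrument.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a test used in a population with the given base rate."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")

    # Expected proportions of the whole population in each outcome cell.
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)

    # Undefined when the test never returns a positive (or never a negative).
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical screen (sensitivity .90, specificity .90) in two settings.
for setting, base_rate in (("community sample", 0.02),
                           ("specialty clinic", 0.40)):
    ppv, npv = predictive_values(0.90, 0.90, base_rate)
    print(f"{setting}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these assumed figures, a positive screen in the low base-rate setting is a true case only about 16% of the time, compared with about 86% in the high base-rate setting, although the test itself has not changed.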

Validity indicators and response style measures also require careful interpretation. Overreporting, underreporting, inconsistency, defensiveness, random responding, fatigue, low literacy, misunderstanding, and cultural mismatch can all affect scores. The answer is not automatically to discard all data. The answer is to interpret cautiously, document limits, and seek additional evidence when the referral question remains important.

For exam scenarios, follow this checklist:

Identify the construct and the decision being made.
Ask what reliability evidence matters for that decision.
Ask what validity evidence supports the proposed interpretation.
Check whether the norm group and language fit the examinee.
Consider error, base rates, and response style.
Integrate scores with interview, observation, records, and collateral data.

Psychometric competence protects clients and institutions from overconfidence. It also improves communication. A clear report explains what a score supports, what it does not support, how much uncertainty remains, and what additional data would change the conclusion.

EPPP Study Guide

5.2 Psychometric Foundations: Reliability, Validity, Norms, and Error
