4.3 Registries and Secondary Data Sources

Key Takeaways

  • Primary data come directly from the health record; secondary data are abstracted from it into registries, indices, and databases for a defined purpose.
  • A registry is organized by disease, condition, or event (case-based); an index is a list pointing back to records, organized by a data element such as disease or physician.
  • The cancer registry workflow is case finding, abstracting, staging, follow-up, and reporting to the state registry and the National Cancer Database (NCDB).
  • The Master Patient Index (MPI) is the permanent index linking every patient to their unique medical record number across the enterprise.
  • Disease and operation indices are organized by ICD-10-CM/PCS code; the cancer registry adds clinical detail, staging, and lifetime follow-up that an index does not.

Last updated: June 2026

Primary vs Secondary Data

Primary data are captured at the point of care and reside in the patient's health record (progress notes, orders, results). Secondary data are selected and abstracted from primary records into a separate, purpose-built data source — a registry, index, or database — to support a defined use such as research, quality measurement, or public health reporting.

Secondary sources are either patient-identifiable (a cancer registry that tracks named patients over time) or aggregate/de-identified (statistical summaries). The data abstracted is governed by data sets — UHDDS for inpatients, UACDS for ambulatory care, OASIS for home health, and MDS for long-term care — which standardize the elements collected so they are comparable across facilities.

Users of secondary data span the spectrum: clinicians studying outcomes, administrators measuring quality, accrediting and licensing bodies, public-health agencies, researchers, and payers. Because secondary data are abstracted by a person or a tool, they inherit any error in the primary record and can add abstraction error of their own — which is why registries audit a sample of cases for accuracy. The exam wants you to recognize that the health record is primary and everything compiled from it for a separate purpose is secondary, regardless of how sophisticated the destination system is.
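The accuracy audit mentioned above can be sketched in a few lines. This is a minimal illustration, not a prescribed registry procedure: the `AbstractedCase` structure, the function names, the sample size, and the element-level scoring are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class AbstractedCase:
    case_id: str
    abstracted: dict  # values entered into the registry by the abstractor
    source: dict      # same elements re-verified against the primary record


def audit_sample(cases, sample_size, seed=0):
    """Draw a reproducible random sample of cases to re-abstract."""
    rng = random.Random(seed)
    return rng.sample(cases, min(sample_size, len(cases)))


def accuracy_rate(sample):
    """Fraction of audited data elements that match the primary record."""
    checked = matched = 0
    for case in sample:
        for element, true_value in case.source.items():
            checked += 1
            if case.abstracted.get(element) == true_value:
                matched += 1
    return matched / checked if checked else None


# Hypothetical cases: the second one carries an abstraction error.
cases = [
    AbstractedCase("C1", {"sex": "F", "site": "C50.911"},
                   {"sex": "F", "site": "C50.911"}),
    AbstractedCase("C2", {"sex": "M", "site": "C61"},
                   {"sex": "M", "site": "C18.7"}),
]

sample = audit_sample(cases, sample_size=2)
print(accuracy_rate(sample))  # 3 of 4 elements match -> 0.75
```

Comparing each abstracted element against the re-verified value separates abstraction error from error already present in the primary record, which this sketch does not attempt to detect.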

Registry vs Index — the Core Distinction

The RHIT exam tests this pairing repeatedly:

  • A registry is organized by case — by a disease, condition, or event. It collects rich, standardized data on each case and typically follows patients over time. Reporting is often legally mandated.
  • An index is a list or pointer organized by a single data element that lets you locate records. It does not follow patients longitudinally.

Index                      | Organized by                    | Use
Master Patient Index (MPI) | Patient identity                | Permanent link of patient to unique medical record number
Disease index              | ICD-10-CM diagnosis code        | Find all cases of a diagnosis
Operation index            | ICD-10-PCS / CPT procedure code | Find all cases of a procedure
Physician index            | Attending/operating provider    | Find all cases by provider

The MPI is the most critical permanent index. Duplicate or overlaid MPI entries are a major data-integrity problem: a duplicate is two records for the same patient, which fragments that patient's record, while an overlay is one record erroneously holding two different patients' data. Overlays are the more dangerous of the two because they can put one patient's results into another's chart. In a multi-facility enterprise, an Enterprise MPI (EMPI) cross-links the local MPIs so a patient is recognized across sites.
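To make the duplicate problem concrete, here is a deliberately simple sketch that flags possible duplicates: MPI entries sharing a normalized name and date of birth but carrying different medical record numbers. The field names (`mrn`, `last`, `first`, `dob`) and the exact-match rule are assumptions for illustration; production MPI tools use more elaborate matching, and an overlay cannot be found this way because it hides inside a single entry.

```python
from collections import defaultdict


def possible_duplicates(mpi_entries):
    """Group MRNs that share the same normalized name and date of birth."""
    groups = defaultdict(set)
    for entry in mpi_entries:
        key = (
            entry["last"].strip().lower(),
            entry["first"].strip().lower(),
            entry["dob"],
        )
        groups[key].add(entry["mrn"])
    # Only identities attached to more than one MRN are suspicious.
    return {key: sorted(mrns) for key, mrns in groups.items() if len(mrns) > 1}


# Hypothetical MPI entries: the first two look like the same person.
entries = [
    {"mrn": "100234", "last": "Smith", "first": "Ana", "dob": "1980-02-01"},
    {"mrn": "100987", "last": "SMITH ", "first": "ana", "dob": "1980-02-01"},
    {"mrn": "100555", "last": "Jones", "first": "Lee", "dob": "1975-11-30"},
]

print(possible_duplicates(entries))
# {('smith', 'ana', '1980-02-01'): ['100234', '100987']}
```

A flagged pair is only a candidate; a person still has to review both records before they are merged.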

Unlike a registry, none of these indices follows a patient longitudinally or carries clinical depth; each is simply a sorted pointer back into the records. That single difference — pointer vs longitudinal case file — is what most registry-versus-index questions turn on.
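The pointer-versus-case-file contrast can also be pictured as two data structures. The sketch below is illustrative only: `build_disease_index`, `RegistryCase`, and the sample values are hypothetical names and data, not part of any real registry software.

```python
from collections import defaultdict
from dataclasses import dataclass, field


def build_disease_index(encounters):
    """Index: diagnosis code -> list of MRNs pointing back to the records."""
    index = defaultdict(list)
    for enc in encounters:
        for code in enc["dx_codes"]:
            index[code].append(enc["mrn"])
    return dict(index)


@dataclass
class RegistryCase:
    """Registry: one case file with clinical detail and follow-up history."""
    mrn: str
    condition: str
    stage: str
    follow_ups: list = field(default_factory=list)

    def add_follow_up(self, contact_date, status):
        self.follow_ups.append((contact_date, status))


encounters = [
    {"mrn": "100234", "dx_codes": ["C50.911", "I10"]},
    {"mrn": "100555", "dx_codes": ["I10"]},
]
disease_index = build_disease_index(encounters)
print(disease_index["I10"])  # ['100234', '100555']

case = RegistryCase(mrn="100234", condition="breast cancer", stage="IIA")
case.add_follow_up("2025-06-01", "no evidence of disease")
case.add_follow_up("2026-06-01", "no evidence of disease")
print(len(case.follow_ups))  # 2
```

The index holds nothing but pointers sorted by one data element, while the registry case accumulates staging detail and a growing follow-up history for the same patient.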

The Cancer (Tumor) Registry Workflow

The cancer registry is the registry the RHIT exam covers in the most depth. Its standardized workflow is:

  1. Case finding — identifying reportable cancers from pathology reports, the disease index, and other sources.
  2. Abstracting — recording demographic, diagnostic, staging, and treatment data on each case.
  3. Staging — classifying tumor extent (e.g., AJCC TNM staging) to describe how far disease has spread.
  4. Follow-up — contacting patients/providers at least annually to track recurrence and survival.
  5. Reporting — submitting cases to the state central cancer registry and, for accredited programs, to the National Cancer Database (NCDB), jointly sponsored by the American College of Surgeons Commission on Cancer (CoC) and the American Cancer Society.

The reference date is the point from which a registry includes all eligible cases going forward.
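The follow-up step in the workflow above lends itself to a small worked example. The sketch assumes a 365-day window to stand in for "at least annually", and the record layout (`case_id`, `vital_status`, `last_contact`) is hypothetical; a real registry applies whatever interval its standards specify.

```python
from datetime import date, timedelta


def overdue_for_follow_up(cases, as_of, max_gap_days=365):
    """Return IDs of living patients not contacted within the allowed gap."""
    cutoff = as_of - timedelta(days=max_gap_days)
    return [
        c["case_id"]
        for c in cases
        if c["vital_status"] == "alive" and c["last_contact"] < cutoff
    ]


# Hypothetical registry cases.
registry_cases = [
    {"case_id": "A", "vital_status": "alive", "last_contact": date(2025, 1, 15)},
    {"case_id": "B", "vital_status": "alive", "last_contact": date(2026, 3, 1)},
    {"case_id": "C", "vital_status": "deceased", "last_contact": date(2024, 8, 9)},
]

print(overdue_for_follow_up(registry_cases, as_of=date(2026, 6, 1)))  # ['A']
```

Deceased patients are excluded because their follow-up is complete; only living patients whose last contact falls outside the window need to be contacted.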

Other Registries and External Databases

Beyond cancer, several disease- and event-based registries are tested:

  • Trauma registry — tracks severely injured patients to support trauma-system planning and quality; uses injury severity scoring.
  • Immunization registry (IIS) — a population-wide record of vaccinations, often reportable to state systems.
  • Implant registry — tracks medical devices (e.g., joints, pacemakers) so patients can be located for recalls.
  • Birth defects and diabetes registries support surveillance.

External claims databases aggregate billing data (e.g., Medicare claims) and are a rich secondary source for utilization and outcomes analysis, but they reflect what was billed, not the full clinical picture. Remember the line: a registry adds clinical depth and longitudinal follow-up; an index simply points you back to records. Reporting to many registries is mandated by law, which is why their data quality and completeness are audited.

National aggregate sources round out the picture. The Healthcare Cost and Utilization Project (HCUP) databases, sponsored by the Agency for Healthcare Research and Quality (AHRQ), and the National Hospital Discharge Survey compile de-identified discharge data for research and policy. Vital records, meaning birth and death certificates filed with the state and rolled up by the National Center for Health Statistics (NCHS), are another mandated secondary source.

The thread connecting all of them is that data are abstracted out of the primary record into a purpose-built store, and each carries a defined reporting obligation, retention rule, and quality-audit expectation that the RHIT is responsible for upholding. In short, the patient's chart is the single primary source, and every registry, index, claims file, and national survey built from it is a secondary data source whose accuracy depends on the quality of that original documentation.

Test Your Knowledge

What is the key difference between a disease registry and a disease index?

Test Your Knowledge

In the cancer registry workflow, which step involves identifying reportable cancer cases from sources such as pathology reports and the disease index?

Test Your Knowledge

Which secondary data source serves as the permanent index linking each patient to their unique medical record number across the enterprise?
