HEOR & RWE Glossary

This glossary defines key terminology used throughout the alx_heor library and in Real-World Evidence (RWE) / Health Economics and Outcomes Research (HEOR) studies.


Study Design Terms

Index Date

The anchor date for a patient's study timeline. All baseline and follow-up periods are calculated relative to this date. It is typically the date of the first qualifying diagnosis, or the second when the case definition requires two diagnoses.

Used in: cohort.get_cohort(), claims.get_index_dates()

Baseline Period

The time window before the index date used to assess patient characteristics, comorbidities, and exclusion criteria. Common durations: 6 months, 12 months.

Example: A 6-month baseline period means examining claims from index_date - 180 days to index_date - 1 day.

Follow-up Period

The time window after the index date for outcome assessment. Patients must remain observable (enrolled) during this period for valid outcome measurement.

Example: A 12-month follow-up means observing patients from index_date to index_date + 365 days.
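The two window examples above can be sketched with the standard library. This is a minimal illustration, not the alx_heor implementation: the function name `study_windows` and its parameters are hypothetical, and it assumes fixed day counts (180 and 365) rather than calendar months.

```python
from datetime import date, timedelta


def study_windows(index_date, baseline_days=180, followup_days=365):
    """Return (baseline_start, baseline_end, followup_start, followup_end).

    Hypothetical helper: the baseline window ends the day before the index
    date, and the follow-up window starts on the index date itself.
    """
    baseline_start = index_date - timedelta(days=baseline_days)
    baseline_end = index_date - timedelta(days=1)
    followup_start = index_date
    followup_end = index_date + timedelta(days=followup_days)
    return baseline_start, baseline_end, followup_start, followup_end


windows = study_windows(date(2020, 7, 1))
for label, day in zip(
    ("baseline start", "baseline end", "follow-up start", "follow-up end"), windows
):
    print(f"{label}: {day.isoformat()}")
```

Keeping the baseline end at index_date - 1 day means a claim on the index date never counts toward both windows.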

Attrition

The reduction in sample size as sequential inclusion and exclusion criteria are applied. Tracked in an "attrition table" showing patient counts at each step.

Why it matters: Attrition tables are required for study transparency and reproducibility. Large drops at specific criteria may indicate data quality issues or overly restrictive criteria.

Used in: cohort.get_cohort() returns attrition tracking via CohortResult.summary()
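To make the idea of an attrition table concrete, here is a small sketch that applies criteria in order and records the count after each step. It does not reproduce cohort.get_cohort() or CohortResult.summary(); the function `build_attrition`, the patient fields, and the criteria labels are all invented for illustration.

```python
def build_attrition(patients, criteria):
    """Apply (label, keep_function) criteria in order.

    Returns the surviving patients and a list of
    (label, n_remaining, n_dropped) rows, one per step.
    """
    remaining = list(patients)
    rows = [("Starting population", len(remaining), 0)]
    for label, keep in criteria:
        kept = [p for p in remaining if keep(p)]
        rows.append((label, len(kept), len(remaining) - len(kept)))
        remaining = kept
    return remaining, rows


# Toy data: field names are assumptions, not a real table schema.
patients = [
    {"id": 1, "age": 45, "n_dx": 2},
    {"id": 2, "age": 16, "n_dx": 3},
    {"id": 3, "age": 60, "n_dx": 1},
    {"id": 4, "age": 52, "n_dx": 2},
]
criteria = [
    ("At least 2 qualifying diagnoses", lambda p: p["n_dx"] >= 2),
    ("Age 18 or older at index", lambda p: p["age"] >= 18),
]

cohort, attrition = build_attrition(patients, criteria)
for label, n_remaining, n_dropped in attrition:
    print(f"{label:<35} remaining={n_remaining} dropped={n_dropped}")
```

Because the criteria are sequential, reordering them changes the per-step drops but not the final cohort.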


Enrollment & Observability

Continuous Enrollment

A requirement that patients have uninterrupted insurance coverage for a specified period (e.g., 6 months before index + 12 months after). Small gaps (e.g., up to 1 month) are often allowed, because they frequently reflect data processing delays rather than a true loss of coverage.

Used in: enrollment.filter_continuous_enrollment(), cohort.EnrollmentCriteria

Enrollment Gap

A period in which a patient has no enrollment record in the database. Gaps longer than a study-defined threshold (commonly between 1 and 3 months) suggest the patient left the database and cannot be observed.

Used in: enrollment.calculate_enrollment_gaps()
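One way to reason about gaps and continuous enrollment together is to find the longest uncovered stretch inside the required window. The sketch below is an assumption-laden illustration, not the logic of enrollment.calculate_enrollment_gaps(): the function name `longest_gap_days`, the inclusive (start, end) span format, and the 31-day allowance are all hypothetical. Uncovered days at the start or end of the window are counted as gaps.

```python
from datetime import date, timedelta


def longest_gap_days(spans, window_start, window_end):
    """Longest run of uncovered days inside [window_start, window_end].

    `spans` is a list of inclusive (start, end) enrollment periods.
    """
    one_day = timedelta(days=1)
    cursor = window_start  # first day not yet known to be covered
    longest = 0
    for start, end in sorted(spans):
        if end < cursor or start > window_end:
            continue  # span does not touch the remaining window
        if start > cursor:
            longest = max(longest, (start - cursor).days)
        cursor = max(cursor, end + one_day)
    if cursor <= window_end:  # uncovered tail at the end of the window
        longest = max(longest, (window_end - cursor).days + 1)
    return longest


window = (date(2019, 7, 1), date(2020, 6, 30))
short_gap = [(date(2019, 1, 1), date(2019, 12, 31)), (date(2020, 1, 20), date(2020, 12, 31))]
left_database = [(date(2019, 1, 1), date(2019, 12, 31))]

MAX_GAP_DAYS = 31  # assumed allowance; studies choose their own threshold
for name, spans in (("short_gap", short_gap), ("left_database", left_database)):
    gap = longest_gap_days(spans, *window)
    print(name, gap, "continuous" if gap <= MAX_GAP_DAYS else "not continuous")
```

The first patient has a 19-day gap and passes; the second has no coverage for the last 182 days of the window and fails.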

Censoring

In survival analysis, patients are "censored" when they become unobservable (enrollment loss or study end). The censoring date marks when follow-up ends, not necessarily when the event of interest occurred.

Types:

  • Right censoring: Patient leaves observation before event occurs (most common)
  • Administrative censoring: Follow-up ends at study end date

Used in: enrollment.get_censor_dates()
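The censoring rule can be sketched as "follow-up ends at the earliest of the event, enrollment loss, or study end". This is an illustrative helper, not the behavior of enrollment.get_censor_dates(); the name `follow_up_end` and its arguments are assumptions.

```python
from datetime import date


def follow_up_end(index_date, enrollment_end, study_end, event_date=None):
    """Return (end_date, event_observed) for one patient.

    The censor date is the earlier of enrollment loss and study end.
    An event only counts if it falls between index and the censor date.
    """
    censor_date = min(enrollment_end, study_end)
    if event_date is not None and index_date <= event_date <= censor_date:
        return event_date, True
    return censor_date, False


index = date(2020, 1, 1)
study_end = date(2020, 12, 31)
enrolled_until = date(2020, 9, 30)

print(follow_up_end(index, enrolled_until, study_end))                     # censored at enrollment loss
print(follow_up_end(index, enrolled_until, study_end, date(2020, 6, 15)))  # event observed
print(follow_up_end(index, enrolled_until, study_end, date(2020, 11, 1)))  # event after censoring is not observed
```

The `event_observed` flag is what a survival model typically consumes alongside the follow-up duration.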


Diagnosis & Procedure Codes

ICD-9 / ICD-10

International Classification of Diseases coding systems for diagnoses.

  • ICD-9: Used in the US before October 1, 2015
  • ICD-10: Used in the US from October 1, 2015 onward

Pitfall: For studies spanning the transition date, include BOTH ICD-9 and ICD-10 codes for your target conditions.

Example: Myasthenia gravis = 358.0 (ICD-9) and G70.0, G70.00, G70.01 (ICD-10), which appear as G700, G7000, G7001 when codes are stored without the decimal point.

Used in: claims.get_claims(), cohort.DiagnosisCriteria
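A study spanning the transition can match both code systems in a single pass. The sketch below uses the myasthenia gravis codes from the example and assumes prefix matching on codes with the decimal point removed; the names `normalize_code` and `is_mg_diagnosis` are hypothetical and are not part of claims.get_claims() or cohort.DiagnosisCriteria.

```python
# Prefixes taken from the glossary example, stored without decimal points.
MG_ICD9_PREFIXES = ("3580",)
MG_ICD10_PREFIXES = ("G700",)


def normalize_code(code):
    """Strip whitespace and decimal points, and upper-case the code."""
    return code.strip().replace(".", "").upper()


def is_mg_diagnosis(code):
    """True if the code falls under either the ICD-9 or ICD-10 prefix list."""
    return normalize_code(code).startswith(MG_ICD9_PREFIXES + MG_ICD10_PREFIXES)


for raw in ("358.00", "G70.01", "g7000", "G701", "3581"):
    print(raw, is_mg_diagnosis(raw))
```

Prefix matching without a code-version flag works here because these two prefixes cannot collide; for conditions where an ICD-9 code and an ICD-10 code share leading characters, filtering on the service date or a version column is safer.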

CPT / HCPCS Codes

Procedure coding systems for medical services.

  • CPT (Current Procedural Terminology): 5-character codes for medical procedures and services (most are five digits)
  • HCPCS (Healthcare Common Procedure Coding System): Level I is CPT; Level II adds alphanumeric codes, including J-codes for drugs


Medication Coding

NDC (National Drug Code)

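An identifier for a drug product down to the package level, stored as 11 digits in most claims data. Found in pharmacy claims (retail and mail-order prescriptions).

Format: 5-4-2 segments (labeler-product-package). FDA listings use 10-digit forms (4-4-2, 5-3-2, or 5-4-1), which are zero-padded to 5-4-2 for billing.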

Example: 00074-3799-01 = Humira 40mg pen

Pitfall: NDCs change frequently as new manufacturers and repackagers enter the market. Always verify codes against the current FDA NDC Directory.

Used in: medications.lookup_medications()
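Because code lists often arrive hyphenated while claims tables store 11 unbroken digits, a normalization step is usually needed before matching. The following is a minimal sketch, assuming hyphenated input; `to_ndc11` is a hypothetical helper, not part of medications.lookup_medications().

```python
def to_ndc11(ndc):
    """Convert a hyphenated NDC to 11 digits in 5-4-2 layout, without hyphens.

    Each segment is left-padded with zeros, which covers the 10-digit
    4-4-2, 5-3-2, and 5-4-1 layouts as well as already-padded input.
    """
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected three hyphen-separated digit groups: {ndc!r}")
    labeler, product, package = parts
    if len(labeler) > 5 or len(product) > 4 or len(package) > 2:
        raise ValueError(f"segment too long for 5-4-2 layout: {ndc!r}")
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


print(to_ndc11("00074-3799-01"))  # already 5-4-2
print(to_ndc11("0074-3799-01"))   # 4-4-2, labeler gets a leading zero
```

An unhyphenated 10-digit NDC cannot be padded reliably, since the position of the missing zero depends on which segment is short; that is why the sketch insists on hyphens.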

J-code

A subset of HCPCS Level II codes identifying drugs administered by healthcare providers (infusions, injections). Found in medical claims, not pharmacy claims.

Format: J + 4 digits (e.g., J1300 for eculizumab)

Why it matters: Expensive specialty drugs (biologics) are often physician-administered and appear as J-codes rather than NDC codes.

Used in: medications.lookup_medications()
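Since J-codes live in medical claims and NDCs in pharmacy claims, it can help to classify each code in a drug list before deciding which table to search. This sketch relies only on the formats described above (J + 4 digits, 11-digit NDC); the function `drug_code_type` is an illustrative assumption, not library behavior.

```python
import re


def drug_code_type(code):
    """Classify a drug code as 'J-code', 'NDC', or 'unknown' by its format."""
    cleaned = code.strip().upper()
    if re.fullmatch(r"J\d{4}", cleaned):
        return "J-code"  # look in medical claims
    if re.fullmatch(r"\d{11}", cleaned.replace("-", "")):
        return "NDC"     # look in pharmacy claims
    return "unknown"


for code in ("J1300", "00074-3799-01", "99213"):
    print(code, "->", drug_code_type(code))
```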


Medication Adherence

PDC (Proportion of Days Covered)

The standard medication adherence metric in pharmacoepidemiology.

Formula: PDC = (Days with medication available) / (Observation period)

Thresholds:

  • PDC >= 0.80 (80%): Adherent
  • PDC 0.50-0.79: Partial adherence
  • PDC < 0.50: Non-adherent

Used in: medications.calculate_pdc()
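The formula can be worked through on a small set of fills. This sketch counts each calendar day at most once, so overlapping fills do not push PDC above 1. It is an assumption-based illustration rather than the medications.calculate_pdc() implementation: the names `pdc` and `adherence_category` and the (fill_date, days_supply) input format are hypothetical, and early refills are not shifted forward, which some study protocols do.

```python
from datetime import date, timedelta


def pdc(fills, period_start, period_end):
    """Proportion of days in [period_start, period_end] with drug on hand.

    `fills` is a list of (fill_date, days_supply) pairs.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if period_start <= day <= period_end:
                covered.add(day)  # a set ignores days covered twice
    period_days = (period_end - period_start).days + 1
    return len(covered) / period_days


def adherence_category(value):
    """Map a PDC value to the thresholds listed in this glossary."""
    if value >= 0.80:
        return "Adherent"
    if value >= 0.50:
        return "Partial adherence"
    return "Non-adherent"


fills = [(date(2021, 1, 1), 30), (date(2021, 1, 25), 30), (date(2021, 3, 15), 30)]
value = pdc(fills, date(2021, 1, 1), date(2021, 4, 10))  # 100-day period
print(round(value, 2), adherence_category(value))
```

Here the first two fills overlap by six days and the third runs past the end of the period, leaving 81 covered days out of 100.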

Treatment Episode

A contiguous period of treatment with no long interruption in supply. Consecutive episodes are separated by gaps: a gap of more than 45 days typically indicates that treatment was discontinued, and the next fill starts a new episode.

Used in: medications.identify_treatment_episodes()
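Episode construction can be sketched by walking through fills in date order and starting a new episode whenever the gap exceeds the threshold. The gap is measured here from the last day of supply to the next fill date, which is one common convention but an assumption on my part; `treatment_episodes` is a hypothetical name and does not mirror medications.identify_treatment_episodes().

```python
from datetime import date, timedelta


def treatment_episodes(fills, max_gap_days=45):
    """Group (fill_date, days_supply) fills into inclusive (start, end) episodes."""
    episodes = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply - 1)
        if episodes:
            start, end = episodes[-1]
            gap_days = (fill_date - end).days - 1  # uncovered days before this fill
            if gap_days <= max_gap_days:
                episodes[-1] = (start, max(end, supply_end))
                continue
        episodes.append((fill_date, supply_end))
    return episodes


fills = [(date(2021, 1, 1), 30), (date(2021, 2, 5), 30), (date(2021, 6, 1), 30)]
for start, end in treatment_episodes(fills):
    print(start.isoformat(), "to", end.isoformat())
```

The 5-day gap before the February fill keeps it in the first episode, while the 86-day gap before the June fill starts a second one.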


Data Sources

Claims Data

Healthcare encounter records captured through insurance billing. Includes diagnoses, procedures, prescriptions, and dates of service. The foundation of retrospective database studies.

Strengths: Large sample sizes, longitudinal follow-up, real-world practice patterns

Limitations: No clinical notes, lab values limited, coding errors possible

RWE (Real-World Evidence)

Evidence derived from observational healthcare data (claims, EHR, registries) rather than randomized controlled trials. Used for effectiveness studies, safety surveillance, and health economics analyses.

HEOR (Health Economics and Outcomes Research)

A field combining clinical outcomes research with health economics. Studies often use RWE to assess comparative effectiveness, cost-effectiveness, and disease burden.


Data Source-Specific Terms

IQVIA PharMetrics

A US commercial claims database covering approximately 150 million patients. Hosted on Amazon Redshift.

Key columns: pat_id, from_dt, diag1-diag12

Optum DOD

A US claims database covering commercial and Medicare Advantage enrollees, with detailed clinical and cost data. Hosted on Amazon Redshift.

Key columns: patid, svcdate, diag1-diag5

pat_id vs iq_patient_id

IQVIA tables carry two patient identifiers that are easy to confuse:

  • pat_id: Primary identifier in claims and enrollment tables. Always use this for joins.
  • iq_patient_id: Secondary identifier in the enroll table only. Legacy code may use this for grouping after joining to get demographics.

See: Enrollment module notes


See Also