Skip to content

Data Sources Reference

Complete reference for supported data sources and their column mappings.

Supported Sources

Source Database Status Connection
IQVIA Pharmetrics Redshift Supported RedshiftConnection
Optum DOD Redshift Planned RedshiftConnection
Komodo Snowflake Planned SnowflakeConnection
Loopback Redshift Planned RedshiftConnection
MDV (Japan) Redshift Planned RedshiftConnection

Column Mappings

Patient Identifiers

Logical Name IQVIA Optum Komodo
patient_id pat_id patid patient_id

Service Dates

Logical Name IQVIA Optum Komodo
service_date from_dt svcdate service_date
service_end_date to_dt tsvcdat service_end_date
month_id month_id eligeff month_id

Diagnosis Codes

Logical Name IQVIA Optum Komodo
diagnosis diag1-diag12, diag_admit diag1-diag5 dx1-dx5

IQVIA Diagnosis Columns

IQVIA has 13 diagnosis columns: diag1 through diag12 plus diag_admit. By default, all positions are searched.

Medication Codes

Logical Name IQVIA Optum Komodo
ndc ndc ndc ndc
procedure_code proc1 proc1 procedure_code

Place of Service

Logical Name IQVIA Optum Komodo
place_of_service pos pos pos
record_type rectype tos claim_type

Demographics

Logical Name IQVIA Optum Komodo
year_of_birth der_yob yrdob birth_year
sex der_sex gdr_cd gender
payer_type pay_type bus payer_type

Table Patterns

Claims Tables

Source Pattern Example
IQVIA claims_{year} claims_2024
Optum medical_{year} medical_2024
Komodo claims_{year} claims_2024

Enrollment Tables

Source Pattern Example
IQVIA enroll2_{year} enroll2_2024
Optum member_{year} member_2024
Komodo enrollment_{year} enrollment_2024

Lookup Tables (IQVIA)

Table Description
pp_rx_lookup NDC to generic name mapping
pp_dx_lookup Diagnosis code descriptions
pp_pr_lookup Procedure code descriptions
pp_pos_lookup Place of service codes
pp_rev_lookup Revenue codes

Record Types

IQVIA rectype

Code Description
M Medical claim
F Facility claim
P Pharmacy claim
X Professional claim

Optum tos (Type of Service)

Code Description
1 Inpatient
2 Outpatient
3 Professional
4 Other

Place of Service Codes

Common values (standardized across sources):

Code Description
11 Office
21 Inpatient Hospital
22 Outpatient Hospital
23 Emergency Room
31 Skilled Nursing
81 Independent Laboratory

Sex Codes

IQVIA der_sex

Code Description
M Male
F Female
U Unknown

Optum gdr_cd

Code Description
M Male
F Female

Komodo gender

Code Description
M Male
F Female
U Unknown

Payer Type Codes

IQVIA pay_type

Code Description Category
1 Commercial COMMERCIAL
2 Medicare MEDICARE
3 Medicaid MEDICAID
4 Self-pay OTHER
5 Other OTHER

Optum bus

Code Description Category
COM Commercial COMMERCIAL
MCR Medicare MEDICARE
MCD Medicaid MEDICAID

Using in Code

from alx_heor.config import get_source_config

# Get all column mappings
config = get_source_config("iqvia")

# Access specific column
patient_col = config["columns"]["patient_id"]  # 'pat_id'
date_col = config["columns"]["service_date"]   # 'from_dt'

# Access table pattern
claims_pattern = config["claims_table_pattern"]  # 'claims_{year}'
table_2024 = claims_pattern.format(year=2024)    # 'claims_2024'

Source-Specific Notes

IQVIA

  • Uses der_yob (derived year of birth) - already cleaned
  • der_sex is validated (M/F/U only)
  • Claims tables partitioned by year
  • Enrollment in enroll2_{year} (not enroll_{year})

IQVIA Patient ID Pitfall

IQVIA has two patient ID columns that can cause confusion:

  • pat_id: Primary identifier in claims tables (claims_YYYY) and enrollment tables (enroll2_YYYY). Always use this for joins.
  • iq_patient_id: Secondary identifier found only in the enroll table (not enroll2_YYYY). Legacy code sometimes joins claims to enroll via pat_id to get demographics, then groups by iq_patient_id for patient-level analysis.

When joining claims to enrollment data, always use pat_id. The iq_patient_id column does not exist in claims tables.

Optum

  • Uses yrdob directly from source
  • gdr_cd only has M/F (no unknown)
  • eligeff used as month identifier for enrollment

Komodo

  • Column names are more descriptive (e.g., patient_id not pat_id)
  • Requires Snowflake connection
  • Different authentication method