Business Context
A state public-health agency is migrating immunization history from multiple legacy EHR systems into a new registry. The conversion runs in daily batches (e.g., BATCH_2025_01_15) and the agency must certify each batch before it can be used for clinical decision support and regulatory reporting. A single bad batch can cause incorrect vaccine forecasting, missed school compliance notifications, and audit findings.
You are given three tables: a patient dimension (patients), a reference list of program-approved vaccine codes (vaccine_code_reference), and the converted dose facts (converted_immunization_doses). Your job is to produce a batch-level quality report.
Task
Write a SQL query that returns one row per batch_id with counts of validation failures and a final pass/fail flag.
Requirements
For each batch_id, compute:
total_doses: total rows in converted_immunization_doses for that batch.
invalid_vaccine_code_cnt: doses whose vaccine_code does not exist in vaccine_code_reference.
invalid_admin_date_cnt: doses where administration_date is NULL, in the future, or earlier than the patient’s date_of_birth.
missing_lot_number_cnt: doses where the vaccine requires a lot number (requires_lot_number = TRUE) and lot_number is NULL or empty/whitespace.
duplicate_dose_cnt: count of duplicate rows beyond the first within a batch, where duplicates are defined by the natural key:
- (
batch_id, patient_id, vaccine_code, administration_date)
If a group has N rows, it contributes N-1 duplicates.
total_error_cnt: sum of the four error counts above.
meets_program_standards: TRUE if total_error_cnt = 0, else FALSE.
Return results ordered by batch_id ascending.
Table Definitions
patients
| column | type | description |
|---|
| patient_id | INT | Unique patient identifier (PK) |
| date_of_birth | DATE | Patient date of birth |
vaccine_code_reference
| column | type | description |
|---|
| vaccine_code | VARCHAR(20) | Program-approved vaccine code (PK) |
| vaccine_name | VARCHAR(100) | Vaccine display name |
| requires_lot_number | BOOLEAN | Whether lot number is required |
converted_immunization_doses
| column | type | description |
|---|
| dose_id | BIGINT | Unique converted dose record (PK) |
| batch_id | VARCHAR(40) | Conversion batch identifier |
| patient_id | INT | Patient receiving the dose (FK to patients) |
| vaccine_code | VARCHAR(20) | Converted vaccine code (FK to reference when valid) |
| administration_date | DATE | Date administered (nullable) |
| lot_number | VARCHAR(50) | Manufacturer lot number (nullable) |
| source_system | VARCHAR(30) | Legacy source system |
Sample Data
patients
| patient_id | date_of_birth |
|---|
| 101 | 2019-06-15 |
| 102 | 1985-02-10 |
| 103 | 2021-01-05 |
vaccine_code_reference
| vaccine_code | vaccine_name | requires_lot_number |
|---|
| FLU | Influenza, seasonal | TRUE |
| MMR | Measles, mumps, rubella | TRUE |
| COVID19 | COVID-19 mRNA | FALSE |
converted_immunization_doses
| dose_id | batch_id | patient_id | vaccine_code | administration_date | lot_number | source_system |
|---|
| 9001 | BATCH_2025_01_15 | 101 | FLU | 2024-10-01 | A123 | LegacyA |
| 9002 | BATCH_2025_01_15 | 101 | FLU | 2024-10-01 | A123 | LegacyA |
| 9003 | BATCH_2025_01_15 | 102 | XYZ | 2024-05-01 | Z999 | LegacyB |
| 9004 | BATCH_2025_01_15 | 103 | MMR | 2020-12-31 | | LegacyA |
| 9005 | BATCH_2025_01_16 | 102 | COVID19 | 2024-06-01 | NULL | LegacyB |
Expected Output (for the sample data)
| batch_id | total_doses | invalid_vaccine_code_cnt | invalid_admin_date_cnt | missing_lot_number_cnt | duplicate_dose_cnt | total_error_cnt | meets_program_standards |
|---|
| BATCH_2025_01_15 | 4 | 1 | 1 | 1 | 1 | 4 | FALSE |
| BATCH_2025_01_16 | 1 | 0 | 0 | 0 | 0 | 0 | TRUE |