Business Context
A fintech company is migrating millions of customer profiles from a legacy KYC platform to a new customer system ahead of a regulatory audit. During the cutover window, the data engineering team generates a customer_id_map table that links each legacy_customer_id to a new_customer_id. Because migration jobs can be re-run, the mapping table may contain duplicate mappings (one legacy ID mapped to multiple new IDs). Additionally, some customers may be missing from the new system or have mismatched attributes (e.g., KYC status).
Your task is to produce a triage report that helps the cutover team quickly identify the most severe issues first.
Task
Write a SQL query that returns one row per legacy customer, showing the mapped new customer (if any), match flags for key attributes, and a single migration status.
Requirements
- Return one row per
legacy_customers.legacy_customer_id.
- Compute
email_match (1/0) using case-insensitive email comparison.
- Compute
kyc_match (1/0) using exact string match.
- Compute
migration_status with this precedence:
MISSING_MAPPING if no mapping exists
DUPLICATE_MAPPING if a legacy ID maps to more than one distinct new ID
MISSING_IN_NEW if mapped but the new_customers record does not exist
ATTRIBUTE_MISMATCH if email or KYC differs
OK otherwise
- For display, choose the representative mapped
new_customer_id as the minimum mapped ID, and show latest_mapping_created_at as the maximum mapping timestamp.
- Order results by severity:
DUPLICATE_MAPPING, ATTRIBUTE_MISMATCH, MISSING_MAPPING, OK (then by legacy_customer_id).
Table Definitions
legacy_customers
| column | type | description |
|---|
| legacy_customer_id | INT | Primary key in the legacy system |
| full_name | VARCHAR(100) | Customer name in legacy |
| email | VARCHAR(255) | Customer email in legacy |
| kyc_status | VARCHAR(20) | Legacy KYC status |
| created_at | TIMESTAMP | When the legacy record was created |
customer_id_map
| column | type | description |
|---|
| legacy_customer_id | INT | Legacy customer ID being migrated |
| new_customer_id | INT | Mapped customer ID in the new system |
| created_at | TIMESTAMP | When the mapping row was generated |
new_customers
| column | type | description |
|---|
| new_customer_id | INT | Primary key in the new system |
| email | VARCHAR(255) | Customer email in the new system |
| kyc_status | VARCHAR(20) | New-system KYC status |
| activated_at | TIMESTAMP | When the customer was activated |
Sample Data
(See the provided sample rows in each table.)
Expected Output (for the sample data)
| legacy_customer_id | new_customer_id | latest_mapping_created_at | email_match | kyc_match | migration_status |
|---|
| 102 | 9002 | 2024-02-01 01:05:00 | 1 | 1 | DUPLICATE_MAPPING |
| 103 | 9004 | 2024-02-01 01:00:00 | 1 | 0 | ATTRIBUTE_MISMATCH |
| 104 | NULL | NULL | 0 | 0 | MISSING_MAPPING |
| 101 | 9001 | 2024-02-01 01:00:00 | 1 | 1 | OK |