A fintech company is migrating millions of customer profiles from a legacy KYC platform to a new customer system ahead of a regulatory audit. During the cutover window, the data engineering team generates a customer_id_map table that links each legacy_customer_id to a new_customer_id. Because migration jobs can be re-run, the mapping table may contain duplicate mappings (one legacy ID mapped to multiple new IDs). Additionally, some customers may be missing from the new system or have mismatched attributes (e.g., KYC status).
Your task is to produce a triage report that helps the cutover team quickly identify the most severe issues first.
Write a SQL query that returns one row per legacy customer, showing the mapped new customer (if any), match flags for key attributes, and a single migration status.
legacy_customers.legacy_customer_id.email_match (1/0) using case-insensitive email comparison.kyc_match (1/0) using exact string match.migration_status with this precedence:
MISSING_MAPPING if no mapping existsDUPLICATE_MAPPING if a legacy ID maps to more than one distinct new IDMISSING_IN_NEW if mapped but the new_customers record does not existATTRIBUTE_MISMATCH if email or KYC differsOK otherwisenew_customer_id as the minimum mapped ID, and show latest_mapping_created_at as the maximum mapping timestamp.DUPLICATE_MAPPING, ATTRIBUTE_MISMATCH, MISSING_MAPPING, OK (then by legacy_customer_id).legacy_customers| column | type | description |
|---|---|---|
| legacy_customer_id | INT | Primary key in the legacy system |
| full_name | VARCHAR(100) | Customer name in legacy |
| VARCHAR(255) | Customer email in legacy | |
| kyc_status | VARCHAR(20) | Legacy KYC status |
| created_at | TIMESTAMP | When the legacy record was created |
customer_id_map| column | type | description |
|---|---|---|
| legacy_customer_id | INT | Legacy customer ID being migrated |
| new_customer_id | INT | Mapped customer ID in the new system |
| created_at | TIMESTAMP | When the mapping row was generated |
new_customers| column | type | description |
|---|---|---|
| new_customer_id | INT | Primary key in the new system |
| VARCHAR(255) | Customer email in the new system | |
| kyc_status | VARCHAR(20) | New-system KYC status |
| activated_at | TIMESTAMP | When the customer was activated |
(See the provided sample rows in each table.)
| legacy_customer_id | new_customer_id | latest_mapping_created_at | email_match | kyc_match | migration_status |
|---|---|---|---|---|---|
| 102 | 9002 | 2024-02-01 01:05:00 | 1 | 1 | DUPLICATE_MAPPING |
| 103 | 9004 | 2024-02-01 01:00:00 | 1 | 0 | ATTRIBUTE_MISMATCH |
| 104 | NULL | NULL | 0 | 0 | MISSING_MAPPING |
| 101 | 9001 | 2024-02-01 01:00:00 | 1 | 1 | OK |