
NovaCart stores customer signups in a raw intake table and sometimes receives duplicate records for the same person. Write a SQL query to identify duplicates and mark which rows should be kept versus flagged for review.
email_address and signup_date.created_at; if there is a tie, keep the smallest customer_id.record_status column with values keep or flag_duplicate.customer_signups| Column | Type | Description |
|---|---|---|
| customer_id | INT | Unique row identifier |
| full_name | VARCHAR(100) | Customer name as received from source systems |
| email_address | VARCHAR(255) | Customer email; may be NULL |
| signup_date | DATE | Date the signup is attributed to |
| source_system | VARCHAR(50) | Source that created the record |
| created_at | TIMESTAMP | Time the row was inserted |
| is_active | BOOLEAN | Whether the signup is currently active |
| customer_id | full_name | email_address | signup_date | source_system | created_at | is_active |
|---|---|---|---|---|---|---|
| 105 | Ava Reed | ava@example.com | 2024-02-03 | mobile_app | 2024-02-03 09:00:00 | true |
| 101 | Ava Reed | ava@example.com | 2024-02-03 | web | 2024-02-03 08:15:00 | true |
| 108 | Noah Kim | noah@example.com | 2024-02-05 | partner_api | 2024-02-05 12:00:00 | false |
| 103 | Noah Kim | noah@example.com | 2024-02-05 | web | 2024-02-05 10:30:00 | true |
| 110 | Mia Chen | mia@example.com | 2024-02-07 | web | 2024-02-07 11:00:00 | true |
| duplicate_key | customer_id | email_address | signup_date | created_at | row_rank | record_status |
|---|---|---|---|---|---|---|
| ava@example.com | 2024-02-03 | 101 | ava@example.com | 2024-02-03 | 2024-02-03 08:15:00 | 1 |
| ava@example.com | 2024-02-03 | 105 | ava@example.com | 2024-02-03 | 2024-02-03 09:00:00 | 2 |
| noah@example.com | 2024-02-05 | 103 | noah@example.com | 2024-02-05 | 2024-02-05 10:30:00 | 1 |
| noah@example.com | 2024-02-05 | 108 | noah@example.com | 2024-02-05 | 2024-02-05 12:00:00 | 2 |
| Column | Type | Description |
|---|---|---|
| customer_idPK | INT | Unique row identifier for each signup record |
| full_name | VARCHAR(100) | Customer name from the source system |
| email_address | VARCHAR(255) | Customer email used to detect duplicates |
| signup_date | DATE | Business signup date used in duplicate matching |
| source_system | VARCHAR(50) | Source that created the signup record |
| created_at | TIMESTAMP | Insertion timestamp used to decide which duplicate to keep |
| is_active | BOOLEAN | Whether the signup is currently active |
| customer_id | full_name | email_address | signup_date | source_system | created_at | is_active |
|---|---|---|---|---|---|---|
| 105 | Ava Reed | ava@example.com | 2024-02-03 | mobile_app | 2024-02-03 09:00:00 | true |
| 101 | Ava Reed | ava@example.com | 2024-02-03 | web | 2024-02-03 08:15:00 | true |
| 112 | Liam Stone | liam@example.com | 2024-02-10 | web | 2024-02-10 07:30:00 | true |
| 108 | Noah Kim | noah@example.com | 2024-02-05 | partner_api | 2024-02-05 12:00:00 | false |
| 103 | Noah Kim | noah@example.com | 2024-02-05 | web | 2024-02-05 10:30:00 | true |
| 110 | Mia Chen | mia@example.com | 2024-02-07 | web | 2024-02-07 11:00:00 | true |
| 115 | Emma Park | 2024-02-08 | mobile_app | 2024-02-08 09:20:00 | true | |
| 116 | Emma Park | 2024-02-08 | web | 2024-02-08 09:25:00 | false | |
| 118 | Owen Blake | owen@example.com | partner_api | 2024-02-09 14:00:00 | true | |
| 120 | Zoe Hart | zoe@example.com | 2024-02-11 | web | 2024-02-11 08:00:00 | true |
| 119 | Zoe Hart | zoe@example.com | 2024-02-11 | mobile_app | 2024-02-11 08:00:00 | false |
| 121 | Ivy Cole | 2024-02-12 | web | 2024-02-12 10:00:00 | true | |
| 1 | Ava Reed | zoe@example.com | null | mobile_app | 2024-01-31 12:36:44 | true |
| 2 | Emma Park | null | null | mobile_app | 2024-02-11 17:04:20 | false |
| 3 | Liam Stone | noah@example.com | 2024-02-03 | mobile_app | 2024-02-13 11:10:55 | false |
| 4 | Noah Kim | zoe@example.com | 2024-02-14 | web | 2024-02-12 15:24:53 | false |
| 5 | Emma Park | mia@example.com | 2024-02-03 | partner_api | 2024-02-12 16:56:19 | true |
| 6 | Emma Park | noah@example.com | 2024-02-09 | web | 2024-02-08 16:57:08 | true |
| 7 | Ivy Cole | liam@example.com | 2024-02-10 | partner_api | 2024-02-06 09:28:33 | true |
| 8 | Noah Kim | noah@example.com | 2024-02-01 | web | 2024-02-14 00:19:06 | true |
| 9 | Emma Park | liam@example.com | 2024-02-02 | web | 2024-02-01 03:54:58 | true |
| 10 | Zoe Hart | ava@example.com | 2024-02-14 | mobile_app | 2024-02-14 16:20:54 | false |
| 11 | Mia Chen | noah@example.com | 2024-02-01 | web | 2024-02-03 04:24:25 | false |
| 12 | Owen Blake | null | 2024-02-07 | web | 2024-02-04 01:47:16 | false |
| 13 | Emma Park | noah@example.com | 2024-02-15 | mobile_app | 2024-02-03 02:52:46 | false |
| 14 | Emma Park | noah@example.com | 2024-02-08 | web | 2024-02-15 07:25:34 | false |
| 15 | Ava Reed | noah@example.com | 2024-02-03 | mobile_app | 2024-02-11 20:08:38 | true |
| duplicate_key | customer_id | email_address | signup_date | created_at | row_rank | record_status |
|---|---|---|---|---|---|---|
| ava@example.com|2024-02-03 | 101 | ava@example.com | 2024-02-03 | 2024-02-03 08:15:00 | 1 | keep |
| ava@example.com|2024-02-03 | 105 | ava@example.com | 2024-02-03 | 2024-02-03 09:00:00 | 2 | flag_duplicate |
| liam@example.com|2024-02-10 | 7 | liam@example.com | 2024-02-10 | 2024-02-06 09:28:33 | 1 | keep |
| liam@example.com|2024-02-10 | 112 | liam@example.com | 2024-02-10 | 2024-02-10 07:30:00 | 2 | flag_duplicate |
| noah@example.com|2024-02-01 | 11 | noah@example.com | 2024-02-01 | 2024-02-03 04:24:25 | 1 | keep |
| noah@example.com|2024-02-01 | 8 | noah@example.com | 2024-02-01 | 2024-02-14 00:19:06 | 2 | flag_duplicate |
| noah@example.com|2024-02-03 | 15 | noah@example.com | 2024-02-03 | 2024-02-11 20:08:38 | 1 | keep |
| noah@example.com|2024-02-03 | 3 | noah@example.com | 2024-02-03 | 2024-02-13 11:10:55 | 2 | flag_duplicate |
| noah@example.com|2024-02-05 | 103 | noah@example.com | 2024-02-05 | 2024-02-05 10:30:00 | 1 | keep |
| noah@example.com|2024-02-05 | 108 | noah@example.com | 2024-02-05 | 2024-02-05 12:00:00 | 2 | flag_duplicate |
| zoe@example.com|2024-02-11 | 119 | zoe@example.com | 2024-02-11 | 2024-02-11 08:00:00 | 1 | keep |
| zoe@example.com|2024-02-11 | 120 | zoe@example.com | 2024-02-11 | 2024-02-11 08:00:00 | 2 | flag_duplicate |