Task
Meta's business analytics team is reviewing lead data captured from Meta Lead Ads. Some leads were submitted multiple times for the same ad, and you need to identify those duplicate records and show how you would remove them while keeping one record per duplicate group.
Write a SQL query to find duplicate lead submissions, where a duplicate is defined as the same ad, email, and submission date. Keep the earliest lead_id as the canonical record.
Requirements
- Return only duplicate records from
lead_submissions.
- For each duplicate group, show the campaign name from
ad_campaigns.
- Mark which row should be kept and which rows should be removed using
ROW_NUMBER().
- Order the output by
campaign_name, submission_date, email, and lead_id.
Table Definitions
lead_submissions
| column | type | description |
|---|
| lead_id | INT | Unique lead submission ID |
| ad_id | INT | Ad that generated the lead |
| email | VARCHAR(255) | Submitted email address |
| full_name | VARCHAR(255) | Lead name |
| submission_date | DATE | Date of submission |
| source_surface | VARCHAR(100) | Meta surface where the lead was captured |
| | |
ad_campaigns
| column | type | description |
|---|
| ad_id | INT | Unique ad ID |
| campaign_name | VARCHAR(255) | Campaign tied to the ad |
| objective | VARCHAR(100) | Campaign objective |
| | |
Sample Data
lead_submissions
ad_campaigns
| ad_id | campaign_name | objective |
|---|
| 205 | Messenger Lead Gen | Leads |
| 201 | SMB Signup Q2 | Leads |
| 202 | Reels Creator Outreach | Leads |
| 203 | Shops Seller Onboarding | Conversions |
| 204 | WhatsApp Business Trial | Leads |
| 206 | Advantage+ Retail Push | Sales |
| 207 | Creator Marketplace Growth | Leads |
| 208 | Threads Brand Awareness | Reach |
| | |
Expected Output