Task
Meta’s event pipeline for Facebook activity can receive the same logical event multiple times, and some duplicates arrive late. Write a SQL query to return one canonical row per event_id, keeping the most recently ingested record.
Requirements
- Only consider events with a non-null
event_id.
- Deduplicate by
event_id across the full table, even if duplicates arrive on different ingestion dates.
- Keep the row with the greatest
ingested_at for each event_id.
- If two rows for the same
event_id have the same ingested_at, keep the one with the greater record_id.
- Return
event_id, user_id, event_name, event_time, and ingested_at.
- Order the final result by
event_id ascending.
Table Definition
| column | type | description |
|---|
| record_id | INT | Unique row identifier for each ingested record |
| event_id | VARCHAR(50) | Logical event identifier used for deduplication |
| user_id | INT | Meta user identifier |
| event_name | VARCHAR(50) | Event type |
| event_time | TIMESTAMP | Time the event actually occurred |
| ingested_at | TIMESTAMP | Time the record was received by the warehouse |
| source_surface | VARCHAR(50) | Meta surface that produced the event |
Sample Data
| record_id | event_id | user_id | event_name | event_time | ingested_at | source_surface |
|---|
| 103 | evt_1003 | 12 | like | 2024-04-01 09:02:00 | 2024-04-01 09:02:10 | Facebook |
| 101 | evt_1001 | 10 | impression | 2024-04-01 09:00:00 | 2024-04-01 09:00:05 | Facebook |
| 108 | evt_1005 | 14 | click | 2024-04-01 09:05:00 | 2024-04-01 09:05:20 | Reels |
| 110 | evt_1006 | 15 | comment | 2024-04-01 09:06:00 | 2024-04-01 09:06:10 | Facebook |
| 112 | evt_1007 | 16 | share | 2024-04-01 09:07:00 | 2024-04-01 09:07:05 | Instagram |
Expected Output
| event_id | user_id | event_name | event_time | ingested_at |
|---|
| evt_1001 | 10 | impression | 2024-04-01 09:00:00 | 2024-04-01 09:10:00 |
| evt_1002 | 11 | click | 2024-04-01 09:01:00 | 2024-04-01 09:01:03 |
| evt_1003 | 12 | like | 2024-04-01 09:02:00 | 2024-04-01 09:12:00 |
| evt_1004 | 13 | impression | 2024-04-01 09:03:00 | 2024-04-01 09:03:05 |
| evt_1005 | 14 | click | 2024-04-01 09:05:00 | 2024-04-01 09:05:20 |
| evt_1006 | 15 | comment | 2024-04-01 09:06:00 | 2024-04-01 09:06:10 |
| evt_1007 | 16 | share | 2024-04-01 09:07:00 | 2024-04-01 09:07:05 |