Design Real-Time Ad Event Pipeline

Scenario

You are designing a real-time pipeline for an advertising platform that needs up-to-date click and impression aggregates for pacing, billing, and advertiser reporting. Today, event data lands in hourly files and downstream teams regularly see mismatches between delivery dashboards and finance reports. The gap has become an executive escalation because campaign managers need minute-level visibility during peak spend windows, while finance requires deduplicated, auditable counts. You need to support low-latency aggregation without losing the ability to reprocess historical data when schemas or attribution rules change.

Current State

Component	Status / Technology
Event Producers	Mobile clients, web clients, ad delivery servers
Ingestion	Snap Pub/Sub edge collectors writing JSON events
Processing	Hourly Spark batch jobs
Storage	S3 data lake + Snowflake warehouse
Orchestration	Apache Airflow 2.x
Consumption	Internal dashboards, billing exports, campaign pacing services

Scale: ~220K events/sec average, 900K peak during major campaigns, ~9B events/day, 1-2 KB per event, 15-minute freshness target for aggregates, 13-month retention, late events common within 2 hours.

Question

How would you design the end-to-end real-time event processing pipeline so clicks and impressions are aggregated accurately and queryable with low latency, while still supporting replay, backfills, and reconciliation against downstream billing data?

Scenario

Current State

Component	Status / Technology
Event Producers	Mobile clients, web clients, ad delivery servers
Ingestion	Snap Pub/Sub edge collectors writing JSON events
Processing	Hourly Spark batch jobs
Storage	S3 data lake + Snowflake warehouse
Orchestration	Apache Airflow 2.x
Consumption	Internal dashboards, billing exports, campaign pacing services

Scale: ~220K events/sec average, 900K peak during major campaigns, ~9B events/day, 1-2 KB per event, 15-minute freshness target for aggregates, 13-month retention, late events common within 2 hours.

Scenario

Current State

Component	Status / Technology
Event Producers	Mobile clients, web clients, ad delivery servers
Ingestion	Snap Pub/Sub edge collectors writing JSON events
Processing	Hourly Spark batch jobs
Storage	S3 data lake + Snowflake warehouse
Orchestration	Apache Airflow 2.x
Consumption	Internal dashboards, billing exports, campaign pacing services

Scale: ~220K events/sec average, 900K peak during major campaigns, ~9B events/day, 1-2 KB per event, 15-minute freshness target for aggregates, 13-month retention, late events common within 2 hours.

Scenario

Current State

Component	Status / Technology
Event Producers	Mobile clients, web clients, ad delivery servers
Ingestion	Snap Pub/Sub edge collectors writing JSON events
Processing	Hourly Spark batch jobs
Storage	S3 data lake + Snowflake warehouse
Orchestration	Apache Airflow 2.x
Consumption	Internal dashboards, billing exports, campaign pacing services

Scale: ~220K events/sec average, 900K peak during major campaigns, ~9B events/day, 1-2 KB per event, 15-minute freshness target for aggregates, 13-month retention, late events common within 2 hours.

Interview Guides

Scenario

Current State

Question

Design Real-Time Ad Event Pipeline

Scenario

Current State

Question

Your Answer

Design Real-Time Ad Event Pipeline

Scenario

Current State

Question

Design Real-Time Ad Event Pipeline

Scenario

Current State

Question

Your Answer