Scale Email Event Ingestion Pipeline

Scenario

You own the ingestion pipeline for a B2B SaaS security product that processes email security telemetry used by detection systems and customer-facing analytics. The current design works at steady state, but a new enterprise rollout is expected to increase inbound traffic by 10x over the next quarter. Recent incidents showed delayed downstream updates, duplicate records during retries, and inconsistent counts between raw and curated datasets. You need a minimal redesign that can absorb the traffic increase without introducing a large operational footprint.

Current State

Component	Status / Technology
Event Sources	Email gateway webhooks, API collectors, internal app events
Ingestion	Python services writing directly to PostgreSQL
Processing	Cron-based Python ETL every 15 minutes
Storage	PostgreSQL for raw and transformed data
Orchestration	Basic cron on Kubernetes
Serving	Detection features and internal dashboards

Scale: 25K events/sec peak today, expected 250K events/sec peak after rollout; average payload 3-5 KB JSON; current freshness is 15-20 minutes; target is under 3 minutes for curated tables; 30-day hot retention and 1-year cold retention.
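
To make these figures concrete, here is a rough back-of-envelope sketch of what they imply for ingest bandwidth and daily volume. It uses only the rates and payload sizes stated above. Two things are assumptions rather than facts from the scenario: 1 KB is taken as 1,000 bytes, and the peak rate is treated as if it were sustained all day, so the daily numbers are an upper bound and not a forecast. The function names are illustrative.

```python
# Back-of-envelope sizing from the stated scale figures.
# Assumptions (not from the scenario): 1 KB = 1,000 bytes, and the peak rate
# is treated as sustained all day, which yields an upper bound only.

SECONDS_PER_DAY = 86_400


def peak_bandwidth_mb_per_s(events_per_sec: int, payload_kb: int) -> float:
    """Ingest bandwidth in MB/s at the given event rate and payload size."""
    return events_per_sec * payload_kb / 1_000


def daily_volume_tb(events_per_sec: int, payload_kb: int) -> float:
    """Raw volume in TB/day if the given rate were sustained for 24 hours."""
    return events_per_sec * payload_kb * SECONDS_PER_DAY / 1e9


if __name__ == "__main__":
    for label, rate in (("today", 25_000), ("after rollout", 250_000)):
        bw_low = peak_bandwidth_mb_per_s(rate, 3)
        bw_high = peak_bandwidth_mb_per_s(rate, 5)
        vol_low = daily_volume_tb(rate, 3)
        vol_high = daily_volume_tb(rate, 5)
        print(
            f"{label}: {bw_low:.0f}-{bw_high:.0f} MB/s at peak, "
            f"at most {vol_low:.1f}-{vol_high:.1f} TB/day uncompressed"
        )
```

Under these assumptions the post-rollout peak works out to roughly 750-1,250 MB/s of raw JSON, which is the order of magnitude any answer has to plan around when judging whether direct writes to PostgreSQL remain viable.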

Question

How would you redesign this ingestion pipeline in the simplest production-ready way to handle 10x traffic while preserving data quality, replayability, and operational visibility? Walk through the architecture, scaling decisions, and the trade-offs you would make to keep the system intentionally minimal.
