Design Offline Sync Event Pipeline

Scenario

You are designing the synchronization pipeline for a mobile marketplace where drivers and customers can create or update entities while offline, then reconnect hours later on unstable networks. Recent incidents showed duplicated actions, out-of-order updates, and mismatches between what the mobile app displays and what backend systems persist. Product and operations teams now want a resilient sync design that supports eventual consistency, conflict handling, auditability, and near-real-time downstream visibility for operational dashboards. The pipeline must preserve user intent while preventing duplicate writes when the client retries aggressively after reconnect.
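To make the duplicate-write requirement concrete, here is a minimal in-memory sketch of idempotent mutation handling. It is an illustration under stated assumptions, not a description of the existing system: `Mutation`, `SyncEndpoint`, and the client-generated `mutation_id` are hypothetical names, and the dictionaries stand in for durable storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mutation:
    device_id: str
    mutation_id: str  # hypothetical: unique ID assigned by the client when the mutation is queued offline
    entity_id: str
    payload: dict


@dataclass
class SyncEndpoint:
    """Hypothetical server-side handler; dicts stand in for durable tables."""
    applied: dict = field(default_factory=dict)   # (device_id, mutation_id) -> stored result
    entities: dict = field(default_factory=dict)  # entity_id -> current state
    write_count: int = 0

    def apply(self, m: Mutation) -> dict:
        key = (m.device_id, m.mutation_id)
        # A retry of an already-applied mutation returns the stored result
        # and performs no second write.
        if key in self.applied:
            return self.applied[key]
        self.entities.setdefault(m.entity_id, {}).update(m.payload)
        self.write_count += 1
        result = {"entity_id": m.entity_id, "write_number": self.write_count}
        self.applied[key] = result
        return result


endpoint = SyncEndpoint()
m = Mutation("device-1", "mut-001", "order-42", {"status": "picked_up"})
first = endpoint.apply(m)
retry = endpoint.apply(m)  # same mutation resent after reconnect
```

In a real deployment the dedup lookup and the entity write would need to commit atomically; this sketch only shows the contract a retrying client relies on.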

Current State

Component	Status / Technology
Mobile event capture	iOS/Android app with local SQLite queue
API layer	Sync REST endpoints behind API gateway
Operational store	PostgreSQL primary with read replicas
Async messaging	Apache Kafka used for backend domain events
Analytics pipeline	Airflow batch jobs loading warehouse tables hourly
Monitoring	Basic API latency and DB CPU dashboards

Scale: 12M monthly active devices, 1.8M daily active devices, peak reconnect bursts of 45K requests/sec after network recovery, up to 150 queued mutations per device, payloads of 1-8 KB.

Targets: operational state visible within 2 seconds; warehouse freshness under 10 minutes.
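As a rough sizing aid, the figures stated above can be turned into back-of-envelope bounds. The sketch assumes one mutation per request and decimal units (1 MB = 1000 KB); both are assumptions for illustration, since the scenario does not say whether mutations are batched.

```python
# Figures taken from the scale statement.
PEAK_REQUESTS_PER_SEC = 45_000
MAX_QUEUED_MUTATIONS_PER_DEVICE = 150
PAYLOAD_KB_MIN = 1
PAYLOAD_KB_MAX = 8


def peak_ingress_mb_per_sec(payload_kb: int) -> float:
    """Ingress bandwidth at peak burst, assuming one mutation per request."""
    return PEAK_REQUESTS_PER_SEC * payload_kb / 1000


def device_backlog_kb(payload_kb: int) -> int:
    """Size of a full offline queue on one device at the given payload size."""
    return MAX_QUEUED_MUTATIONS_PER_DEVICE * payload_kb


ingress_low = peak_ingress_mb_per_sec(PAYLOAD_KB_MIN)   # 45.0 MB/s
ingress_high = peak_ingress_mb_per_sec(PAYLOAD_KB_MAX)  # 360.0 MB/s
backlog_low = device_backlog_kb(PAYLOAD_KB_MIN)         # 150 KB
backlog_high = device_backlog_kb(PAYLOAD_KB_MAX)        # 1200 KB
```

Under these assumptions, peak ingress falls between 45 and 360 MB/s, and a single device with a full queue replays between 150 KB and 1.2 MB on reconnect.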

Question

How would you design the end-to-end data synchronization pipeline so offline mobile mutations can be replayed safely, ordered correctly where needed, reconciled on conflict, and propagated to both operational systems and downstream analytical stores without double-processing?
