Design Offline Sync Event Pipeline

Scenario

You are building the synchronization pipeline for an offline-first mobile app where users can create, edit, and delete records while disconnected. The current implementation pushes full-table snapshots when connectivity returns, which causes duplicate writes, lost updates, and inconsistent state across devices. A recent escalation showed that users editing the same entity from multiple devices can see stale data for hours, and support cannot reliably reconstruct what happened. You need a pipeline that keeps local storage and the remote backend in sync with predictable conflict handling and operational visibility.

Current State

Component	Status
Mobile local store	Alten mobile app using SQLite/Room with pending_ops table
Sync transport	HTTPS sync API with periodic retries
Backend ingestion	Node.js service writing directly to PostgreSQL
Change propagation	Polling endpoint returning updated_at-based deltas
Orchestration	Alten backend cron jobs and Redis-backed work queues
Analytics/debugging	Basic application logs only

Scale: 8M MAU, 1.2M DAU, ~25 syncable records/user, 15K writes/sec peak after reconnect events, 3 devices/user at P95, target sync convergence <30s after connectivity returns, 99.95% durability for acknowledged writes.

Question

How would you redesign this as a resilient data pipeline so local mutations, server-side state, and downstream change feeds stay consistent under retries, out-of-order delivery, and multi-device conflicts? Explain the ingestion, processing, storage, and monitoring approach you would use to guarantee idempotent replay and observable reconciliation at this scale.

Scenario

Current State

Component	Status
Mobile local store	Alten mobile app using SQLite/Room with pending_ops table
Sync transport	HTTPS sync API with periodic retries
Backend ingestion	Node.js service writing directly to PostgreSQL
Change propagation	Polling endpoint returning updated_at-based deltas
Orchestration	Alten backend cron jobs and Redis-backed work queues
Analytics/debugging	Basic application logs only

Question

Scenario

Current State

Component	Status
Mobile local store	Alten mobile app using SQLite/Room with pending_ops table
Sync transport	HTTPS sync API with periodic retries
Backend ingestion	Node.js service writing directly to PostgreSQL
Change propagation	Polling endpoint returning updated_at-based deltas
Orchestration	Alten backend cron jobs and Redis-backed work queues
Analytics/debugging	Basic application logs only

Question

Scenario

Current State

Component	Status
Mobile local store	Alten mobile app using SQLite/Room with pending_ops table
Sync transport	HTTPS sync API with periodic retries
Backend ingestion	Node.js service writing directly to PostgreSQL
Change propagation	Polling endpoint returning updated_at-based deltas
Orchestration	Alten backend cron jobs and Redis-backed work queues
Analytics/debugging	Basic application logs only

Interview Guides

Scenario

Current State

Question

Design Offline Sync Event Pipeline

Scenario

Current State

Question

Your Answer

Design Offline Sync Event Pipeline

Scenario

Current State

Question

Design Offline Sync Event Pipeline

Scenario

Current State

Question

Your Answer