You are designing a sync pipeline for a mobile workflow product used by clinicians who complete forms and capture photo or PDF attachments in areas with unreliable connectivity. The current offline upload flow retries opportunistically when the device reconnects, but duplicate submissions, missing attachments, and out-of-order updates have created audit gaps and downstream reporting mismatches. A recent compliance review found that the system cannot consistently prove whether the server received the final version of a form and all associated files. You need a pipeline that reliably syncs offline-created records into operational and analytics stores while preserving correctness under intermittent connectivity.

| Component | Current implementation |
|---|---|
| Mobile capture | React Native app with SQLite local queue and encrypted file storage |
| API layer | Node.js sync API behind API Gateway |
| Ingestion | Direct REST writes to PostgreSQL and S3 |
| Processing | Ad hoc background workers, no durable event stream |
| Storage | PostgreSQL for form metadata, S3 for attachments, Snowflake for analytics |
| Orchestration | Apache Airflow 2.x for nightly reconciliation jobs |

Scale: 120K daily active devices, 1.8M form submissions/day, 3.5M attachments/day, average form payload 40 KB, average attachment 6 MB, reconnect bursts up to 8K requests/sec. Target: operational consistency within 2 minutes of reconnect.
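
To put the scale figures in perspective, the following is a rough back-of-the-envelope sketch rather than a sizing recommendation. It assumes decimal units (1 KB = 1,000 bytes) and one request per form or attachment; neither assumption is stated in the prompt, and the function name `capacity_summary` is purely illustrative.

```python
# Figures taken directly from the scale statement.
DAILY_ACTIVE_DEVICES = 120_000
FORMS_PER_DAY = 1_800_000
ATTACHMENTS_PER_DAY = 3_500_000
PEAK_REQUESTS_PER_SEC = 8_000

# Assumption: decimal units (KB = 10^3 bytes, MB = 10^6 bytes).
AVG_FORM_BYTES = 40 * 1000
AVG_ATTACHMENT_BYTES = 6 * 1000**2

SECONDS_PER_DAY = 86_400


def capacity_summary() -> dict:
    """Derive daily volumes and average rates from the stated figures."""
    form_bytes = FORMS_PER_DAY * AVG_FORM_BYTES
    attachment_bytes = ATTACHMENTS_PER_DAY * AVG_ATTACHMENT_BYTES
    # Assumption: each form and each attachment is a single request.
    avg_rps = (FORMS_PER_DAY + ATTACHMENTS_PER_DAY) / SECONDS_PER_DAY
    return {
        "form_gb_per_day": form_bytes / 1e9,
        "attachment_tb_per_day": attachment_bytes / 1e12,
        "attachments_per_form": ATTACHMENTS_PER_DAY / FORMS_PER_DAY,
        "forms_per_device_per_day": FORMS_PER_DAY / DAILY_ACTIVE_DEVICES,
        "avg_requests_per_sec": avg_rps,
        "burst_to_average_ratio": PEAK_REQUESTS_PER_SEC / avg_rps,
        "avg_attachment_mb_per_sec": attachment_bytes / SECONDS_PER_DAY / 1e6,
    }


if __name__ == "__main__":
    for name, value in capacity_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, forms account for about 72 GB/day while attachments account for about 21 TB/day, and the 8K requests/sec reconnect burst is roughly 130 times the daily average request rate. Attachment bytes and burst request counts are therefore the two figures most likely to drive sizing.
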

How would you design the end-to-end sync pipeline so that offline forms and attachments are ingested, deduplicated, ordered, validated, and made available to both operational systems and downstream analytics, with strong observability and recovery guarantees?