Context
BoltCart, a large e-commerce marketplace, currently updates inventory through batch ETL jobs and periodic database syncs. That approach works for normal traffic, but during flash sales it causes stale stock counts, overselling, and delayed warehouse visibility.
You need to design a real-time data pipeline that ingests inventory mutations from order placement, payment confirmation, cancellations, returns, warehouse scans, and supplier restocks. The system must keep operational inventory accurate for checkout decisions while also publishing analytics-ready inventory state for downstream reporting and reconciliation.
Scale Requirements
- Throughput: 150K inventory events/second at peak during flash sales; 20K events/second sustained average
- Entities: 12M SKUs across 40 warehouses and 18 regional storefronts
- Event size: 1-2 KB JSON/Avro messages
- Latency target: P99 inventory update propagation to serving store in < 800 ms
- Analytics freshness: inventory fact tables available in the data warehouse in < 3 minutes
- Retention: raw immutable event log for 180 days; curated data warehouse history for 2 years
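To make the scale figures concrete, the following back-of-envelope sketch derives peak ingress bandwidth and the raw event log volume implied by the numbers above. It assumes 1 KB = 1,000 bytes, no compression, and no replication; the function names are illustrative and not part of the problem statement.

```python
# Back-of-envelope sizing from the stated scale requirements.
# Assumptions (not in the brief): 1 KB = 1,000 bytes, uncompressed, unreplicated.

PEAK_EVENTS_PER_SEC = 150_000   # flash-sale peak
AVG_EVENTS_PER_SEC = 20_000     # sustained average
EVENT_BYTES_MIN = 1_000         # 1 KB lower bound
EVENT_BYTES_MAX = 2_000         # 2 KB upper bound
RAW_RETENTION_DAYS = 180
SECONDS_PER_DAY = 86_400


def peak_ingress_mb_per_sec(event_bytes: int) -> float:
    """Ingress bandwidth at peak, in MB/s, for a given event size."""
    return PEAK_EVENTS_PER_SEC * event_bytes / 1e6


def raw_log_tb(event_bytes: int) -> float:
    """Raw event log size over the retention window, in TB, at the average rate."""
    total_bytes = AVG_EVENTS_PER_SEC * event_bytes * SECONDS_PER_DAY * RAW_RETENTION_DAYS
    return total_bytes / 1e12


if __name__ == "__main__":
    for size in (EVENT_BYTES_MIN, EVENT_BYTES_MAX):
        print(f"{size} B/event: peak {peak_ingress_mb_per_sec(size):.0f} MB/s, "
              f"raw log {raw_log_tb(size):.2f} TB over {RAW_RETENTION_DAYS} days")
```

Under these assumptions, peak ingress is roughly 150 to 300 MB/s and the 180-day raw log is roughly 311 to 622 TB before compression. Flash-sale bursts above the 20K average would add to the storage figure, so treat it as a lower-bound estimate.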
Requirements
- Design an ingestion layer for inventory-affecting events from checkout, OMS, WMS, ERP, and returns systems.
- Guarantee idempotent processing and prevent double-decrements caused by retries or duplicate messages.
- Support per-SKU, per-warehouse ordering where required, while still scaling horizontally.
- Maintain a low-latency serving view for available-to-promise inventory used by checkout and product pages.
- Build an analytical pipeline for reconciliation, stockout analysis, and flash-sale postmortems.
- Define monitoring, alerting, replay, and backfill strategies for late or failed events.
- Explain how you would handle partial outages without overselling inventory.
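The idempotency and ordering requirements can be illustrated with a minimal in-memory sketch. It is one possible approach, not a prescribed answer: events are routed by a hash of (SKU, warehouse) so that all mutations for one pair land on the same partition, and each event carries a unique ID that is checked before the stock delta is applied. The names `InventoryEvent`, `partition_for`, `InventoryLedger`, and the partition count of 256 are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

NUM_PARTITIONS = 256  # hypothetical partition count, chosen for illustration only


@dataclass(frozen=True)
class InventoryEvent:
    event_id: str      # unique per logical mutation; retries reuse the same ID
    sku: str
    warehouse_id: str
    delta: int         # negative for decrements (orders), positive for restocks/returns


def partition_for(sku: str, warehouse_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable partition for a (SKU, warehouse) pair, preserving per-pair ordering."""
    key = f"{sku}|{warehouse_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


@dataclass
class InventoryLedger:
    """In-memory stand-in for the serving store plus a processed-event index."""
    on_hand: dict = field(default_factory=dict)
    applied_event_ids: set = field(default_factory=set)

    def apply(self, event: InventoryEvent) -> bool:
        """Apply the event once; return False if it was already applied."""
        if event.event_id in self.applied_event_ids:
            return False  # duplicate delivery or retry: skip to avoid a double-decrement
        key = (event.sku, event.warehouse_id)
        self.on_hand[key] = self.on_hand.get(key, 0) + event.delta
        self.applied_event_ids.add(event.event_id)
        return True
```

In a durable implementation, the quantity update and the record of the applied event ID would need to be committed atomically (for example, in a single database transaction); otherwise a crash between the two writes could reintroduce duplicates or lose an update.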
Constraints
- Primary cloud is AWS; existing stack includes Aurora PostgreSQL, S3, and Airflow.
- Incremental budget is capped at $40K/month.
- Payment and checkout systems cannot tolerate more than 50 ms additional synchronous latency.
- Auditability is required for every stock mutation, and PII must not enter the streaming topics.
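One way to approach the PII constraint is an allowlist applied at the producer, so that only explicitly approved fields are ever published. The sketch below assumes hypothetical field names (`event_id`, `sku`, `customer_email`, and so on); the actual event schema is not given in the brief.

```python
# Allowlist projection applied before publishing to a streaming topic.
# All field names here are hypothetical; the real schema is not specified.

ALLOWED_FIELDS = frozenset({
    "event_id", "event_type", "sku", "warehouse_id",
    "delta", "occurred_at", "source_system",
})
REQUIRED_FIELDS = frozenset({"event_id", "sku", "warehouse_id", "delta"})


def to_stream_payload(raw: dict) -> dict:
    """Return a copy of raw containing only allowlisted fields.

    Unknown fields are dropped rather than passed through, so a new
    upstream field cannot leak PII by default.
    """
    missing = REQUIRED_FIELDS - raw.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return {name: raw[name] for name in ALLOWED_FIELDS if name in raw}
```

An allowlist fails closed: a field added upstream stays out of the topic until someone approves it, whereas a denylist would let it through until someone notices.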