Context
PayFlow, a B2B payments platform, currently receives charge requests from web and mobile clients through a synchronous API backed by PostgreSQL. When clients retry, network calls time out, or Airflow-triggered replay jobs run, the system occasionally creates duplicate payment records and duplicate downstream settlement events.
You need to design an idempotent API ingestion pattern that guarantees a payment request is processed once per client-supplied idempotency key while still supporting retries, auditability, and downstream ETL into the analytics warehouse.
Scale Requirements
- Traffic: 2,500 API requests/second peak, 400 requests/second average
- Payload size: 1-4 KB JSON per payment request
- Latency target: API response p95 < 250 ms
- Duplicate prevention window: 24 hours per idempotency key
- Storage: 50M payment requests/day, 2 TB/month raw + audit logs
- Downstream freshness: analytics tables updated within 5 minutes
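The figures above can be cross-checked with some quick arithmetic before sizing anything. The sketch below is only a back-of-envelope check: the 30-day month, decimal kilobytes, and one idempotency key per request are assumptions, not part of the stated requirements.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: flat 30-day month

avg_rps = 400                          # stated average traffic
requests_per_day_stated = 50_000_000   # stated storage volume
payload_bytes_min = 1_000              # 1 KB, assuming decimal units
payload_bytes_max = 4_000              # 4 KB, assuming decimal units

# Daily volume implied by the stated average request rate.
requests_per_day_from_avg = avg_rps * SECONDS_PER_DAY

# Average request rate implied by the stated daily volume.
implied_avg_rps = requests_per_day_stated / SECONDS_PER_DAY

# Raw payload volume per month at the stated daily volume.
monthly_requests = requests_per_day_stated * DAYS_PER_MONTH
raw_tb_min = monthly_requests * payload_bytes_min / 1e12
raw_tb_max = monthly_requests * payload_bytes_max / 1e12

# Upper bound on live keys in the 24-hour window, assuming one key per
# request (retries reuse keys, so the real number would be lower).
max_live_keys = requests_per_day_stated

print(f"requests/day from 400 rps average: {requests_per_day_from_avg:,}")
print(f"average rps implied by 50M/day:    {implied_avg_rps:.1f}")
print(f"raw payload TB/month:              {raw_tb_min:.1f} to {raw_tb_max:.1f}")
print(f"max live idempotency keys:         {max_live_keys:,}")
```

Under these assumptions, 400 requests/second works out to about 34.6M requests/day, while 50M requests/day implies roughly 579 requests/second on average, so it is worth confirming which figure to size against. Likewise, 2 TB/month is consistent with 50M requests/day only if payloads sit near the low end of the 1-4 KB range or are compressed.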
Requirements
- Implement an API endpoint that accepts an Idempotency-Key header and safely handles client retries.
- Persist request state so the same key returns the original response instead of creating a second transaction.
- Prevent race conditions when two identical requests arrive concurrently.
- Publish successful payments to the event pipeline for downstream ETL and reconciliation.
- Store raw requests, processing status, and response payloads for audit and replay.
- Add data quality checks to detect duplicate transactions, mismatched payloads for reused keys, and missing downstream events.
- Describe how orchestration would backfill failed warehouse loads without re-charging customers.
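One possible shape for the first three requirements is an idempotency table whose primary key is the client-supplied key, claimed with an atomic insert before any charge is attempted. The sketch below is a minimal illustration, not a prescribed design: in-memory SQLite stands in for PostgreSQL (where the equivalent claim would be `INSERT ... ON CONFLICT DO NOTHING`), and the table name, column names, `handle_charge` function, and the 409/422 status codes are all hypothetical. Expiry of keys after the 24-hour window is not shown.

```python
import hashlib
import json
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE idempotency_keys (
        idem_key        TEXT PRIMARY KEY,  -- value of the Idempotency-Key header
        request_hash    TEXT NOT NULL,     -- fingerprint of the request payload
        status          TEXT NOT NULL,     -- 'in_progress' or 'completed'
        response_status INTEGER,           -- HTTP status of the original response
        response_body   TEXT               -- JSON body of the original response
    )
    """
)

charges_created = []  # stand-in for the real payments table


def fingerprint(payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle_charge(idem_key: str, payload: dict) -> tuple:
    """Return (http_status, body) for a charge request with the given key."""
    req_hash = fingerprint(payload)

    # Atomic claim: the primary key allows only one insert per key, so of two
    # concurrent identical requests only one sees rowcount == 1.
    cur = conn.execute(
        "INSERT OR IGNORE INTO idempotency_keys (idem_key, request_hash, status) "
        "VALUES (?, ?, 'in_progress')",
        (idem_key, req_hash),
    )
    conn.commit()

    if cur.rowcount == 1:
        # This request owns the key: perform the charge, then store the response.
        charge_id = f"ch_{len(charges_created) + 1}"
        charges_created.append(charge_id)
        body = {"charge_id": charge_id, "status": "succeeded"}
        conn.execute(
            "UPDATE idempotency_keys "
            "SET status = 'completed', response_status = ?, response_body = ? "
            "WHERE idem_key = ?",
            (201, json.dumps(body), idem_key),
        )
        conn.commit()
        return 201, body

    # The key already exists: decide between replay, conflict, and mismatch.
    stored_hash, status, resp_status, resp_body = conn.execute(
        "SELECT request_hash, status, response_status, response_body "
        "FROM idempotency_keys WHERE idem_key = ?",
        (idem_key,),
    ).fetchone()

    if stored_hash != req_hash:
        return 422, {"error": "idempotency key reused with a different payload"}
    if status == "in_progress":
        return 409, {"error": "a request with this key is still being processed"}
    return resp_status, json.loads(resp_body)
```

Calling `handle_charge("key-1", {"amount": 100, "currency": "USD"})` twice returns the same status and body both times and leaves a single entry in `charges_created`; reusing `"key-1"` with a different amount returns the 422 mismatch response. The same property is what would let a replay or backfill job resend a request with its original key and get the stored response back instead of a new charge.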
Constraints
- AWS-based stack only; no multi-region active-active requirement
- PCI-sensitive fields must be tokenized; raw card data cannot be stored in logs or Kafka
- Incremental infrastructure budget: $15K/month
- Existing team uses Python, Airflow, PostgreSQL, and Kafka; avoid introducing niche tooling
- Payment records must be retained for 7 years for audit