Context
LedgerFlow, a B2B payments platform, runs its operational workloads on Amazon RDS for PostgreSQL and serves internal analytics from nightly batch exports. The company is migrating to Snowflake for analytics and selected domain services, but the current batch-only export process causes stale data, inconsistent schemas, and risky cutovers.
You need to design a migration pipeline that performs an initial historical load plus ongoing change data capture (CDC), keeps source and target synchronized during the migration window, and supports a controlled cutover with rollback.
Scale Requirements
- Source database: PostgreSQL 13 on RDS, ~12 TB total data, 4.5B rows across 180 tables
- Change volume: 25K row changes/sec at peak, 6K/sec on average
- Largest table: payments with 1.2B rows, ~4 TB
- Latency target: CDC data visible in Snowflake within 3 minutes
- Migration window: complete the initial load within 7 days without increasing OLTP p95 latency by more than 10%
- Retention: raw CDC logs for 30 days, curated warehouse tables indefinitely
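
For sizing, the 7-day window implies roughly the following sustained extract rate; a back-of-the-envelope calculation from the figures above (decimal units assumed):

```python
# Sustained extract rate implied by loading ~12 TB within the 7-day window.
total_bytes = 12e12                  # ~12 TB of source data (decimal units assumed)
window_seconds = 7 * 24 * 3600       # 7-day migration window

rate_mb_per_s = total_bytes / window_seconds / 1e6
print(f"Average sustained extract rate: ~{rate_mb_per_s:.0f} MB/s")   # ~20 MB/s
```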
Requirements
- Design an initial bulk-load strategy for large tables and a CDC pipeline for ongoing updates (see the chunked-extract sketch after this list).
- Ensure idempotent processing so retries do not create duplicates or inconsistent state (see the MERGE sketch below).
- Handle schema evolution, late-arriving changes, deletes, and table backfills.
- Provide table-level and row-level reconciliation between PostgreSQL and Snowflake before cutover (see the reconciliation sketch below).
- Orchestrate dependencies across extraction, load, validation, and promotion steps (see the DAG sketch below).
- Define a rollback plan if data quality checks fail after cutover.
- Support selective reprocessing for a single table or date range (see the reprocessing sketch below).
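
For the initial load, one option is a key-range (keyset-paginated) extract to S3 so that no single query scans the full table or holds a long-running lock. A minimal sketch, assuming psycopg2 and boto3 are available; the bucket, host, chunk size, and column names are hypothetical, and the code assumes the key is the first selected column:

```python
import csv
import io

import boto3
import psycopg2

BUCKET = "ledgerflow-migration-raw"   # hypothetical S3 landing bucket
CHUNK_ROWS = 500_000                  # rows per exported chunk

def export_table(table: str = "payments", key_col: str = "payment_id") -> None:
    """Page through the table by primary key and land each chunk in S3 as CSV."""
    s3 = boto3.client("s3")
    conn = psycopg2.connect(
        host="ledgerflow-rds.example.internal",   # hypothetical; credentials omitted
        dbname="ledgerflow",
        user="migration_svc",
    )
    last_key, chunk_no = 0, 0
    while True:
        with conn.cursor() as cur:
            # Keyset pagination: each query is an index range scan, not a full scan.
            cur.execute(
                f"SELECT * FROM {table} WHERE {key_col} > %s ORDER BY {key_col} LIMIT %s",
                (last_key, CHUNK_ROWS),
            )
            rows = cur.fetchall()
        if not rows:
            break
        buf = io.StringIO()
        csv.writer(buf).writerows(rows)
        s3.put_object(
            Bucket=BUCKET,
            Key=f"initial_load/{table}/chunk_{chunk_no:06d}.csv",
            Body=buf.getvalue(),
        )
        last_key = rows[-1][0]        # assumes key_col is the first column in SELECT *
        chunk_no += 1
    conn.close()
```

Ongoing CDC would typically come from PostgreSQL logical replication (a replication slot consumed by a tool such as AWS DMS or Debezium), landing change events in S3 and loading them into a Snowflake staging table via Snowpipe or COPY INTO.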
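Idempotency can come from keying applies on the source primary key plus a monotonically increasing change position (for example, the log sequence number), so reapplying the same batch is a no-op. A sketch of a Snowflake MERGE driven from Python, assuming a staging table STAGING.PAYMENTS_CDC with hypothetical PAYMENT_ID, SOURCE_LSN, OP, AMOUNT, and STATUS columns:

```python
import snowflake.connector

# Deduplicate to the latest change per key, then apply inserts/updates/deletes.
# Comparing SOURCE_LSN makes re-running the same batch a no-op (idempotent).
MERGE_PAYMENTS = """
MERGE INTO ANALYTICS.PAYMENTS AS tgt
USING (
    SELECT *
    FROM STAGING.PAYMENTS_CDC
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY PAYMENT_ID ORDER BY SOURCE_LSN DESC
    ) = 1
) AS src
ON tgt.PAYMENT_ID = src.PAYMENT_ID
WHEN MATCHED AND src.OP = 'D' THEN DELETE
WHEN MATCHED AND src.SOURCE_LSN > tgt.SOURCE_LSN THEN UPDATE SET
    tgt.AMOUNT = src.AMOUNT,
    tgt.STATUS = src.STATUS,
    tgt.SOURCE_LSN = src.SOURCE_LSN
WHEN NOT MATCHED AND src.OP <> 'D' THEN INSERT
    (PAYMENT_ID, AMOUNT, STATUS, SOURCE_LSN)
    VALUES (src.PAYMENT_ID, src.AMOUNT, src.STATUS, src.SOURCE_LSN)
"""

def apply_cdc_batch() -> None:
    conn = snowflake.connector.connect(
        account="ledgerflow", user="CDC_SVC", password="***",   # hypothetical credentials
        warehouse="MIGRATION_WH", database="LEDGERFLOW",
    )
    try:
        conn.cursor().execute(MERGE_PAYMENTS)
    finally:
        conn.close()
```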
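Table-level reconciliation can start with row counts and extend to row-level checks. A minimal count-comparison sketch; connection details and the table list are hypothetical:

```python
import psycopg2
import snowflake.connector

TABLES = ["payments", "invoices", "customers"]   # illustrative subset of the 180 tables

def reconcile_counts() -> list[tuple[str, int, int]]:
    """Return (table, postgres_count, snowflake_count) for every mismatched table."""
    pg = psycopg2.connect(host="ledgerflow-rds.example.internal", dbname="ledgerflow")
    sf = snowflake.connector.connect(account="ledgerflow", user="RECON_SVC", password="***")
    mismatches = []
    try:
        for table in TABLES:
            with pg.cursor() as cur:
                cur.execute(f"SELECT count(*) FROM {table}")
                src_count = cur.fetchone()[0]
            tgt_count = sf.cursor().execute(
                f"SELECT count(*) FROM ANALYTICS.{table.upper()}"
            ).fetchone()[0]
            if src_count != tgt_count:
                mismatches.append((table, src_count, tgt_count))
    finally:
        pg.close()
        sf.close()
    return mismatches
```

Counts catch missing or duplicated rows; for row-level confidence, the same loop can compare hashes (for example, MD5 over a stable serialization of each row) per primary-key range and flag any divergent range for selective reprocessing.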
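Since Airflow is already in place, dependencies can be expressed as a per-table DAG of extract, load, validate, and promote tasks. A sketch assuming a recent Airflow 2.x release; the DAG id and task callables are hypothetical placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_to_s3(**context): ...        # unload a table (or key range) to S3
def load_to_snowflake(**context): ...    # COPY the staged files into Snowflake
def validate_counts(**context): ...      # run reconciliation checks; raise on mismatch
def promote_table(**context): ...        # swap the validated table into the curated schema

with DAG(
    dag_id="payments_migration",
    start_date=datetime(2024, 1, 1),
    schedule=None,                       # triggered on demand during the migration window
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)
    validate = PythonOperator(task_id="validate_counts", python_callable=validate_counts)
    promote = PythonOperator(task_id="promote_table", python_callable=promote_table)

    # A failed validation blocks promotion, so unverified data never reaches consumers.
    extract >> load >> validate >> promote
```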
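Selective reprocessing can reuse the same DAG by reading the target table and date range from the run configuration. A sketch of a task callable, assuming the hypothetical dag_run.conf keys shown:

```python
def reprocess_scope(**context) -> dict:
    """Resolve which table and date range this run should rebuild.

    Triggered ad hoc, e.g.:
      airflow dags trigger payments_migration \
        --conf '{"table": "payments", "start_date": "2024-01-01", "end_date": "2024-01-07"}'
    """
    conf = context["dag_run"].conf or {}
    return {
        "table": conf.get("table", "payments"),
        "start_date": conf.get("start_date"),   # None means the full history
        "end_date": conf.get("end_date"),
    }
```

Downstream extract and load tasks can then scope their queries and COPY statements to that table and date range, so a single bad partition can be rebuilt without rerunning the whole migration.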
Constraints
- AWS is the required cloud environment; existing tooling includes Airflow and S3.
- Budget for incremental migration infrastructure is $30K/month.
- Source database is customer-facing; long-running locks and full-table scans during business hours are not allowed.
- Compliance: SOC 2 and PCI scope reduction; sensitive columns must be encrypted or tokenized before broad analytics access.