Context
Swiss Re is migrating a core underwriting and claims reporting workload from an on-prem Oracle estate to the Swiss Re analytics platform on Snowflake. Today, downstream finance, actuarial, and risk teams depend on nightly ETL jobs, but the current migration approach is manual, hard to validate, and creates unacceptable cutover risk.
Design a database migration pipeline that supports an initial historical load plus ongoing change data capture (CDC), with controlled cutover and rollback. Assume the source system contains policy, claim, premium, and reference tables with mixed OLTP and reporting usage.
Scale Requirements
- Source size: 120 TB across 4,500 tables
- Largest tables: 8-12 billion rows each
- Change volume: 35K row changes/second peak, 8K average
- Migration window: historical backfill completed in 6 weeks
- Freshness target: target tables in Snowflake within 5 minutes of source commit
- Validation SLA: row-count and checksum validation completed within 30 minutes per migrated batch
- Availability: cutover with <15 minutes read-only window
Requirements
- Design the end-to-end pipeline for bulk backfill, CDC ingestion, transformation, and publish into Snowflake.
- Ensure idempotent reprocessing for failed batches, duplicate CDC events, and replay during rollback testing.
- Define how schema evolution is handled for added columns, datatype mismatches, and table renames.
- Propose data quality controls for completeness, reconciliation, referential integrity, and business-critical field validation.
- Describe orchestration for thousands of table-level dependencies, migration waves, and backfills.
- Explain cutover strategy, dual-run period, and rollback plan if downstream reconciliation fails.
- Include monitoring, alerting, and operational ownership for migration progress.
Constraints
- Must use Snowflake as the target analytical store on the Swiss Re platform.
- Source Oracle systems cannot sustain more than 10% additional load during business hours.
- Regulatory datasets require auditability, lineage, and reproducible re-runs.
- Budget does not allow parallel full-copy environments for all domains.
- PII and financial data must remain encrypted in transit and at rest, with access controlled by least privilege.