Context
FinPulse, a mid-sized fintech company, runs nightly ETL pipelines on long-lived EC2 instances using Python, Airflow, and PostgreSQL. Deployments are inconsistent across environments, dependency conflicts frequently break jobs, and the platform team wants to standardize execution on Docker and Kubernetes while improving reliability and observability.
You are asked to design a containerized data platform for batch ETL and light streaming workloads. The new system must support reproducible builds, isolated runtime environments, and controlled rollouts for pipeline code.
Scale Requirements
- Batch jobs: 1,200 Airflow task runs/day across 80 DAGs
- Streaming jobs: 15 low-latency consumers processing ~25K events/sec total
- Data volume: 6 TB/day ingested from application databases, S3 drops, and Kafka topics
- Latency targets: batch SLA < 45 minutes per critical DAG; streaming freshness < 2 minutes
- Retention: raw data 180 days in object storage; curated warehouse tables retained indefinitely
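For rough sizing, the bullets above imply the back-of-envelope figures below. This is plain arithmetic on the stated numbers only; no compression, replication, or pricing assumptions are made.

```python
# Back-of-envelope sizing derived only from the scale bullets above.
raw_ingest_tb_per_day = 6
raw_retention_days = 180
raw_footprint_tb = raw_ingest_tb_per_day * raw_retention_days
print(f"Steady-state raw footprint in S3: ~{raw_footprint_tb} TB (~{raw_footprint_tb / 1024:.2f} PB)")

task_runs_per_day = 1_200
dag_count = 80
print(f"Average task runs per DAG per day: {task_runs_per_day / dag_count:.0f}")

total_events_per_sec = 25_000
consumer_count = 15
print(f"Average load per streaming consumer: ~{total_events_per_sec / consumer_count:,.0f} events/sec")
```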
Requirements
- Design a Docker-based packaging strategy for ETL jobs, Airflow workers, and shared libraries.
- Use Kubernetes to schedule and isolate workloads across dev, staging, and prod (see the pod-per-task sketch below).
- Support both scheduled batch pipelines and continuously running stream consumers (see the consumer sketch below).
- Implement CI/CD for image build, vulnerability scanning, versioning, and rollout.
- Ensure idempotent reruns, backfills, and environment-specific configuration management (see the idempotent load sketch below).
- Add data quality checks before warehouse loads and route failed records for reprocessing (see the data quality sketch below).
- Define monitoring for container health, job failures, resource saturation, and SLA breaches.
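As one possible shape for the Docker and Kubernetes requirements, the pod-per-task sketch below runs a single Airflow task as its own pod on EKS using an image tag pinned by CI. It assumes a recent Airflow 2.x with the apache-airflow-providers-cncf-kubernetes package; the DAG name, namespace, ECR image, and module path are hypothetical placeholders, not part of the brief.

```python
# Minimal sketch: one Airflow task executed as a Kubernetes pod on EKS.
# Assumes Airflow 2.x with the cncf.kubernetes provider; names and image are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="orders_daily_load",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=True,                        # lets the scheduler drive backfills
) as dag:
    extract_orders = KubernetesPodOperator(
        task_id="extract_orders",
        name="extract-orders",
        namespace="etl-prod",            # one namespace per environment (dev/staging/prod)
        # Pin the image to an immutable tag (or digest) produced by CI, never :latest,
        # so every rerun executes exactly the code that was reviewed and scanned.
        image="123456789012.dkr.ecr.us-east-1.amazonaws.com/finpulse/etl-orders:1.4.2",
        cmds=["python", "-m", "pipelines.orders.extract"],
        arguments=["--run-date", "{{ ds }}"],   # pass the logical date for idempotent reruns
        env_vars={"ENVIRONMENT": "prod"},
        get_logs=True,
    )
```

Running each task in its own pod keeps Airflow workers thin and lets every job pin its own dependencies inside its image, which addresses the dependency conflicts described in the context.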
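For the idempotent rerun and backfill requirement, one common pattern is a delete-then-insert load keyed by the pipeline's logical date, sketched below. The table names, column names, and DB-API connection (e.g. snowflake-connector-python or psycopg2) are hypothetical; the paramstyle is assumed to be %s.

```python
# Minimal sketch of an idempotent warehouse load keyed by the run's logical date.
# Re-running or backfilling a date overwrites that date's partition instead of duplicating rows.
from datetime import date


def load_orders_partition(conn, run_date: date) -> None:
    with conn.cursor() as cur:
        # Delete-then-insert within one transaction makes a rerun a clean overwrite.
        cur.execute(
            "DELETE FROM analytics.fct_orders WHERE load_date = %s", (run_date,)
        )
        cur.execute(
            """
            INSERT INTO analytics.fct_orders (order_id, amount, load_date)
            SELECT order_id, amount, %s
            FROM staging.orders_raw
            WHERE event_date = %s
            """,
            (run_date, run_date),
        )
    conn.commit()
```

A MERGE into the target table is an equally valid choice; the key point is that the write is keyed by the logical date so reruns and backfills converge to the same result.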
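For the data quality requirement, a minimal gate might split a batch into valid and rejected records and park the rejects under an S3 prefix for later reprocessing. The bucket name, key layout, and validation rules below are illustrative assumptions.

```python
# Minimal sketch of a pre-load data quality gate with reject routing to S3.
import json
from datetime import date

import boto3


def validate(record: dict) -> bool:
    # Example rules only: required keys present and amount is a non-negative number.
    return (
        record.get("order_id") is not None
        and isinstance(record.get("amount"), (int, float))
        and record["amount"] >= 0
    )


def quality_gate(records: list[dict], run_date: date) -> list[dict]:
    valid = [r for r in records if validate(r)]
    rejects = [r for r in records if not validate(r)]

    if rejects:
        s3 = boto3.client("s3")
        s3.put_object(
            Bucket="finpulse-dq-rejects",                      # hypothetical bucket
            Key=f"orders/{run_date.isoformat()}/rejects.jsonl",
            Body="\n".join(json.dumps(r) for r in rejects).encode("utf-8"),
        )

    return valid  # only validated records continue to the warehouse load
```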
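For the continuously running consumers and the monitoring requirement, a long-lived consumer (deployed as a Kubernetes Deployment rather than a scheduled job) can expose Prometheus counters so throughput and failures are scrapeable and alertable against the freshness SLA. The broker address, topic, group id, and metric names are hypothetical; the sketch assumes confluent-kafka and prometheus-client.

```python
# Minimal sketch of a long-running stream consumer exposing Prometheus metrics.
from confluent_kafka import Consumer
from prometheus_client import Counter, start_http_server

EVENTS_PROCESSED = Counter("etl_events_processed_total", "Events successfully processed")
EVENTS_FAILED = Counter("etl_events_failed_total", "Events that raised an error")


def process(payload: bytes) -> None:
    ...  # enrichment / transformation logic left out of the sketch


def run() -> None:
    start_http_server(9108)  # /metrics endpoint scraped by the cluster's Prometheus
    consumer = Consumer({
        "bootstrap.servers": "kafka:9092",   # hypothetical broker address
        "group.id": "payments-enricher",     # hypothetical consumer group
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["payments.events"])

    try:
        while True:  # continuously running Deployment, not a scheduled batch job
            msg = consumer.poll(timeout=1.0)
            if msg is None or msg.error():
                continue
            try:
                process(msg.value())
                EVENTS_PROCESSED.inc()
            except Exception:
                EVENTS_FAILED.inc()          # alert when this rate threatens the freshness SLA
    finally:
        consumer.close()


if __name__ == "__main__":
    run()
```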
Constraints
- Existing stack is AWS-based: EKS, S3, RDS PostgreSQL, Kafka, and Snowflake.
- Team has strong Docker experience but limited Kubernetes operations expertise.
- Incremental platform budget is capped at $18K/month.
- Compliance requires image provenance, secrets management, and audit logs for production deployments.