Coordinate Cross-Team Pipeline Dependencies

Context

AcmeHealth, a B2B healthcare analytics company, runs nightly ETL pipelines that ingest client SFTP files, validate them, transform them into Snowflake marts, and publish QA-approved datasets to client-facing dashboards. Today, dependencies across engineering, QA, and client teams are tracked manually in spreadsheets and Slack, causing missed handoffs, delayed releases, and unclear ownership when upstream files or validation sign-offs are late.

You need to design a dependency-aware pipeline orchestration process that makes technical and human dependencies explicit, blocks downstream execution when prerequisites are unmet, and provides clear visibility into status, SLA risk, and failure recovery.

Scale Requirements

Clients: 180 enterprise clients
Inbound feeds: 1,200 daily files across SFTP and API pulls
Daily volume: 2.5 TB raw CSV/JSON data
Pipeline runs: ~8,000 Airflow task instances/day
Latency target: client dashboards updated by 6:00 AM in each client's local time zone
QA throughput: 300 validation suites/night across staging and production
Retention: 1 year raw, 3 years curated warehouse tables
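
Because the latency target is defined per client time zone, the orchestrator needs one absolute deadline per client per night. The sketch below shows one way to derive that deadline and the remaining headroom. The function names `sla_deadline_utc` and `slack_minutes` are hypothetical, and the fixed UTC offset is a simplifying assumption; a real implementation would likely use IANA time zone names to handle daylight saving.

```python
from datetime import date, datetime, time, timedelta, timezone


def sla_deadline_utc(run_date, utc_offset_hours, local_deadline=time(6, 0)):
    """Return the UTC instant of a client's local dashboard deadline.

    utc_offset_hours is an assumed fixed offset (e.g. -5), not a full
    time zone, so daylight saving is not handled here.
    """
    client_tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.combine(run_date, local_deadline, tzinfo=client_tz)
    return local.astimezone(timezone.utc)


def slack_minutes(now_utc, deadline_utc, est_remaining_minutes):
    """Minutes of headroom before the deadline; negative predicts a breach."""
    minutes_left = (deadline_utc - now_utc).total_seconds() / 60
    return minutes_left - est_remaining_minutes


# Example: a client at UTC-5 with an estimated 60 minutes of work remaining.
deadline = sla_deadline_utc(date(2024, 1, 15), -5)
now = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
headroom = slack_minutes(now, deadline, est_remaining_minutes=60)
```

Sorting clients by this headroom value is one simple way to rank SLA breach risk on a status dashboard.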

Requirements

Model dependencies across three groups: engineering-owned ingestion/transforms, QA-owned validation/sign-off, and client-owned file delivery/SLA commitments.
Design orchestration that supports both automated dependencies (task completion, data quality checks) and manual gates (QA approval, client exception acknowledgment).
Prevent downstream loads when upstream files are missing, schema checks fail, or QA approval is incomplete.
Support idempotent reruns, backfills for missed client deliveries, and per-client dependency overrides.
Provide status dashboards showing blocked tasks, dependency owners, expected unblock times, and SLA breach risk.
Define monitoring, alerting, and escalation paths for late files, failed validations, and stuck approvals.
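
The requirements above can be illustrated with a small, orchestrator-independent model of dependencies and gates. This is a minimal sketch, not a prescribed design: the names `Owner`, `Status`, `Dependency`, `Task`, `blockers`, and `can_run` are all hypothetical, and in practice the same logic would sit behind Airflow sensors or a metadata table.

```python
from dataclasses import dataclass, field
from enum import Enum


class Owner(Enum):
    ENGINEERING = "engineering"
    QA = "qa"
    CLIENT = "client"


class Status(Enum):
    PENDING = "pending"
    SATISFIED = "satisfied"
    FAILED = "failed"


@dataclass
class Dependency:
    name: str
    owner: Owner
    manual: bool  # True for human gates such as QA approval
    status: Status = Status.PENDING


@dataclass
class Task:
    name: str
    requires: list = field(default_factory=list)  # dependency names


def blockers(task, deps, overrides=frozenset()):
    """Return the unmet dependencies of a task, skipping overridden ones."""
    return [
        deps[name]
        for name in task.requires
        if name not in overrides and deps[name].status is not Status.SATISFIED
    ]


def can_run(task, deps, overrides=frozenset()):
    """A task may run only when no unmet dependency remains."""
    return not blockers(task, deps, overrides)


# Example: publishing is blocked until QA signs off.
deps = {
    "file_arrived": Dependency("file_arrived", Owner.CLIENT, False, Status.SATISFIED),
    "schema_check": Dependency("schema_check", Owner.ENGINEERING, False, Status.SATISFIED),
    "qa_signoff": Dependency("qa_signoff", Owner.QA, True),
}
publish = Task("publish_dashboard", ["file_arrived", "schema_check", "qa_signoff"])
blocking_owners = [d.owner for d in blockers(publish, deps)]
```

Here `blockers` doubles as the data source for a status view (which dependency is unmet and who owns it), and the `overrides` argument stands in for a per-client dependency override.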

Constraints

Existing stack is AWS + Snowflake; avoid introducing more than one major new platform.
Team has 3 data engineers, 2 QA analysts, and limited on-call coverage overnight.
Must support HIPAA-aligned auditability for approvals and data release events.
Incremental budget cap: $15K/month.
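
One way to approach the auditability constraint is a tamper-evident, append-only record of approval and release events. The sketch below hash-chains each entry to the previous one, so any later modification is detectable. It is an illustration under assumptions, not a statement of what HIPAA requires: `append_event` and `verify_chain` are hypothetical names, and the in-memory list stands in for a durable store.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(log, actor, action, target, timestamp):
    """Append an audit entry that is chained to the previous entry's hash."""
    body = {
        "actor": actor,
        "action": action,
        "target": target,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True only if every entry is unmodified and correctly linked."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Example with made-up actors and targets.
audit_log = []
append_event(audit_log, "qa_analyst_1", "approve", "client_042/2024-01-15", "2024-01-15T08:05:00Z")
append_event(audit_log, "airflow", "release", "client_042/2024-01-15", "2024-01-15T08:06:00Z")
```

Calling `verify_chain(audit_log)` after any edit to an earlier entry returns `False`, which is the property an auditor would rely on.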
