Context
Meta’s Ads Insights platform ingests campaign performance events from multiple producers, including Ads Delivery, Billing, and conversion reporting systems. Today, downstream Hive and Presto consumers often discover breaking schema changes only after scheduled Airflow backfills or near-real-time Flink jobs fail, causing dashboard delays and inconsistent metrics.
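The class of break described above can often be caught before publication with a field-level backward-compatibility check run by the producer. The sketch below is a hypothetical simplification (rule set and field names are assumptions, not Meta's actual tooling):

```python
# Illustrative backward-compatibility check a producer could run before
# shipping a schema change. "Backward compatible" here means existing
# readers keep working: no removed fields, no retyped fields, no new
# required fields without defaults.

def is_backward_compatible(old_schema: dict, new_schema: dict) -> list[str]:
    """Return a list of violations; an empty list means the change is safe."""
    violations = []
    for name, old_field in old_schema.items():
        new_field = new_schema.get(name)
        if new_field is None:
            violations.append(f"removed field: {name}")
        elif new_field["type"] != old_field["type"]:
            violations.append(
                f"retyped field: {name} "
                f"({old_field['type']} -> {new_field['type']})")
    for name, new_field in new_schema.items():
        if name not in old_schema and new_field.get("required", False):
            violations.append(f"new required field without default: {name}")
    return violations

old = {"campaign_id": {"type": "long"}, "spend": {"type": "double"}}
new = {"campaign_id": {"type": "long"}, "spend": {"type": "string"},
       "region": {"type": "string", "required": True}}
print(is_backward_compatible(old, new))
```

Retyping `spend` and adding a required `region` both trip the check, so this change would be blocked before any Flink or Hive consumer sees it.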
You are asked to design a data contract framework for shared pipelines so producer teams can evolve schemas safely while consumer teams retain predictable SLAs. The goal is to define what a data contract is, who owns it, and how contract enforcement should work across batch and streaming systems.
Scale Requirements
- Producers: 40+ upstream datasets across Ads and Measurement
- Volume: 3B events/day, peak 250K events/sec on streaming topics
- Consumers: 150+ downstream tables, ML features, and reporting jobs
- Latency: streaming validation must complete within 2 minutes of event arrival; batch contract checks must finish before each hourly publication
- Retention: 180 days raw, 3 years curated aggregates
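For sizing intuition, the listed volumes imply roughly a 7x peak-to-average burst ratio that any streaming validation layer must absorb (assuming a uniform 86,400-second day and "3B" read as 3e9 events):

```python
# Back-of-envelope rates derived from the scale requirements above.
events_per_day = 3_000_000_000
avg_rate = events_per_day / 86_400     # average events/sec over a day
peak_rate = 250_000                    # stated streaming peak
burst_ratio = peak_rate / avg_rate     # how much headroom validation needs
print(round(avg_rate), round(burst_ratio, 1))  # -> 34722 7.2
```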
Requirements
- Define the contents of a data contract for Meta pipelines: schema, semantics, freshness SLA, quality thresholds, ownership, and change policy.
- Specify ownership boundaries between producer teams, central data platform, and downstream consumers.
- Design enforcement for both streaming ingestion (Kafka/Flink) and batch publication (Hive/Spark).
- Support backward-compatible schema evolution, versioning, and deprecation windows.
- Prevent bad data from reaching curated datasets using automated validation, quarantine, and rollback paths.
- Expose contract status, violations, and lineage in a way that on-call engineers and analysts can act on quickly.
- Describe how orchestration should block downstream publishes when contract checks fail, while still allowing controlled overrides.
Constraints
- Prefer Meta-adjacent infrastructure: Kafka, Apache Flink, Spark, Hive, Presto, Airflow-like orchestration, and internal metadata services.
- Some producers are low-maturity teams and cannot manually coordinate every schema change.
- Contract checks must add less than 5% compute overhead to existing pipelines.
- PII fields require explicit classification and must remain compatible with data-deletion workflows.
Your answer should explain the contract model, ownership model, enforcement architecture, and operational playbook.