Context
MetricFlow, a B2B SaaS company, builds executive dashboards in Tableau on top of Snowflake. Today, product and finance metrics are loaded through nightly ELT pipelines orchestrated by Apache Airflow, but dashboard validation is mostly manual and inconsistent, which leads to metric mismatches ahead of launches.
You need to design a production-ready validation pipeline that verifies dashboard data accuracy before a new dashboard or metric goes live. The solution should validate source-to-dashboard consistency, freshness, completeness, and business logic correctness, while fitting into the existing Snowflake + dbt + Airflow stack.
Scale Requirements
- Sources: PostgreSQL OLTP, Stripe exports in S3, Segment event logs in Kafka
- Volume: 250M event rows/day, 40M transactional rows/day, 12 TB warehouse data
- Dashboards: 180 Tableau dashboards, 1,200 metric tiles
- Validation window: pre-release validation must complete in < 20 minutes
- Freshness SLA: dashboard tables must be < 2 hours old
- Historical checks: compare against 90 days of prior metric values
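The numeric limits above can be pinned down in one place so that every check reads the same values. The following is a minimal sketch, not a prescribed design: the names ValidationLimits, is_fresh, within_budget, and baseline_start are hypothetical, and the timestamps are assumed to be timezone-aware UTC values taken from load metadata.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone


@dataclass(frozen=True)
class ValidationLimits:
    """Limits copied from the scale requirements (names are hypothetical)."""
    max_runtime: timedelta = timedelta(minutes=20)  # validation window
    max_staleness: timedelta = timedelta(hours=2)   # freshness SLA
    baseline_days: int = 90                         # historical comparison window


def is_fresh(last_loaded_at: datetime, now: datetime, limits: ValidationLimits) -> bool:
    """True when the table is strictly less than max_staleness old."""
    return now - last_loaded_at < limits.max_staleness


def within_budget(started_at: datetime, finished_at: datetime, limits: ValidationLimits) -> bool:
    """True when a validation run finished inside the allowed window."""
    return finished_at - started_at < limits.max_runtime


def baseline_start(run_date: date, limits: ValidationLimits) -> date:
    """First day of the historical window used for baseline comparisons."""
    return run_date - timedelta(days=limits.baseline_days)


if __name__ == "__main__":
    limits = ValidationLimits()
    now = datetime.now(timezone.utc)
    print(is_fresh(now - timedelta(minutes=45), now, limits))
    print(baseline_start(now.date(), limits))
```

Keeping the limits in a frozen dataclass is one way to make threshold changes reviewable in version control; a dbt variable or an Airflow Variable would serve the same purpose.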
Requirements
- Design a validation pipeline that runs automatically before dashboard publication.
- Validate schema, row-count completeness, null rates, duplicate keys, and referential integrity across raw, staging, and mart layers.
- Reconcile critical business metrics (for example: revenue, active users, conversion) between source systems, dbt models, and final dashboard queries.
- Detect anomalies versus historical baselines and block release when thresholds are exceeded.
- Support idempotent reruns, backfills for historical dashboard versions, and auditable validation results.
- Expose validation status to analysts and trigger alerts for failures.
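The reconciliation, anomaly detection, and release-blocking requirements can be illustrated with a small sketch. Everything here is an assumption made for illustration: the names CheckResult, reconcile, anomaly_check, and release_allowed are hypothetical, the 0.1% relative tolerance and the z-score limit of 3.0 are placeholder thresholds, and a z-score is only one of several possible baseline methods. The metric values are assumed to have been fetched already (for example from Snowflake queries); no warehouse calls are shown.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Iterable, Sequence


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str


def reconcile(name: str, source_value: float, dashboard_value: float,
              rel_tolerance: float = 0.001) -> CheckResult:
    """Compare a source-side metric with the dashboard query result.

    rel_tolerance is an assumed placeholder (0.1%), not a value from the spec.
    """
    denom = max(abs(source_value), abs(dashboard_value))
    rel_diff = 0.0 if denom == 0 else abs(source_value - dashboard_value) / denom
    return CheckResult(f"reconcile:{name}", rel_diff <= rel_tolerance,
                       f"relative diff {rel_diff:.6f}")


def anomaly_check(name: str, history: Sequence[float], current: float,
                  max_z: float = 3.0) -> CheckResult:
    """Flag the current value when it is far from the historical baseline.

    Fails closed when there is too little history (an assumed policy).
    """
    if len(history) < 2:
        return CheckResult(f"anomaly:{name}", False, "not enough history")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as an anomaly.
        return CheckResult(f"anomaly:{name}", current == mu, "flat baseline")
    z = abs(current - mu) / sigma
    return CheckResult(f"anomaly:{name}", z <= max_z, f"z-score {z:.2f}")


def release_allowed(results: Iterable[CheckResult]) -> bool:
    """Block the release as soon as any check has failed."""
    return all(r.passed for r in results)


if __name__ == "__main__":
    daily_revenue = [100.0, 102.0, 98.0, 101.0, 99.0]  # made-up sample data
    results = [
        reconcile("revenue", 1000.0, 1000.5),
        anomaly_check("revenue", daily_revenue, 101.0),
    ]
    for r in results:
        print(r)
    print("release allowed:", release_allowed(results))
```

Returning a CheckResult for every check, pass or fail, keeps the output suitable for an audit table and for analyst-facing status pages, since each row records what was compared and why it passed or failed.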
Constraints
- The existing stack must remain in place: Snowflake, dbt Core, Airflow 2.x, Tableau.
- Team size is 3 data engineers and 2 analytics engineers.
- Budget allows only modest incremental compute; avoid always-on clusters.
- Finance dashboards are SOX-relevant, so validation logs must be retained for 1 year.
- No manual sign-off should be required for low-risk dashboard updates if all checks pass.
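The sign-off and retention constraints can be expressed as a small decision function. This is a sketch under stated assumptions: the constraints do not define "low-risk", so the code assumes that SOX-relevant dashboards are the ones that still need a human sign-off. The names DashboardChange, publication_decision, and log_expired are hypothetical, and the 366-day retention period is an assumed reading of "1 year" that also covers leap years.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# ASSUMPTION: "1 year" is read as 366 days so a leap year is fully covered.
LOG_RETENTION = timedelta(days=366)


@dataclass(frozen=True)
class DashboardChange:
    dashboard: str
    sox_relevant: bool
    all_checks_passed: bool


def publication_decision(change: DashboardChange) -> str:
    """Return 'blocked', 'needs_sign_off', or 'auto_publish'."""
    if not change.all_checks_passed:
        return "blocked"
    if change.sox_relevant:
        # ASSUMPTION: SOX-relevant dashboards are not treated as low-risk.
        return "needs_sign_off"
    return "auto_publish"


def log_expired(logged_on: date, today: date) -> bool:
    """True once a validation log is older than the retention period."""
    return today - logged_on > LOG_RETENTION


if __name__ == "__main__":
    change = DashboardChange("product_overview", sox_relevant=False, all_checks_passed=True)
    print(publication_decision(change))
```

A different risk classification (for example by metric owner or by the size of the change) would only alter the middle branch; the failed-check branch stays first so that no risk tier can bypass a failing validation.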