Build Reliable Dashboard Refresh Pipeline

Scenario

You are responsible for the data pipeline that powers customer-facing operational dashboards for a utility analytics product. The current workflow refreshes visualizations after upstream meter, outage, and billing data lands, but users regularly see mismatched totals between dashboards and exported reports after reruns or late-arriving corrections. An internal audit has flagged the process as non-repeatable because the same reporting date can produce different results depending on job timing and partial failures. You need to redesign the workflow so dashboard updates are reliable, repeatable, and easy to recover without manual intervention.

Current State

Component	Status / Technology
Source systems	AMI meter reads, outage events, billing extracts, asset reference data
Landing zone	Azure Data Lake Storage Gen2, mixed CSV/JSON/Parquet drops
Transformations	Azure Databricks notebooks with ad hoc merge logic
Warehouse	Snowflake serving curated reporting tables
Semantic layer	Itron Analytics dashboards reading Snowflake views
Orchestration	Apache Airflow 2.x with daily and hourly DAGs

Scale: ~1.2 TB/day across 4 source domains, 180M meter-read records/day, hourly outage updates, daily billing corrections, 15-minute dashboard freshness target, 2 years of retained history, and recurring backfills for corrected utility data.

Question

How would you redesign this pipeline so each dashboard refresh is deterministic, idempotent, and auditable even when upstream files arrive late, are corrected, or need to be replayed? Explain the end-to-end workflow you would build from ingestion through transformation, validation, publication, and rollback for the visualization layer.

Scenario

Current State

Component	Status / Technology
Source systems	AMI meter reads, outage events, billing extracts, asset reference data
Landing zone	Azure Data Lake Storage Gen2, mixed CSV/JSON/Parquet drops
Transformations	Azure Databricks notebooks with ad hoc merge logic
Warehouse	Snowflake serving curated reporting tables
Semantic layer	Itron Analytics dashboards reading Snowflake views
Orchestration	Apache Airflow 2.x with daily and hourly DAGs

Scale: ~1.2 TB/day across 4 source domains, 180M meter-read records/day, hourly outage updates, daily billing corrections, 15-minute dashboard freshness target, 2 years of retained history, and recurring backfills for corrected utility data.

Question

Scenario

Current State

Component	Status / Technology
Source systems	AMI meter reads, outage events, billing extracts, asset reference data
Landing zone	Azure Data Lake Storage Gen2, mixed CSV/JSON/Parquet drops
Transformations	Azure Databricks notebooks with ad hoc merge logic
Warehouse	Snowflake serving curated reporting tables
Semantic layer	Itron Analytics dashboards reading Snowflake views
Orchestration	Apache Airflow 2.x with daily and hourly DAGs

Scale: ~1.2 TB/day across 4 source domains, 180M meter-read records/day, hourly outage updates, daily billing corrections, 15-minute dashboard freshness target, 2 years of retained history, and recurring backfills for corrected utility data.

Question

Scenario

Current State

Component	Status / Technology
Source systems	AMI meter reads, outage events, billing extracts, asset reference data
Landing zone	Azure Data Lake Storage Gen2, mixed CSV/JSON/Parquet drops
Transformations	Azure Databricks notebooks with ad hoc merge logic
Warehouse	Snowflake serving curated reporting tables
Semantic layer	Itron Analytics dashboards reading Snowflake views
Orchestration	Apache Airflow 2.x with daily and hourly DAGs

Scale: ~1.2 TB/day across 4 source domains, 180M meter-read records/day, hourly outage updates, daily billing corrections, 15-minute dashboard freshness target, 2 years of retained history, and recurring backfills for corrected utility data.

Interview Guides

Scenario

Current State

Question

Build Reliable Dashboard Refresh Pipeline

Scenario

Current State

Question

Your Answer

Build Reliable Dashboard Refresh Pipeline

Scenario

Current State

Question

Build Reliable Dashboard Refresh Pipeline

Scenario

Current State

Question

Your Answer