You're setting up observability around data refreshes and data health so issues are caught quickly instead of being discovered in dashboards or reports. You want a pipeline that can detect both operational failures and unusual metric behavior, then route alerts to the right people.
How would you design an alerting pipeline for refresh failures and data anomalies?
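One possible starting point, sketched in Python under stated assumptions: the pipeline has two kinds of checks (an operational check for failed or stale refreshes, and a statistical check for unusual metric values) plus a routing step that maps each alert kind to an owner. Everything named here (`Alert`, `ROUTES`, `check_refresh`, `check_anomaly`, `route`, the channel names, the thresholds) is hypothetical and only for illustration. The z-score test is one simple choice of anomaly detector, not the only reasonable one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, stdev


@dataclass
class Alert:
    kind: str      # "refresh_failure" or "anomaly"
    dataset: str
    message: str


# Hypothetical routing table: alert kind -> destination channel.
ROUTES = {"refresh_failure": "data-eng-oncall", "anomaly": "analytics-owners"}


def check_refresh(dataset, status, last_success, now, max_age):
    """Operational check: flag a failed run or a stale successful run."""
    if status != "success":
        return Alert("refresh_failure", dataset, f"last run status: {status}")
    if now - last_success > max_age:
        return Alert("refresh_failure", dataset, "data is stale")
    return None


def check_anomaly(dataset, history, latest, z_threshold=3.0):
    """Data check: flag a value far from the trailing mean (z-score)."""
    if len(history) < 2:
        return None  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        # Flat history: any change at all is treated as unusual.
        if latest == mu:
            return None
        return Alert("anomaly", dataset, f"value {latest} differs from constant {mu}")
    z = (latest - mu) / sigma
    if abs(z) > z_threshold:
        return Alert("anomaly", dataset, f"value {latest} has z-score {z:.1f}")
    return None


def route(alert):
    """Pick a destination; unknown kinds fall back to a default channel."""
    return ROUTES.get(alert.kind, "data-platform")


# Example run with made-up inputs.
now = datetime(2024, 1, 2, 12, 0)
history = [100, 102, 98, 101, 99]
alerts = [
    check_refresh("orders", "failed", now - timedelta(hours=2), now, timedelta(hours=6)),
    check_refresh("users", "success", now - timedelta(hours=30), now, timedelta(hours=24)),
    check_anomaly("orders.daily_revenue", history, 150),
    check_anomaly("orders.daily_revenue", history, 101),
]
routed = [(route(a), a.dataset) for a in alerts if a is not None]
for destination, dataset in routed:
    print(f"{destination}: {dataset}")
```

Keeping the checks separate from routing means a new detector or a new destination can be added without touching the other side. A fuller design would also need deduplication, severity levels, and suppression of repeat alerts, which this sketch leaves out.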