You are supporting a customer-facing data pipeline and need to determine why the data in the destination looks wrong. The issue could come from the source system, or it could be introduced during ingestion, transformation, or loading.
How would you investigate whether a customer's data issue is caused by bad source data or a platform problem?
- Source extract payloads and timestamps
- Raw landing data before transformation
- Transformation logic and schema mappings
- Load history, retries, and duplicate handling
- Row-count and checksum reconciliation by run
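
The reconciliation check listed last can be sketched roughly as follows. This is a minimal illustration, not a real platform's API: the stage names, the sample rows, and the helpers `fingerprint` and `first_divergence` are all hypothetical, and it assumes the compared fields are expected to pass through every stage unchanged.

```python
import hashlib
from typing import Iterable, Mapping, Optional, Sequence


def fingerprint(rows: Iterable[Mapping[str, object]],
                fields: Sequence[str]) -> tuple[int, str]:
    """Return (row count, order-independent checksum) over the chosen fields."""
    count = 0
    combined = 0
    for row in rows:
        # Serialize only the compared fields, in a fixed order.
        payload = "\x1f".join(repr(row.get(f)) for f in fields)
        digest = hashlib.sha256(payload.encode("utf-8")).digest()
        # Summing digests modulo 2**256 ignores row order but, unlike XOR,
        # still changes when a row is duplicated.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, f"{combined:064x}"


def first_divergence(stages: Sequence[tuple[str, Iterable[Mapping[str, object]]]],
                     fields: Sequence[str]) -> Optional[str]:
    """Return the first stage whose fingerprint differs from the stage before it."""
    previous = None
    for name, rows in stages:
        current = fingerprint(rows, fields)
        if previous is not None and current != previous:
            return name
        previous = current
    return None


# Hypothetical data for one pipeline run.
source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
landing = [{"id": 2, "amount": 20}, {"id": 1, "amount": 10}]  # same rows, reordered
loaded = landing + [{"id": 2, "amount": 20}]  # duplicate from a retried load

stages = [
    ("source_extract", source),
    ("raw_landing", landing),
    ("destination", loaded),
]
print(first_divergence(stages, ["id", "amount"]))  # prints "destination"
```

If every stage matches the source extract and the values are still wrong, the evidence points at the source system. If a later stage is the first to diverge, the problem was introduced by the platform at or just before that stage.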