You are supporting a data pipeline that moves client activity data from ingestion through transformation and into downstream reporting. A client reports missing records and delayed updates, but the failure is not obvious from the application layer. You need to use logs and monitoring signals to trace the issue back to the failing component.
How do you use logs and monitoring tools like Splunk or Datadog to isolate the root cause of a client issue?
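
One possible first step, sketched here under stated assumptions, is to reconcile record IDs stage by stage so that the boundary where records disappear becomes visible. The sketch assumes each pipeline stage emits structured JSON log events that have been exported from a log search (for example from Splunk or Datadog) as one JSON object per line. The field names `stage` and `record_id`, the stage names in `STAGES`, and the helper names `ids_by_stage` and `find_drops` are all hypothetical and would need to match the real pipeline.

```python
import json
from collections import defaultdict

# Hypothetical stage names, listed in pipeline order (an assumption).
STAGES = ["ingestion", "transformation", "reporting"]


def ids_by_stage(log_lines):
    """Group record IDs by the stage that logged them."""
    seen = defaultdict(set)
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured events
        if not isinstance(event, dict):
            continue  # skip valid JSON that is not an event object
        stage = event.get("stage")          # assumed field name
        record_id = event.get("record_id")  # assumed field name
        if stage in STAGES and record_id is not None:
            seen[stage].add(str(record_id))
    return seen


def find_drops(log_lines):
    """Return (upstream, downstream, missing_ids) for each stage
    boundary where IDs seen upstream never appear downstream."""
    seen = ids_by_stage(log_lines)
    drops = []
    for upstream, downstream in zip(STAGES, STAGES[1:]):
        missing = seen[upstream] - seen[downstream]
        if missing:
            drops.append((upstream, downstream, sorted(missing)))
    return drops
```

A boundary that appears in the result points at the component to inspect next. The same grouping could be extended with event timestamps to measure per-stage lag for the delayed updates, provided the logs carry a reliable timestamp field.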