You are designing a streaming pipeline in which some events arrive long after their event time, once the windows they belong to have already been emitted. These late records can invalidate previously computed windows, aggregates, or fact tables. The goal is to keep downstream data correct while limiting unnecessary recomputation.
- Event-time processing and watermark design
- Idempotent writes and deduplication
- Repair strategy for very late records
- Data quality controls for corrected outputs
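As a starting point for discussion, the topics above can be tied together in a minimal in-memory sketch. It is not a production design and does not follow any particular framework's API. Everything named here is an assumption made for illustration: the `Event` fields, the `WindowedSum` class, the 60-second tumbling window, the 120-second allowed lateness, and the simplification that the watermark is just the highest event time seen so far. Records that arrive after their window has closed but within the allowed lateness update the window in place and are logged as corrections; records later than that are set aside for a separate repair job instead of triggering in-stream recomputation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str    # unique id, used for deduplication (hypothetical field)
    key: str         # aggregation key
    event_time: int  # seconds, assigned at the source
    value: int


@dataclass
class WindowedSum:
    window_size: int = 60        # tumbling window length in seconds (assumed)
    allowed_lateness: int = 120  # how long a closed window stays repairable in-stream (assumed)
    watermark: int = 0           # simplification: highest event time seen so far
    sums: dict = field(default_factory=lambda: defaultdict(int))
    seen_ids: set = field(default_factory=set)
    corrections: list = field(default_factory=list)   # (key, window_start) pairs re-emitted after a late update
    repair_queue: list = field(default_factory=list)  # events too late for in-stream repair

    def window_start(self, event_time):
        # Align an event time to the start of its tumbling window.
        return event_time - event_time % self.window_size

    def process(self, event):
        # Idempotence: a replayed or duplicated event must not change any sum.
        if event.event_id in self.seen_ids:
            return "duplicate"
        self.seen_ids.add(event.event_id)

        start = self.window_start(event.event_time)
        window_end = start + self.window_size

        # Very late: the window is past its allowed lateness, so defer the
        # record to an out-of-band repair job rather than recompute here.
        if window_end + self.allowed_lateness <= self.watermark:
            self.repair_queue.append(event)
            return "deferred"

        self.sums[(event.key, start)] += event.value
        status = "on_time"
        if window_end <= self.watermark:
            # Late but repairable: record the correction so downstream
            # consumers and data quality checks can see what changed.
            self.corrections.append((event.key, start))
            status = "corrected"

        self.watermark = max(self.watermark, event.event_time)
        return status
```

Calling `process` on each incoming `Event` returns one of `"on_time"`, `"corrected"`, `"duplicate"`, or `"deferred"`. The `corrections` list gives data quality checks an audit trail of rewritten windows, and `repair_queue` bounds recomputation by batching very late records for a later backfill. A real pipeline would also need to expire `seen_ids` and persist state, which this sketch leaves out.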