You're designing a streaming pipeline and need a plan for records that arrive late, that is, after the results covering their event time have already been computed. Some downstream tables are windowed or aggregated, so late data can change previously computed results. You want a design that keeps outputs correct without constantly reprocessing everything.
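To make the problem concrete, here is a minimal sketch of how a single late record changes a windowed count that was already produced. Everything in it is an assumption made up for illustration: the 60-second tumbling window, the integer timestamps, and the helper names `window_start` and `count_per_window` do not come from any real pipeline or library.

```python
from collections import defaultdict

# Hypothetical tumbling-window size, in seconds (an assumption for this sketch).
WINDOW_SECONDS = 60


def window_start(event_time: int) -> int:
    """Return the start of the tumbling window that contains event_time."""
    return event_time - (event_time % WINDOW_SECONDS)


def count_per_window(event_times):
    """Count records per window, keyed by window start."""
    counts = defaultdict(int)
    for t in event_times:
        counts[window_start(t)] += 1
    return dict(counts)


# Event times (seconds) of records that arrived on time.
on_time = [5, 42, 61, 95, 130]
# A record with event time 30 that shows up after window [0, 60) was emitted.
late = [30]

before = count_per_window(on_time)
after = count_per_window(on_time + late)

print(before)  # {0: 2, 60: 2, 120: 1}
print(after)   # {0: 3, 60: 2, 120: 1}
```

Only the window starting at 0 changes, which is why the goal is to correct the affected results rather than recompute every window.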
How would you design a system to handle late-arriving data in a streaming pipeline?