You’re building the message rendering pipeline for a large consumer chat app (millions of DAUs). Mobile networks can deliver events out of order, with duplicates, and sometimes with missing sequence numbers due to packet loss. The UI must render messages in a stable order without stalling forever.
Each incoming event is a tuple (msg_id, seq, text):
msg_id is a globally unique identifier for a logical message (retries may resend the same msg_id).seq is a server-assigned sequence number intended to define the display order.text is the message payload.You will ingest events one-by-one. After each ingest, return the list of events that can now be emitted (rendered) immediately.
msg_id: if a msg_id was seen before, ignore the event.seq: if multiple different msg_ids arrive with the same seq, keep only the first one ever seen for that seq.expected_seq (start at 1).expected_seq is present in the buffer, emit it and increment expected_seq.smallest_buffered be the smallest seq currently buffered. If smallest_buffered > expected_seq + max_gap, treat expected_seq..smallest_buffered-1 as lost, set expected_seq = smallest_buffered, and continue draining.Implement:
def dedupe_and_emit(events: list[tuple[str,int,str]], max_gap: int) -> list[list[tuple[str,int,str]]]
It should return a list where the i-th element is the list of events emitted immediately after ingesting events[i].
Example 1
max_gap = 2, events = [("a",1,"hi"),("b",3,"are you there?"),("c",2,"hello")][[ ("a",1,"hi") ], [], [ ("c",2,"hello"), ("b",3,"are you there?") ]]Example 2
max_gap = 1, events = [("x",1,"m1"),("y",4,"m4")][[ ("x",1,"m1") ], [ ("y",4,"m4") ]]expected_seq=2 and 4 > 2+1, so we skip 2..3 as lost and emit 4.max_gap can be 0, meaning you may skip forward immediately when the next available buffered seq is greater than expected_seq.