Design Idempotent Message Delivery Scorer

Product Context

PulseChat is a global messaging platform used by consumers and small businesses. The infrastructure team wants an ML-assisted delivery system that classifies each incoming delivery attempt as a true first-send, a safe retry, or a likely duplicate, so that the platform can preserve idempotent delivery across retries, client reconnects, and downstream failures.

Scale

Signal	Value
DAU	120M
Messages sent/day	9B
Peak delivery-attempt QPS	220K
Peak retry QPS during incidents	500K
Active conversation graph	2.5B user pairs / groups
Dedup / decision latency budget (p99)	35ms
Retention window for idempotency keys	7 days
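
As a rough sizing aid, the figures in the table can be turned into an average rate and a key-store size. This is only a back-of-envelope sketch: it assumes one idempotency key per message and a hypothetical 32 bytes per stored key, neither of which is stated above, and it ignores replication and index overhead.

```python
SECONDS_PER_DAY = 86_400

# Values taken from the Scale table.
MESSAGES_PER_DAY = 9_000_000_000
PEAK_ATTEMPT_QPS = 220_000
RETENTION_DAYS = 7

# Assumption (not in the source): one key per message, 32 bytes per key.
BYTES_PER_KEY = 32


def average_qps(events_per_day: int) -> float:
    """Average events per second over a full day."""
    return events_per_day / SECONDS_PER_DAY


def retained_keys(events_per_day: int, retention_days: int) -> int:
    """Number of keys held at steady state for the retention window."""
    return events_per_day * retention_days


avg_qps = average_qps(MESSAGES_PER_DAY)
peak_to_avg = PEAK_ATTEMPT_QPS / avg_qps
keys = retained_keys(MESSAGES_PER_DAY, RETENTION_DAYS)
raw_bytes = keys * BYTES_PER_KEY

print(f"average message rate: {avg_qps:,.0f}/s")
print(f"peak attempts vs. average messages: {peak_to_avg:.2f}x")
print(f"keys retained: {keys:,}")
print(f"raw key bytes (before overhead): {raw_bytes / 1e12:.2f} TB")
```

Delivery attempts can outnumber messages (retries, group fan-out), so the key count here is best read as a lower bound.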

Task

Design an end-to-end ML system that helps enforce idempotent message delivery at scale. Your design should address:

How you would frame the problem, define the prediction target, and separate deterministic idempotency guarantees from ML-based decisioning
The full architecture: online serving path, offline training path, feature store, feedback logging, and how retries flow through the system
A multi-stage decision pipeline (for example: fast retrieval of prior attempts/events → ranking/scoring duplicate likelihood → policy or re-ranking layer for final action)
Model choices for each stage, including what features are available at request time and how you avoid training-serving skew
Offline and online evaluation, including business metrics, safety guardrails, and how you would run a staged rollout
Failure modes such as feature drift, stale state, partial outages, replay storms, and incorrect suppression of legitimate messages
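
One possible way to picture the split between deterministic guarantees and ML-based decisioning is sketched below. It is a minimal illustration, not a reference design: `Attempt`, `Action`, `decide`, the in-memory `seen_keys` set, and the 0.99 threshold are all hypothetical names and values introduced here, and a real system would back the key check with durable storage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, MutableSet, Optional


class Action(Enum):
    DELIVER = "deliver"      # treat as a true first-send
    ACK_RETRY = "ack_retry"  # safe retry: acknowledge, do not deliver again
    SUPPRESS = "suppress"    # likely duplicate flagged by the model


@dataclass(frozen=True)
class Attempt:
    idempotency_key: Optional[str]  # may be missing on some client paths
    sender_id: str
    conversation_id: str
    payload_hash: str


def decide(
    attempt: Attempt,
    seen_keys: MutableSet[str],
    score_duplicate: Callable[[Attempt], float],
    suppress_threshold: float = 0.99,  # hypothetical; kept high on purpose
) -> Action:
    key = attempt.idempotency_key
    # Stage 1: deterministic check. A known key is a retry, no model involved.
    if key is not None and key in seen_keys:
        return Action.ACK_RETRY
    # Stage 2: the model scores only attempts the key check cannot resolve.
    p_duplicate = score_duplicate(attempt)
    # Stage 3: policy layer. Suppress only at very high confidence.
    if p_duplicate >= suppress_threshold:
        return Action.SUPPRESS
    if key is not None:
        seen_keys.add(key)
    return Action.DELIVER
```

In this sketch the model can only ever suppress an attempt that the deterministic layer could not classify, so an exact key match never depends on a model score.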

Constraints

The system must never rely on ML alone for correctness; deterministic keys and storage semantics are required for hard guarantees where possible
User-visible duplicate deliveries are very costly, but false suppression of legitimate messages is worse for trust and compliance
Some delivery metadata arrives late or out of order from clients and regional brokers
Data residency rules require EU user event logs to stay in-region
Cost target: average online decisioning cost below $0.00015 per delivery attempt
During regional outages, retry traffic can spike 2-3x and feature freshness may degrade
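
Two of these constraints lend themselves to a small worked sketch: the asymmetry between duplicates and false suppression, and degraded feature freshness during outages. The example below uses the standard expected-cost rule (suppress only if p * cost_duplicate > (1 - p) * cost_false_suppress) and fails open when features are stale. The 20:1 cost ratio and the 5-second staleness limit are hypothetical placeholders, not values from the problem statement.

```python
def suppress_threshold(cost_false_suppress: float, cost_duplicate: float) -> float:
    """Probability above which suppression has lower expected cost.

    Derived from p * cost_duplicate > (1 - p) * cost_false_suppress.
    """
    return cost_false_suppress / (cost_false_suppress + cost_duplicate)


def may_suppress(
    p_duplicate: float,
    feature_age_s: float,
    threshold: float,
    max_feature_age_s: float = 5.0,  # hypothetical staleness limit
) -> bool:
    """Allow ML-driven suppression only with fresh features."""
    if feature_age_s > max_feature_age_s:
        # Fail open: with stale features, deliver rather than risk
        # suppressing a legitimate message.
        return False
    return p_duplicate > threshold


# Hypothetical ratio: a false suppression is 20x as costly as a duplicate.
threshold = suppress_threshold(cost_false_suppress=20.0, cost_duplicate=1.0)
print(f"suppress only above p = {threshold:.3f}")
```

The more costly a false suppression is relative to a duplicate, the closer the threshold moves to 1, which matches the stated priority of trust and compliance over duplicate avoidance.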
