## Product Context
AdStream is a large ad network serving sponsored search and display ads across publisher apps and websites. You need to design an ML system that detects and filters fraudulent ad clicks in real time before they are billed to advertisers or used for downstream optimization.
## Scale
| Signal | Value |
|---|---|
| DAU | 120M users |
| Peak ad impression QPS | 900K |
| Peak click-event QPS | 85K |
| Advertisers | 1.8M active |
| Publishers | 220K active |
| Devices / browsers seen per day | 300M+ |
| End-to-end decision latency budget | 50ms p99 |
| Historical labeled events | ~30B clicks over 180 days |
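The per-device cardinality in the table dominates the feature store's memory footprint. A rough sizing sketch (the 64 bytes per key is an illustrative assumption, not a given):

```python
# Back-of-envelope sizing for hot per-device streaming state.
# BYTES_PER_KEY is an assumption: a handful of windowed counters
# (click velocity, fingerprint frequency) packed per device key.
DEVICES_PER_DAY = 300_000_000   # devices/browsers seen per day, from the table
BYTES_PER_KEY = 64              # assumed counter payload per key

state_gb = DEVICES_PER_DAY * BYTES_PER_KEY / 1e9
print(f"~{state_gb:.0f} GB of hot per-device state")  # ~19 GB
```

At this size the hot state fits in a small sharded in-memory store, which is what makes the "minimal per-event state lookups" constraint below achievable.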
## Task
- Clarify the product goal and define what counts as fraudulent vs suspicious vs unknown traffic.
- Design the end-to-end architecture for real-time scoring, filtering, and post-hoc investigation.
- Propose a multi-stage decision system (fast rules / retrieval / ranking / re-review) and justify model choices at each stage.
- Explain the data pipeline, labels, feature store design, and how you avoid training-serving skew.
- Define offline and online evaluation, including delayed labels, false-positive cost, and advertiser trust guardrails.
- Identify major failure modes, monitoring, and rollback strategies.
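One way to make the multi-stage decision system concrete is a cheap-rules-first cascade with asymmetric thresholds. The stage names, thresholds, and `ClickEvent` fields below are illustrative assumptions, not a prescribed design:

```python
# Sketch of a multi-stage decision cascade: deterministic rules first,
# then a model score, with a "hold for review" band because false
# positives (blocking real clicks) are expensive. All thresholds are
# hypothetical placeholders to be tuned against advertiser-trust guardrails.
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    BLOCK = "block"   # filtered before billing
    ALLOW = "allow"   # billed normally
    HOLD = "hold"     # billed provisionally, queued for post-hoc review

@dataclass
class ClickEvent:
    ip_velocity_1m: int       # clicks from this IP in the last minute
    publisher_anomaly: float  # streaming anomaly score for the publisher
    model_score: float        # model-estimated probability of fraud

def decide(click: ClickEvent) -> Verdict:
    # Stage 1: cheap deterministic rules catch obvious abuse in microseconds.
    if click.ip_velocity_1m > 100:
        return Verdict.BLOCK
    # Stage 2: model score with asymmetric thresholds -- only very
    # confident predictions block outright.
    if click.model_score > 0.95:
        return Verdict.BLOCK
    # Stage 3: ambiguous traffic is held, not blocked, and routed to
    # the investigation queue where delayed labels can resolve it.
    if click.model_score > 0.60 or click.publisher_anomaly > 3.0:
        return Verdict.HOLD
    return Verdict.ALLOW
```

The HOLD band is what reconciles delayed, noisy labels with the tight latency budget: billing is provisional, and the verdict can be revised once chargebacks or manual reviews arrive.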
## Constraints
- Fraud labels are delayed and noisy: chargebacks, manual reviews, and advertiser complaints may arrive days later.
- False positives are expensive because blocking legitimate clicks hurts publisher revenue and advertiser delivery.
- Some features must be computed in streaming fashion (IP velocity, device fingerprint frequency, publisher-level anomaly rates).
- The system must support regional compliance requirements; raw IPs and user identifiers may have retention limits.
- Cost matters: the highest-volume path should run primarily on CPU, with minimal per-event state lookups.
- The platform must continue serving ads even if the ML scorer is degraded; define safe fallbacks.