Product Context
Meta wants to detect and act on hateful text comments across Facebook and Instagram. The system should score comments in real time for enforcement and also support downstream human review, user reporting, and policy analytics.
Scale
| Signal | Value |
|---|---|
| DAU impacted | 1.8B users across Facebook + Instagram |
| New text comments/day | 9B |
| Peak comment creation QPS | 220K |
| Peak comment-view QPS needing precomputed safety scores | 1.2M |
| Supported languages | 120+ |
| p99 latency budget for write-time decision | 120ms end-to-end |
| Human review queue capacity | ~8M comments/day |
Task
Design an end-to-end ML system for hate-speech detection on text comments. Your design should address:
- How you define the prediction target, policy tiers, and product actions (allow, downrank, send to review, remove).
- The full architecture from data collection and labeling to online serving, including a multi-stage pipeline rather than a single model (a minimal cascade sketch follows this list).
- Model choices for fast filtering, ranking, and high-precision re-scoring under strict latency and cost constraints.
- Offline and online evaluation, including how you handle delayed labels, policy changes, and multilingual performance.
- Monitoring, failure modes, and rollback plans, especially for feature drift, training-serving skew, adversarial evasion, and fairness across languages/dialects (a simple drift-check sketch also follows this list).
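To make the multi-stage requirement concrete, here is a minimal Python sketch of a write-time cascade that maps calibrated scores to the four product actions. The stage models, threshold values, and the stand-in functions `fast_filter_score` and `transformer_score` are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DOWNRANK = "downrank"
    REVIEW = "review"
    REMOVE = "remove"


@dataclass
class PolicyThresholds:
    # Illustrative per-action thresholds on a calibrated hate-speech probability.
    # In a real system these would live in policy config, not code, so that
    # threshold changes never require a retrain.
    downrank: float = 0.50
    review: float = 0.80
    remove: float = 0.97


def fast_filter_score(text: str) -> float:
    # Stand-in for a cheap stage-1 model (keyword list / linear classifier)
    # that runs on every comment at creation time.
    flagged_terms = {"<placeholder_term>"}
    return 0.9 if any(term in text.lower() for term in flagged_terms) else 0.01


def transformer_score(text: str) -> float:
    # Stand-in for a heavier multilingual transformer, invoked only on the
    # small fraction of comments that survive stage 1.
    return 0.6


def decide(text: str, thresholds: PolicyThresholds) -> Action:
    """Hypothetical write-time cascade: cheap filter first, expensive re-scorer only when needed."""
    if fast_filter_score(text) < 0.05:
        return Action.ALLOW  # the vast majority of comments short-circuit here
    score = transformer_score(text)
    if score >= thresholds.remove:
        return Action.REMOVE
    if score >= thresholds.review:
        return Action.REVIEW
    if score >= thresholds.downrank:
        return Action.DOWNRANK
    return Action.ALLOW


action = decide("example comment text", PolicyThresholds())
```

Keeping the thresholds in a config object rather than inside the model is one way to satisfy the constraint below that policy and threshold changes should not force a full retrain.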
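For the monitoring bullet, one common and cheap drift alarm is a population stability index (PSI) computed between a reference score distribution and the live one. The bin count, alert band, and synthetic data below are assumptions for illustration only.

```python
import numpy as np


def population_stability_index(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """PSI between a reference score distribution (e.g. scores logged at launch time)
    and a live score distribution; a simple drift alarm for model scores or features."""
    edges = np.linspace(0.0, 1.0, bins + 1)  # scores assumed calibrated to [0, 1]
    eps = 1e-6                               # avoid log(0) on empty bins
    expected_pct = np.histogram(expected, bins=edges)[0] / max(len(expected), 1) + eps
    observed_pct = np.histogram(observed, bins=edges)[0] / max(len(observed), 1) + eps
    return float(np.sum((observed_pct - expected_pct) * np.log(observed_pct / expected_pct)))


# Hypothetical usage: compare today's serving scores against a logged baseline.
reference_scores = np.random.beta(2, 20, size=100_000)  # stand-in for baseline scores
todays_scores = np.random.beta(2, 15, size=100_000)     # stand-in for live scores
psi = population_stability_index(reference_scores, todays_scores)
if psi > 0.2:  # 0.1-0.2 is a commonly used "investigate" band
    print(f"Score drift alarm: PSI={psi:.3f}; consider recalibration or rollback.")
```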
Constraints
- False positives are costly: incorrect removals harm user trust and creator experience.
- False negatives are also costly: missed hate speech creates safety and regulatory risk.
- Some actions must happen synchronously at comment creation; others can be asynchronous within minutes.
- Labels are noisy and partially delayed: user reports, reviewer decisions, and appeals may arrive hours or days later (see the label-maturation sketch after this list).
- The system must support policy updates without requiring a full retrain for every threshold change.
- Raw text retention is limited in some regions, so feature logging must work within retention and compliance constraints.
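Because labels arrive hours or days late, offline evaluation and retraining typically wait for labels to "mature" before counting an example. The sketch below shows one assumed convention: exclude examples younger than a maturation window and let appeal outcomes override reviewer decisions. The field names and the 72-hour window are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class LoggedExample:
    comment_id: str
    created_at: datetime
    model_score: float
    reviewer_label: Optional[bool] = None     # human decision, may arrive days later
    appeal_overturned: Optional[bool] = None  # appeal outcome, arrives even later


def matured_label(ex: LoggedExample, now: datetime,
                  maturation: timedelta = timedelta(hours=72)) -> Optional[bool]:
    """Resolve an eval/training label only after it has had time to stabilize,
    so that slow reviewer decisions and appeals do not bias metrics toward
    whatever signal happens to arrive first."""
    if now - ex.created_at < maturation:
        return None                 # too fresh: reports or appeals may still arrive
    if ex.appeal_overturned:
        return False                # final say: the removal was overturned on appeal
    return ex.reviewer_label        # may still be None if never reviewed


# Hypothetical usage over a prediction log.
now = datetime.now(timezone.utc)
log = [
    LoggedExample("c1", now - timedelta(days=5), 0.91, reviewer_label=True),
    LoggedExample("c2", now - timedelta(hours=6), 0.88, reviewer_label=None),
]
evaluable = []
for ex in log:
    label = matured_label(ex, now)
    if label is not None:
        evaluable.append((ex.model_score, label))
```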