Business Context
Meta needs a real-time classifier to detect hate speech across Facebook and Instagram posts at global scale. The system must score billions of daily text posts quickly enough to support automated enforcement, ranking downweighting, and human review queues.
Dataset
You are given a historical training set built from policy-reviewed content, user reports, and sampled public posts.
| Feature Group | Count | Examples |
|---|---|---|
| Raw text | 3 | post_text, caption_text, OCR_extracted_text |
| Language metadata | 4 | locale, detected_language, script, translation_available |
| Author/account signals | 8 | account_age_days, prior_violations_90d, follower_count_bucket |
| Content context | 6 | surface, reshare_depth, media_present, group_type |
| Safety heuristics | 10 | slur_lexicon_hits, profanity_score, repeated_character_ratio |
- Size: 420M labeled posts collected over 18 months, 28 structured features plus raw text
- Target: Binary — hate speech policy violation (1) vs non-violating content (0)
- Class balance: Highly imbalanced — 0.35% positive, 99.65% negative
- Missing data: 12% missing OCR text, 7% missing account-history features for new accounts, sparse coverage for low-resource languages
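With only 0.35% positives, a common first-stage tactic is to train on all positives plus a random fraction of negatives, up-weighting the kept negatives so the model's probability estimates stay calibrated to the original prevalence. A minimal sketch (the function name and the 5% keep rate are illustrative, not prescribed by the problem):

```python
import numpy as np

rng = np.random.default_rng(0)

def downsample_negatives(y, neg_keep_rate=0.05, rng=rng):
    """Keep all positives and a random fraction of negatives.

    Returns selected indices and per-example weights; kept negatives
    are up-weighted by 1 / neg_keep_rate so the weighted class
    balance matches the original data.
    """
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    kept_neg = neg_idx[rng.random(neg_idx.size) < neg_keep_rate]
    idx = np.concatenate([pos_idx, kept_neg])
    weights = np.where(y[idx] == 1, 1.0, 1.0 / neg_keep_rate)
    return idx, weights
```

At a 5% keep rate the training set shrinks roughly 20x while every positive example is retained, which matters when positives are this scarce.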
Success Criteria
A good solution should achieve high recall on violating content while keeping precision high enough to avoid overwhelming review queues and to limit false positives on benign uses of reclaimed language or quoted speech. Target online inference latency for the first-stage model is p95 < 40 ms per post.
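One way to make this recall/precision tradeoff concrete is to tune the operating threshold on held-out data: take the highest score threshold that still meets a target recall and report the precision (and hence reviewer load) it implies. A pure-NumPy sketch, assuming a 0.90 recall target chosen for illustration:

```python
import numpy as np

def threshold_for_recall(y_true, scores, target_recall=0.90):
    """Highest score threshold whose recall meets target_recall.

    Sorting by descending score, cumulative recall is nondecreasing,
    so the first qualifying index gives the least aggressive
    threshold that satisfies the recall target.
    """
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(y_true)[order]
    s = np.asarray(scores)[order]
    tp = np.cumsum(y)
    recall = tp / y.sum()
    precision = tp / (np.arange(y.size) + 1)
    k = np.searchsorted(recall, target_recall)  # first index with recall >= target
    return s[k], precision[k], recall[k]
```

The returned precision directly bounds the fraction of flagged posts that are true violations, which is the quantity review-queue capacity planning cares about.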
Constraints
- Must support multilingual content and adapt to distribution shift from new slurs and evasion patterns
- Must operate at billions of posts/day with strict serving cost limits
- Must provide enough interpretability for policy audits and error analysis
- Must degrade safely when language ID or OCR is missing
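For the safe-degradation requirement, a standard pattern is to fill missing fields with neutral defaults while emitting an explicit missing indicator per field, so the model can learn distinct behavior for degraded inputs instead of mistaking a default for an observed value. A sketch with illustrative defaults (in practice these would come from the training distribution, e.g. per-language medians):

```python
# Illustrative default values; field names mirror the dataset description.
DEFAULTS = {
    "ocr_text": "",              # 12% missing: treat as empty string
    "account_age_days": -1.0,    # sentinel for new/unknown accounts
    "detected_language": "und",  # BCP-47 code for "undetermined"
}

def fill_with_indicators(row):
    """Replace missing fields with defaults and add a per-field
    was-missing indicator the downstream model can condition on."""
    out = {}
    for key, default in DEFAULTS.items():
        val = row.get(key)
        missing = val is None
        out[key] = default if missing else val
        out[f"{key}_missing"] = 1.0 if missing else 0.0
    return out
```

When language ID is missing, the model then sees (`"und"`, missing=1) rather than a silently guessed locale, which keeps enforcement behavior auditable for those posts.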
Deliverables
- Design a supervised ML pipeline for real-time hate speech detection on Meta surfaces.
- Choose model(s), justify them, and explain how you would handle extreme class imbalance.
- Define preprocessing and feature engineering for text plus structured metadata.
- Propose a train/validation/test strategy that avoids temporal leakage.
- Specify evaluation metrics, decision thresholds, and moderation-review tradeoffs.
- Provide production-quality Python code for training and offline evaluation of a first-stage classifier.
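A starting point for the leakage-free split deliverable: because new slurs and evasion patterns emerge over time, the split must be chronological, with validation and test windows strictly after the training window, and all threshold tuning confined to validation. A minimal sketch (cutoff values are illustrative):

```python
import numpy as np

def temporal_split(timestamps, train_end, val_end):
    """Boolean index masks for a leakage-safe chronological split:
    train on posts before train_end, validate on [train_end, val_end),
    test on everything at or after val_end. Model selection and
    threshold tuning use only the validation window."""
    t = np.asarray(timestamps)
    train = t < train_end
    val = (t >= train_end) & (t < val_end)
    test = t >= val_end
    return train, val, test
```

A random split would leak future slur variants into training and overstate recall on emerging patterns, which is exactly the shift the constraints call out.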