In this first-round problem, a Google Research Scientist candidate is asked to reason through a realistic ML task rather than recite isolated theory. The task: build a lightweight classifier for YouTube comment spam detection, compare a classical ML baseline with a simple neural baseline, and explain the statistical and modeling tradeoffs clearly.
Use a historical moderation dataset of YouTube comments collected across popular channels. Each row is one comment, together with the metadata available at prediction time.
| Feature Group | Count | Examples |
|---|---|---|
| Text | 1 raw field | comment_text |
| Numeric metadata | 6 | comment_length, url_count, emoji_count, uppercase_ratio, account_age_days, prior_flags |
| Categorical metadata | 3 | language, device_type, channel_topic |
| Temporal | 2 | hour_of_day, day_of_week |
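
One way to keep the feature groups above explicit in code is a small typed schema plus a grouping map. The sketch below is illustrative only: the column names come from the table, but the Python types (for example `int` for `prior_flags`, `float` for `uppercase_ratio`), the `CommentRow` and `FEATURE_GROUPS` names, and the `check_schema` helper are assumptions, not part of the dataset specification.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CommentRow:
    """One comment with the metadata available at prediction time.

    Field types are assumed for illustration; the source only names the columns.
    """
    # Text (1 raw field)
    comment_text: str
    # Numeric metadata (6)
    comment_length: int
    url_count: int
    emoji_count: int
    uppercase_ratio: float
    account_age_days: int
    prior_flags: int
    # Categorical metadata (3)
    language: str
    device_type: str
    channel_topic: str
    # Temporal (2)
    hour_of_day: int
    day_of_week: int


# Hypothetical grouping map, mirroring the table row by row.
FEATURE_GROUPS = {
    "text": ["comment_text"],
    "numeric": [
        "comment_length", "url_count", "emoji_count",
        "uppercase_ratio", "account_age_days", "prior_flags",
    ],
    "categorical": ["language", "device_type", "channel_topic"],
    "temporal": ["hour_of_day", "day_of_week"],
}


def check_schema():
    """Verify every column is in exactly one group; return per-group counts."""
    declared = {f.name for f in fields(CommentRow)}
    grouped = [name for cols in FEATURE_GROUPS.values() for name in cols]
    if len(grouped) != len(set(grouped)):
        raise ValueError("a column appears in more than one feature group")
    if set(grouped) != declared:
        raise ValueError("feature groups and schema fields disagree")
    return {group: len(cols) for group, cols in FEATURE_GROUPS.items()}
```

Calling `check_schema()` returns the per-group counts, which should match the Count column of the table. Keeping the groups separate makes it straightforward to route each one to its own preprocessing step in either baseline.
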
A good solution should outperform a majority-class baseline and deliver strong ranking quality for moderation triage. Aim for AUC-ROC > 0.92, F1 > 0.78, and recall > 0.85 at precision >= 0.75 on a held-out test set.
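
The targets above can be checked with a few small helpers. The following is a minimal, standard-library-only sketch of the three metrics; the function names and the toy labels and scores are made up for illustration, and it assumes spam is labeled `1` and is the minority class (the source does not state the class balance).

```python
from collections import Counter


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (spam = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of (spam, non-spam) pairs ranked correctly.

    Ties count as half. Pairwise O(n_pos * n_neg), fine for a small sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_precision(y_true, scores, min_precision=0.75):
    """Best recall over all score thresholds with precision >= min_precision."""
    total_pos = sum(y_true)
    if total_pos == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = 0.0
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Admit every example tied at this score before evaluating.
        while i < len(order) and scores[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / (tp + fp) >= min_precision:
            best = max(best, tp / total_pos)
    return best


# Toy data (made up): 4 spam comments out of 10, scores from some model.
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

# Majority-class baseline: always predict the most common label.
majority_label = Counter(y_true).most_common(1)[0][0]
baseline_pred = [majority_label] * len(y_true)
baseline_accuracy = sum(t == p for t, p in zip(y_true, baseline_pred)) / len(y_true)
baseline_metrics = precision_recall_f1(y_true, baseline_pred)
```

On this toy data the majority-class baseline reaches 0.6 accuracy but zero precision, recall, and F1 on the spam class, which is why accuracy alone is a weak yardstick here. The toy scores give an AUC-ROC of 0.875 and a recall of 0.75 at precision >= 0.75, so they would miss the stated targets.
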