Business Context
Meta is training a deeper neural network for Facebook Feed ranking to capture higher-order interactions across user, content, and context features. The current prototype underperforms a shallower baseline because gradients in the early layers vanish during training, which slows convergence and reduces ranking quality.
Dataset
You are given an offline supervised learning dataset built from Feed impression logs.
| Feature Group | Count | Examples |
|---|---|---|
| Dense numerical | 28 | prior CTR, dwell time stats, friend interaction counts, session depth |
| Categorical IDs | 14 | user_id bucket, author_id bucket, content_type, device_type, country |
| Temporal/context | 9 | hour_of_day, day_of_week, recency_since_last_open, network_type |
| Aggregated embedding inputs | 6 | historical author affinity, topic affinity, content embedding clusters |
- Size: 12.4M impressions, 57 input features
- Target: Binary label indicating whether the impression received a meaningful engagement within 24 hours
- Class balance: 11.6% positive, 88.4% negative
- Missing data: 6% missing in some engagement aggregates for new users; sparse categorical coverage for tail creators
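
The class balance and missing-data notes above suggest two preprocessing steps, sketched here with the standard library only. The helper names `positive_class_weight` and `impute_with_indicator` are hypothetical, and median imputation with a missing flag is one assumed choice among several, not a method prescribed by the problem statement.

```python
import statistics

POSITIVE_RATE = 0.116  # class balance stated in the dataset description


def positive_class_weight(positive_rate=POSITIVE_RATE):
    """Weight for positive examples so both classes contribute equally to the loss."""
    return (1.0 - positive_rate) / positive_rate


def impute_with_indicator(values):
    """Replace None with the observed median and return a parallel is-missing flag.

    The flag lets the model learn that "missing" (for example, a new user with
    no engagement history) is itself informative.
    """
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    imputed = [fill if v is None else v for v in values]
    missing_flag = [1.0 if v is None else 0.0 for v in values]
    return imputed, missing_flag


# Usage: a weight of roughly 7.6 on positives in a weighted binary cross-entropy.
print(round(positive_class_weight(), 2))
print(impute_with_indicator([1.0, None, 3.0]))
```

If a weighted loss is used, the predicted probabilities are no longer calibrated to the true 11.6% base rate, so a recalibration step may be needed before the scores feed downstream ranking logic.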
Success Criteria
A good solution should:
- Improve validation AUC-ROC by at least 0.015 over a plain deep MLP baseline without stabilization
- Keep training numerically stable for 20+ epochs without early gradient collapse
- Achieve a p95 online inference latency below 15 ms per request after export
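
The AUC and latency criteria above can be checked mechanically. The following is a minimal standard-library sketch; the function names (`auc_roc`, `p95`, `meets_criteria`) are illustrative assumptions, and in practice a library routine such as scikit-learn's `roc_auc_score` would likely replace the hand-written AUC.

```python
import statistics


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outranks a random negative.

    Uses the rank-sum (Mann-Whitney U) identity; tied scores share an average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def p95(latencies_ms):
    """95th percentile: quantiles(n=100) returns 99 cut points, index 94 is p95."""
    return statistics.quantiles(latencies_ms, n=100)[94]


def meets_criteria(auc_baseline, auc_stabilized, latencies_ms):
    """True when the AUC lift is at least 0.015 and p95 latency is under 15 ms."""
    return (auc_stabilized - auc_baseline) >= 0.015 and p95(latencies_ms) < 15.0
```

The third criterion (stable training for 20+ epochs) is not covered here; it needs gradient statistics logged during training rather than a post-hoc check.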
Constraints
- The model must remain deployable in a low-latency ranking stack
- You should explain which interventions specifically address vanishing gradients
- The solution should be reproducible and monitorable in production
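
To ground the explanation asked for above, the standard chain-rule argument is sketched here. The notation is an assumption made for illustration: $h_l$ is the input to layer $l$, $W_l$ its weight matrix, $\phi$ the activation, and $D_l = \mathrm{diag}\big(\phi'(W_l h_l)\big)$.

$$
\begin{aligned}
\text{Plain MLP:}\quad h_{l+1} &= \phi(W_l h_l), &
\frac{\partial \mathcal{L}}{\partial h_1} &= \big(W_1^\top D_1\big)\big(W_2^\top D_2\big)\cdots\big(W_{L-1}^\top D_{L-1}\big)\,\frac{\partial \mathcal{L}}{\partial h_L}, \\
&& \left\lVert \frac{\partial \mathcal{L}}{\partial h_1} \right\rVert
&\le \prod_{l=1}^{L-1} \lVert W_l \rVert_2\, \lVert D_l \rVert_2 \;\left\lVert \frac{\partial \mathcal{L}}{\partial h_L} \right\rVert, \\
\text{Residual:}\quad h_{l+1} &= h_l + \phi(W_l h_l), &
\frac{\partial \mathcal{L}}{\partial h_l} &= \big(I + W_l^\top D_l\big)\,\frac{\partial \mathcal{L}}{\partial h_{l+1}}.
\end{aligned}
$$

For a sigmoid, $\lVert D_l \rVert_2 \le 1/4$, so the bound shrinks geometrically with depth unless the weight norms compensate. Each mitigation targets one factor: ReLU-family activations make the entries of $D_l$ either 0 or 1, variance-preserving initialization and normalization keep $\lVert W_l \rVert_2 \lVert D_l \rVert_2$ near 1, and the identity term in the residual Jacobian gives the gradient a path to early layers that does not pass through any $W_l^\top D_l$ factor.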
Deliverables
- Build a deep neural network baseline and a stabilized version.
- Show how you detect vanishing gradients during training.
- Apply and justify mitigation techniques such as residual connections, normalization, activation choice, and initialization.
- Evaluate both models on held-out data using ranking-relevant metrics.
- Recommend a production-ready training and refresh strategy.
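
The detection and mitigation deliverables can be illustrated with a toy experiment before building the real model. This is a minimal sketch in plain Python so that it runs without a deep learning framework; it is not the Feed model. The depth of 20, width of 16, random input, toy loss, and the names `gradient_norms` and `early_to_late_ratio` are all assumptions made for illustration. The stabilized variant combines skip connections, ReLU, and He initialization; normalization layers are left out for brevity.

```python
import math
import random


def gradient_norms(depth=20, width=16, stabilized=False, seed=0):
    """Per-layer weight-gradient norms from one forward/backward pass.

    Plain net:      h <- sigmoid(W h),    Xavier-style init (variance 1/width).
    Stabilized net: h <- h + relu(W h),   He init (variance 2/width).
    """
    rng = random.Random(seed)
    std = math.sqrt((2.0 if stabilized else 1.0) / width)
    weights = [[[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]
               for _ in range(depth)]
    h = [rng.gauss(0.0, 1.0) for _ in range(width)]

    layer_inputs, act_grads = [], []
    for W in weights:
        z = [sum(w * x for w, x in zip(row, h)) for row in W]
        if stabilized:
            a = [max(0.0, v) for v in z]
            d_act = [1.0 if v > 0 else 0.0 for v in z]
        else:
            a = [1.0 / (1.0 + math.exp(-v)) for v in z]
            d_act = [s * (1.0 - s) for s in a]
        layer_inputs.append(h)
        act_grads.append(d_act)
        h = [x + y for x, y in zip(h, a)] if stabilized else a

    # Toy loss: sum of the final activations, so dLoss/dh_final is all ones.
    delta = [1.0] * width
    norms = [0.0] * depth
    for l in reversed(range(depth)):
        g = [d * m for d, m in zip(delta, act_grads[l])]  # dLoss/dz for layer l
        # dLoss/dW is outer(g, input); its Frobenius norm is ||g|| * ||input||.
        norms[l] = (math.sqrt(sum(v * v for v in g))
                    * math.sqrt(sum(v * v for v in layer_inputs[l])))
        back = [sum(weights[l][i][j] * g[i] for i in range(width))
                for j in range(width)]
        # The skip connection adds the incoming gradient back unchanged.
        delta = [b + d for b, d in zip(back, delta)] if stabilized else back
    return norms


def early_to_late_ratio(norms):
    """First-layer gradient norm divided by last-layer gradient norm."""
    return norms[0] / max(norms[-1], 1e-300)


plain = gradient_norms(stabilized=False)
stable = gradient_norms(stabilized=True)
print("plain early/late ratio:     ", early_to_late_ratio(plain))
print("stabilized early/late ratio:", early_to_late_ratio(stable))
```

In a real training loop the same ratio would be computed from the framework's per-parameter gradients and logged every few hundred steps. A ratio that stays many orders of magnitude below 1 for the early layers is the signal to look for, and the same metric can serve as a production monitor during model refreshes.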