ArenaForge runs a competitive multiplayer game with 12M monthly players. The gameplay analytics team built a model on player telemetry to predict whether a player will churn in the next 7 days, but offline results look much better than live performance, suggesting overfitting.
You are given a supervised learning dataset built from player-session telemetry aggregated to the player-day level.
| Feature Group | Count | Examples |
|---|---|---|
| Session activity | 12 | sessions_per_day, avg_session_length, matches_played, time_in_lobby |
| Performance stats | 10 | win_rate_7d, kills_per_match, deaths_per_match, damage_per_minute |
| Progression | 8 | level, xp_gained_7d, battle_pass_tier, days_since_last_unlock |
| Economy & social | 6 | currency_spent_7d, purchases_30d, party_play_ratio, friends_online_avg |
| Device & region | 5 | platform, region, device_tier, network_quality_bucket |
| Temporal | 4 | day_of_week, season_week, days_since_install, days_since_last_login |
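Because the rows are player-day aggregates with temporal features, an honest evaluation split must respect time: validation and test days should strictly follow training days, or the offline metrics will not reflect live performance. A minimal sketch of such a split is below; the column names (`date`) and cutoff dates are hypothetical, not taken from the ArenaForge dataset.

```python
import pandas as pd

def temporal_split(df, date_col="date",
                   train_end="2024-05-31", val_end="2024-06-14"):
    """Split player-day rows by calendar date so validation and test
    days strictly follow training days, mimicking live deployment.
    Cutoff dates here are illustrative placeholders."""
    dates = pd.to_datetime(df[date_col])
    train = df[dates <= train_end]
    val = df[(dates > train_end) & (dates <= val_end)]
    test = df[dates > val_end]
    return train, val, test
```

A random row-level split would scatter a single player's days across train and test, letting the model memorize individual players rather than learn churn patterns.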
A strong solution should clearly diagnose whether the model is overfitting, quantify the train/validation/test gap, identify likely causes such as leakage or excessive complexity, and recommend fixes that improve generalization. A good target is reducing the train-test AUC gap to < 0.03 while maintaining a test ROC-AUC ≥ 0.78.