ArenaForge runs a competitive multiplayer game with 12M monthly players. The gameplay analytics team built a model on player telemetry to predict whether a player will churn in the next 7 days, but offline results look much better than live performance, suggesting overfitting.
You are given a supervised learning dataset built from player-session telemetry aggregated to the player-day level.
| Feature Group | Count | Examples |
|---|---|---|
| Session activity | 12 | sessions_per_day, avg_session_length, matches_played, time_in_lobby |
| Performance stats | 10 | win_rate_7d, kills_per_match, deaths_per_match, damage_per_minute |
| Progression | 8 | level, xp_gained_7d, battle_pass_tier, days_since_last_unlock |
| Economy & social | 6 | currency_spent_7d, purchases_30d, party_play_ratio, friends_online_avg |
| Device & region | 5 | platform, region, device_tier, network_quality_bucket |
| Temporal | 4 | day_of_week, season_week, days_since_install, days_since_last_login |
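Because the rows are player-day aggregates with temporal features, an honest evaluation split must respect time: validation and test days should strictly follow training days, or the offline metrics will not reflect live performance. A minimal sketch of such a split is below; the column names (`date`) and cutoff dates are hypothetical, not taken from the ArenaForge dataset.

```python
import pandas as pd

def temporal_split(df, date_col="date",
                   train_end="2024-05-31", val_end="2024-06-14"):
    """Split player-day rows by calendar date so validation and test
    days strictly follow training days, mimicking live deployment.
    Cutoff dates here are illustrative placeholders."""
    dates = pd.to_datetime(df[date_col])
    train = df[dates <= train_end]
    val = df[(dates > train_end) & (dates <= val_end)]
    test = df[dates > val_end]
    return train, val, test
```

A random row-level split would scatter a single player's days across train and test, letting the model memorize individual players rather than learn churn patterns.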
A strong solution should clearly diagnose whether the model is overfitting, quantify the train/validation/test gap, identify likely causes such as leakage or excessive complexity, and recommend fixes that improve generalization. A good target is reducing the train-test AUC gap to < 0.03 while maintaining a test ROC-AUC ≥ 0.78.