Diagnose Overfitting in Churn Model

Scenario

You own a gradient-boosted tree model that predicts 30-day churn for a subscription product and triggers retention offers for users with scores above 0.60. The model was retrained last week after adding dozens of behavioral features from recent product usage logs, and the training team is excited because offline training performance improved sharply. However, when you evaluated the same model on validation and untouched holdout data, performance dropped meaningfully, and finance is worried the retention budget is being spent on the wrong users. You need to assess whether the model is overfitting and decide what to do before rollout.

Performance Data

Metric	Training	Validation	Holdout Test
Accuracy	0.94	0.81	0.79
Precision	0.91	0.68	0.65
Recall	0.88	0.59	0.56
F1 Score	0.89	0.63	0.60
AUC-ROC	0.97	0.76	0.74
Log Loss	0.18	0.49	0.53
Positive prediction rate	14%	11%	10%
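
One way to read the table is to compute the gap between training and each out-of-sample split for every metric. The sketch below is illustrative, not part of the original scenario: the values are copied from the table, while the names `metrics`, `generalization_gaps`, and `f1_from` are hypothetical helpers introduced here. It also checks that each reported F1 score agrees with the reported precision and recall, a quick test that the table is internally consistent.

```python
# Metrics copied from the Performance Data table: (training, validation, holdout).
metrics = {
    "accuracy": (0.94, 0.81, 0.79),
    "precision": (0.91, 0.68, 0.65),
    "recall": (0.88, 0.59, 0.56),
    "f1": (0.89, 0.63, 0.60),
    "auc_roc": (0.97, 0.76, 0.74),
    "log_loss": (0.18, 0.49, 0.53),
}


def generalization_gaps(table):
    """Return {metric: (train - validation, train - holdout)}, rounded to 2 places."""
    return {
        name: (round(train - val, 2), round(train - test, 2))
        for name, (train, val, test) in table.items()
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


gaps = generalization_gaps(metrics)
for name, (val_gap, test_gap) in gaps.items():
    # For log loss, lower is better, so a negative gap means out-of-sample is worse.
    print(f"{name:10s} train-val={val_gap:+.2f}  train-holdout={test_gap:+.2f}")

# Consistency check: recompute F1 from the reported precision and recall per split.
for i, split in enumerate(("training", "validation", "holdout")):
    recomputed = f1_from(metrics["precision"][i], metrics["recall"][i])
    print(f"{split:10s} reported F1={metrics['f1'][i]:.2f}  recomputed={recomputed:.2f}")
```

Training and out-of-sample scores sit far apart on every metric, while validation and holdout stay within a few hundredths of each other. That pattern is what you would expect from overfitting, not from a single unlucky split.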

Question

How would you evaluate whether this model is overfitting, and what would you recommend before deploying it to drive retention actions?

