Context
StreamBox uses a gradient-boosted classifier to predict which subscribers are likely to cancel within 30 days so the retention team can send them targeted offers. The team suspects the model suffers from high variance, because offline training performance is much stronger than validation and test performance.
Current Performance
| Metric | Training | Validation | Test |
|---|---|---|---|
| Accuracy | 0.94 | 0.81 | 0.80 |
| Precision | 0.91 | 0.68 | 0.66 |
| Recall | 0.88 | 0.57 | 0.55 |
| F1 Score | 0.89 | 0.62 | 0.60 |
| AUC-ROC | 0.97 | 0.76 | 0.74 |
| Log Loss | 0.18 | 0.49 | 0.53 |
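The gap itself is easy to reproduce from held-out predictions. Below is a minimal sketch using scikit-learn, assuming a fitted model `clf` and feature/label arrays `X_train`, `y_train`, `X_val`, `y_val` (all hypothetical names, not from StreamBox's actual pipeline):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, log_loss)

def report(name, model, X, y):
    """Print the same metric panel as the table above for one data split."""
    proba = model.predict_proba(X)[:, 1]      # predicted churn probability
    pred = (proba >= 0.5).astype(int)         # default 0.5 decision threshold
    print(f"{name}: acc={accuracy_score(y, pred):.2f} "
          f"prec={precision_score(y, pred):.2f} "
          f"rec={recall_score(y, pred):.2f} "
          f"f1={f1_score(y, pred):.2f} "
          f"auc={roc_auc_score(y, proba):.2f} "
          f"logloss={log_loss(y, proba):.2f}")

report("train", clf, X_train, y_train)
report("val",   clf, X_val,   y_val)
```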
Additional details:
- Training set: 1.2M users, churn rate 14%
- Validation set: 150K users, churn rate 15%
- Test set: 150K users, churn rate 15%
- Model: 800 trees, max depth 12, min samples per leaf 2 (a lower-variance configuration sketch follows this list)
- Features: 220 behavioral, billing, and support features
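Depth-12 trees with a minimum of 2 samples per leaf give the ensemble enormous capacity, which on its own could explain the variance. One illustrative lower-variance configuration, sketched with scikit-learn's `HistGradientBoostingClassifier` rather than whatever library StreamBox actually uses; every hyperparameter value here is an assumption to be tuned, not a recommendation:

```python
from sklearn.ensemble import HistGradientBoostingClassifier

# Illustrative variance-reducing settings: shallower trees, larger leaves,
# L2 shrinkage on leaf values, and early stopping on a held-out fraction.
clf = HistGradientBoostingClassifier(
    max_depth=5,              # down from 12
    min_samples_leaf=200,     # up from 2, so leaves must generalize
    l2_regularization=1.0,    # penalize large leaf values
    learning_rate=0.05,
    max_iter=2000,            # upper bound; early stopping picks fewer trees
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=50,
    random_state=0,
)
clf.fit(X_train, y_train)     # names carried over from the sketch above
```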
The Problem
The VP of Growth wants to know whether the issue is truly overfitting, how to confirm it systematically, and what changes should be made before the next model release.
Requirements
- Explain whether the metric pattern is consistent with high variance and why.
- Describe a step-by-step diagnostic plan to confirm the root cause (a learning-curve sketch follows this list).
- Identify which additional slices, plots, or validation checks you would run.
- Recommend specific model or data changes to reduce variance.
- Explain how you would measure whether the revised model is actually better.
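For the diagnostic plan, learning curves are the standard confirmation of high variance: if the validation score keeps improving as training size grows while a wide train/validation gap persists, the model is variance-limited. A sketch with scikit-learn's `learning_curve`, reusing the hypothetical names from above (on 1.2M rows this may need subsampling to fit the weekly pipeline):

```python
import numpy as np
from sklearn.model_selection import learning_curve

# If validation log loss keeps falling as training size grows while the
# train/validation gap stays wide, more data and/or regularization should
# help: the classic signature of a variance-limited model.
sizes, train_scores, val_scores = learning_curve(
    clf, X_train, y_train,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=3,
    scoring="neg_log_loss",
    n_jobs=-1,
)
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:>9,d}  train_logloss={-tr:.3f}  val_logloss={-va:.3f}")
```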
Constraints
- The retention team can contact at most 25,000 users per week (see the precision-at-k sketch after this list).
- False positives incur discount costs and annoy customers.
- Retraining must fit within a weekly batch pipeline.
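Because only 25,000 users can be contacted per week, ranking quality at the top of the score distribution matters more than global, threshold-based metrics. A precision-at-k sketch, again reusing the hypothetical names from the earlier sketches:

```python
import numpy as np

def precision_at_k(y_true, scores, k=25_000):
    """Fraction of true churners among the k highest-scoring users --
    the metric that matches a fixed weekly contact budget."""
    top_k = np.argsort(scores)[::-1][:k]          # indices of top-k scores
    return float(np.mean(np.asarray(y_true)[top_k]))

val_proba = clf.predict_proba(X_val)[:, 1]
print(f"precision@25k = {precision_at_k(y_val, val_proba):.3f}")
```

Comparing the current and revised models on this metric, alongside calibration and expected discount cost, answers whether the new release is actually better under the contact budget rather than merely better on global accuracy.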