Diagnose Overfitting vs Underfitting

Context

Data Society is evaluating a binary classification model that predicts whether a learner will complete a course within 30 days. A junior team member trained a gradient boosted tree model and reported strong training performance, but stakeholders are concerned that the model may not generalize well in production.

Current Performance

Metric	Training Set	Validation Set	Test Set
Accuracy	0.96	0.81	0.80
Precision	0.95	0.78	0.77
Recall	0.94	0.69	0.68
F1 Score	0.95	0.73	0.72
AUC-ROC	0.98	0.84	0.83
Log Loss	0.11	0.46	0.49

Model variant B (simpler logistic regression baseline) achieved: training accuracy 0.78, validation accuracy 0.77, test accuracy 0.77, validation F1 0.74, and test F1 0.74.
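One way to read the numbers above is to compute the gap between splits for each metric. The sketch below is a minimal illustration in plain Python: the values are copied from the table and the baseline sentence, while the dictionary layout and the `gaps` helper are assumptions introduced here for illustration, not part of the original exercise.

```python
# Gradient boosted model metrics from the table: (train, validation, test).
gbt = {
    "accuracy": (0.96, 0.81, 0.80),
    "precision": (0.95, 0.78, 0.77),
    "recall": (0.94, 0.69, 0.68),
    "f1": (0.95, 0.73, 0.72),
    "auc_roc": (0.98, 0.84, 0.83),
    "log_loss": (0.11, 0.46, 0.49),
}

# Logistic regression baseline accuracy: (train, validation, test).
baseline_accuracy = (0.78, 0.77, 0.77)


def gaps(metrics):
    """Return (train - validation, validation - test) for each metric."""
    return {
        name: (round(train - val, 2), round(val - test, 2))
        for name, (train, val, test) in metrics.items()
    }


gbt_gaps = gaps(gbt)
baseline_gap = round(baseline_accuracy[0] - baseline_accuracy[1], 2)

for name, (train_val, val_test) in gbt_gaps.items():
    print(f"{name:10s} train-val gap {train_val:+.2f}, val-test gap {val_test:+.2f}")
print(f"baseline accuracy train-val gap {baseline_gap:+.2f}")
```

A large train-to-validation gap alongside a small validation-to-test gap is the pattern to look for when weighing overfitting against an unrepresentative validation split. For log loss, lower is better, so the gap comes out negative.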

The Problem

You need to determine whether the gradient boosted model is overfitting or underfitting, explain how the metrics support that conclusion, and recommend what Data Society should do next.

Requirements

Diagnose whether the current model is overfitting, underfitting, or neither.
Use the train/validation/test metrics to justify your conclusion.
Compare the current model to the simpler baseline and explain which model you would prefer.
Recommend specific changes to improve generalization.
Explain what additional validation checks you would run before deployment.

Constraints

The model must be explainable enough for internal review.
Retraining can happen weekly, not daily.
False negatives are more costly than false positives because missed at-risk learners reduce intervention opportunities.
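Because false negatives are the costlier error, one check a candidate might sketch is how the decision threshold trades precision for recall. The example below is a hedged illustration only: the scores and labels are made-up toy data, the function names are hypothetical, and it assumes the positive class (label 1) is the outcome that is costly to miss.

```python
def recall_at(threshold, scores, labels):
    """Recall of the positive class when predicting 1 for scores >= threshold."""
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    false_neg = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    total_pos = true_pos + false_neg
    return true_pos / total_pos if total_pos else 0.0


def highest_threshold_for_recall(scores, labels, target_recall):
    """Return the highest observed score that, used as a threshold, meets the target recall."""
    for threshold in sorted(set(scores), reverse=True):
        if recall_at(threshold, scores, labels) >= target_recall:
            return threshold
    return None  # No threshold reaches the target (e.g. no positive labels).


# Toy validation data, invented for illustration only.
scores = [0.92, 0.85, 0.71, 0.64, 0.55, 0.43, 0.38, 0.21]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

threshold = highest_threshold_for_recall(scores, labels, target_recall=0.8)
print(threshold, recall_at(threshold, scores, labels))
```

Choosing the highest threshold that still meets the recall target keeps precision as high as possible under that constraint. In practice the threshold would be chosen on validation data and confirmed on the test set, not tuned on the test set itself.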
