Design Cross-Validation Evaluation Plan

Context

ShopLens is building a binary classification model to predict whether a customer will purchase within 7 days after viewing a product. A logistic regression baseline was trained on 120,000 sessions, but the team only evaluated it on a single random train/test split and is unsure whether the reported performance is stable enough to trust.

Current Performance

Metric	Single Holdout Result
Accuracy	0.84
Precision	0.61
Recall	0.38
F1 Score	0.47
AUC-ROC	0.79
Positive Class Rate	0.14
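
As a quick sanity check on the table, the sketch below recomputes F1 from the reported precision and recall, then compares the reported accuracy with a trivial always-negative baseline. It assumes the holdout split has the same 0.14 positive rate as the table reports; the variable names are illustrative only.

```python
# Values copied from the single-holdout table.
precision = 0.61
recall = 0.38
accuracy = 0.84
positive_rate = 0.14  # assumed to hold in the test split as well

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# A model that always predicts "no purchase" is right on every negative session.
majority_baseline_accuracy = 1 - positive_rate

print(f"F1 recomputed from precision and recall: {f1:.2f}")
print(f"Always-negative baseline accuracy: {majority_baseline_accuracy:.2f}")
print(f"Model beats baseline on accuracy: {accuracy > majority_baseline_accuracy}")
```

Under that assumption the reported accuracy sits below the always-negative baseline, which is one reason accuracy alone is a weak basis for judging this model.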

The Problem

The product team wants a more reliable estimate of model performance before launch. Because the positive class is relatively rare (about 14% of sessions) and customer behavior varies by traffic source and week, a single split may give an overly optimistic or unstable estimate of performance. Explain how you would implement cross-validation to evaluate this model and how you would use the results to decide whether the model is ready to launch.

Requirements

Describe which cross-validation strategy you would use and why.
Explain how you would prevent data leakage during cross-validation.
Specify which metrics you would compute across folds and how you would summarize them.
Discuss how you would handle class imbalance when creating folds.
Explain how cross-validation results should influence model selection and threshold decisions.
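
One possible way to approach the fold-construction requirements is sketched below: a greedy, standard-library-only assignment that keeps every user inside a single fold while spreading positives roughly evenly. The function name `stratified_group_folds` and the synthetic data are hypothetical; in practice a library splitter such as scikit-learn's `StratifiedGroupKFold` covers the same idea, and any preprocessing should be fit on the training folds only to avoid leakage.

```python
import random
from collections import defaultdict


def stratified_group_folds(user_ids, labels, n_folds=5, seed=0):
    """Return a fold index per session; no user spans two folds (greedy heuristic)."""
    # Count sessions and positives per user.
    per_user = defaultdict(lambda: [0, 0])  # user -> [n_sessions, n_positives]
    for user, label in zip(user_ids, labels):
        per_user[user][0] += 1
        per_user[user][1] += label

    users = list(per_user)
    random.Random(seed).shuffle(users)
    # Place users with the most positives first; the stable sort keeps the
    # shuffled order among users with equal positive counts.
    users.sort(key=lambda u: per_user[u][1], reverse=True)

    fold_size = [0] * n_folds
    fold_pos = [0] * n_folds
    user_fold = {}
    for user in users:
        n_sessions, n_pos = per_user[user]
        if n_pos > 0:
            # Balance positives first, then fold size.
            best = min(range(n_folds), key=lambda f: (fold_pos[f], fold_size[f]))
        else:
            # Users without positives only need to balance fold size.
            best = min(range(n_folds), key=lambda f: fold_size[f])
        user_fold[user] = best
        fold_size[best] += n_sessions
        fold_pos[best] += n_pos
    return [user_fold[u] for u in user_ids]


# Synthetic sessions for illustration only: 2,000 users with 1 to 5 sessions
# each and a positive rate near 0.14. These are not ShopLens data.
rng = random.Random(42)
user_ids, labels = [], []
for user in range(2000):
    for _ in range(rng.randint(1, 5)):
        user_ids.append(user)
        labels.append(1 if rng.random() < 0.14 else 0)

folds = stratified_group_folds(user_ids, labels, n_folds=5)

overall_rate = sum(labels) / len(labels)
fold_rates = []
for f in range(5):
    fold_labels = [y for y, k in zip(labels, folds) if k == f]
    fold_rates.append(sum(fold_labels) / len(fold_labels))
    print(f"fold {f}: {len(fold_labels)} sessions, positive rate {fold_rates[-1]:.3f}")

# Record which folds each user appears in, to confirm the grouping constraint.
folds_per_user = defaultdict(set)
for user, k in zip(user_ids, folds):
    folds_per_user[user].add(k)
```

Each fold can then serve once as the validation set while the model is trained on the remaining folds, so every per-fold metric comes from users the model never saw during training.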

Constraints

Model training must finish within 2 hours.
Sessions from the same user should not appear in both train and validation folds.
The business prefers stable recall over small gains in accuracy.
The team may later compare logistic regression with gradient boosting using the same evaluation framework.
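
The preference for stable recall can be turned into a simple comparison rule. The sketch below summarizes per-fold recall for two candidates and prefers the one with the higher "mean minus one standard deviation". Both the rule and the per-fold numbers are made up for illustration; they are not results from the ShopLens models.

```python
from statistics import mean, stdev

# Hypothetical per-fold recall values, invented purely to illustrate the rule.
fold_recall = {
    "logistic_regression": [0.36, 0.39, 0.37, 0.38, 0.40],
    "gradient_boosting": [0.30, 0.48, 0.35, 0.46, 0.36],
}


def summarize(scores):
    """Summarize one metric across folds: mean, spread, worst fold, pessimistic bound."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation across folds
    return {"mean": m, "std": s, "min": min(scores), "lower": m - s}


summaries = {name: summarize(scores) for name, scores in fold_recall.items()}

# Assumed decision rule: prefer the candidate whose pessimistic recall
# (mean minus one standard deviation) is highest.
preferred = max(summaries, key=lambda name: summaries[name]["lower"])

for name, s in summaries.items():
    print(f"{name}: mean={s['mean']:.3f} std={s['std']:.3f} "
          f"min={s['min']:.2f} lower={s['lower']:.3f}")
print("preferred:", preferred)
```

With these invented numbers the candidate with the slightly higher mean recall loses because its recall swings more from fold to fold. The same summary can be reused for any later model, provided every candidate is evaluated on identical folds.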
