Detect Leakage in Feature Engineering

Context

ShopLens built a binary classification model to predict whether a customer will make a repeat purchase within 30 days of their first order. A LightGBM model looked excellent in offline validation, but performance dropped sharply after deployment. The team suspects data leakage introduced during feature engineering.

Current Performance

Metric	Offline Validation	Production Holdout	Change
Accuracy	0.91	0.74	-0.17
Precision	0.88	0.63	-0.25
Recall	0.86	0.52	-0.34
F1 Score	0.87	0.57	-0.30
AUC-ROC	0.95	0.69	-0.26
Log Loss	0.21	0.61	+0.40
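
As a quick sanity check on the table, the sketch below recomputes the Change column and verifies that each reported F1 score matches its precision and recall. The numbers are copied from the table above; the variable names are only illustrative.

```python
# Values copied from the table: (offline validation, production holdout).
metrics = {
    "Accuracy": (0.91, 0.74),
    "Precision": (0.88, 0.63),
    "Recall": (0.86, 0.52),
    "F1 Score": (0.87, 0.57),
    "AUC-ROC": (0.95, 0.69),
    "Log Loss": (0.21, 0.61),
}

# Change = production minus offline, rounded to two decimals as in the table.
changes = {name: round(prod - offline, 2) for name, (offline, prod) in metrics.items()}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_offline = round(f1(metrics["Precision"][0], metrics["Recall"][0]), 2)
f1_production = round(f1(metrics["Precision"][1], metrics["Recall"][1]), 2)

print(changes)
print(f1_offline, f1_production)
```

The table is internally consistent, so the gap reflects a real difference between the two evaluation settings rather than a reporting error.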

The feature set includes customer tenure, average basket size, email engagement, support contacts, and rolling 30-day order aggregates. During review, the team found that some aggregates may have been computed using data extending beyond the prediction timestamp.
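
To make the suspected mechanism concrete, here is a minimal sketch of a rolling 30-day aggregate that only uses orders placed strictly before the prediction timestamp. The table layout and the column names (`customer_id`, `order_ts`, `amount`, `prediction_ts`) are assumptions for illustration, not ShopLens's actual schema.

```python
import pandas as pd


def point_in_time_order_features(orders, predictions, days=30):
    """Count and sum each customer's orders in the `days` before prediction_ts.

    Orders at or after prediction_ts are excluded, so the feature never
    sees data from the label window.
    """
    merged = predictions.merge(orders, on="customer_id", how="left")
    window_start = merged["prediction_ts"] - pd.Timedelta(days=days)
    in_window = (merged["order_ts"] < merged["prediction_ts"]) & (
        merged["order_ts"] >= window_start
    )
    merged["order_in_window"] = in_window.astype(int)
    merged["amount_in_window"] = merged["amount"].where(in_window, 0.0)
    return merged.groupby(["customer_id", "prediction_ts"], as_index=False).agg(
        orders_30d=("order_in_window", "sum"),
        spend_30d=("amount_in_window", "sum"),
    )


# Hypothetical toy data: customer 1 has one order after the prediction time.
orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 1, 2],
        "order_ts": pd.to_datetime(
            ["2024-01-01", "2024-01-10", "2024-01-20", "2024-01-05"]
        ),
        "amount": [20.0, 30.0, 50.0, 10.0],
    }
)
predictions = pd.DataFrame(
    {
        "customer_id": [1, 2],
        "prediction_ts": pd.to_datetime(["2024-01-15", "2024-01-06"]),
    }
)

features = point_in_time_order_features(orders, predictions)
print(features)
```

For customer 1 the point-in-time version counts two orders and 50.0 in spend. An aggregate computed over the full order history would count three orders and 100.0, and the third order is exactly the repeat purchase the model is supposed to predict.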

The Problem

You need to determine whether the performance gap is caused by leakage, identify which features or validation steps are most suspicious, and recommend how to redesign the feature engineering and evaluation process.

Requirements

Diagnose the most likely leakage mechanisms given the metric pattern.
Identify which features and validation choices should be audited first.
Explain how you would prove leakage using targeted experiments.
Recommend changes to feature generation, train/validation splitting, and monitoring.
Discuss how you would rebuild trust in the model before redeployment.
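
Two of these requirements lend themselves to a short sketch: a time-based split that leaves a gap for the 30-day label window, and a single-feature AUC audit that can be run on both the offline and the production data. This is one possible approach under stated assumptions, not a prescribed solution; the function names and the `prediction_ts` column are hypothetical.

```python
import pandas as pd


def time_split(df, ts_col, cutoff, label_window_days=30):
    """Split by prediction time.

    Training rows must have their full label window observed before the
    cutoff; validation rows start at the cutoff. Rows in between are dropped.
    """
    cutoff = pd.Timestamp(cutoff)
    horizon = pd.Timedelta(days=label_window_days)
    train = df[df[ts_col] + horizon <= cutoff]
    valid = df[df[ts_col] >= cutoff]
    return train, valid


def single_feature_auc(feature, label):
    """AUC of one feature used alone as a score (rank-based, ties averaged)."""
    ranks = feature.rank(method="average")
    n_pos = int((label == 1).sum())
    n_neg = int((label == 0).sum())
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    rank_sum = ranks[label == 1].sum()
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data to show the mechanics.
df = pd.DataFrame(
    {
        "prediction_ts": pd.to_datetime(
            ["2024-01-01", "2024-02-15", "2024-03-01", "2024-03-20"]
        ),
        "suspect_feature": [1.0, 2.0, 3.0, 4.0],
        "label": [0, 1, 0, 1],
    }
)

train, valid = time_split(df, "prediction_ts", "2024-03-01")
auc_all = single_feature_auc(df["suspect_feature"], df["label"])
print(len(train), len(valid), auc_all)
```

A feature whose standalone AUC is very high offline but close to 0.5 when recomputed point-in-time or on production data is a strong candidate to audit first. The same comparison can be repeated after removing or recomputing one suspect feature at a time.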

Constraints

Marketing uses the model weekly for retention campaigns.
False positives waste outreach budget; false negatives miss high-value repeat buyers.
Full retraining and backfill must be completed within 10 days.
