ShopLens is building a binary classification model to predict whether a user will purchase within 7 days of viewing a product. A gradient boosting model was trained on 1.2M sessions from January to June 2024 and is being considered for deployment to drive retargeting spend. The team is concerned that the strong validation results may not hold on truly unseen traffic.

| Metric | Train | 5-Fold CV Mean | Holdout Test (Jul 2024) | Recent Out-of-Time Test (Aug 2024) |
|---|---|---|---|---|
| Accuracy | 0.91 | 0.86 | 0.84 | 0.79 |
| Precision | 0.74 | 0.66 | 0.61 | 0.52 |
| Recall | 0.69 | 0.58 | 0.54 | 0.41 |
| F1 Score | 0.71 | 0.62 | 0.57 | 0.46 |
| AUC-ROC | 0.93 | 0.85 | 0.82 | 0.76 |
| Positive Rate | 0.18 | 0.18 | 0.16 | 0.11 |
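
As a quick sanity check on the table, the sketch below recomputes F1 from the reported precision and recall and compares each accuracy figure with a majority-class baseline. The helper names (`f1_from`, `majority_baseline_accuracy`) are illustrative, and the baseline rests on an assumption the table does not state: that the "Positive Rate" row is the observed share of purchasers in each split rather than the share of sessions the model flags.

```python
# Values copied from the table; keys are shorthand labels for the four columns.
splits = {
    "train":       {"accuracy": 0.91, "precision": 0.74, "recall": 0.69, "f1": 0.71, "positive_rate": 0.18},
    "cv_mean":     {"accuracy": 0.86, "precision": 0.66, "recall": 0.58, "f1": 0.62, "positive_rate": 0.18},
    "holdout_jul": {"accuracy": 0.84, "precision": 0.61, "recall": 0.54, "f1": 0.57, "positive_rate": 0.16},
    "oot_aug":     {"accuracy": 0.79, "precision": 0.52, "recall": 0.41, "f1": 0.46, "positive_rate": 0.11},
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def majority_baseline_accuracy(positive_rate):
    """Accuracy of always predicting the more common class.

    Assumption: positive_rate is the true prevalence of purchasers in the split.
    """
    return max(positive_rate, 1 - positive_rate)


for name, m in splits.items():
    recomputed_f1 = round(f1_from(m["precision"], m["recall"]), 2)
    baseline = round(majority_baseline_accuracy(m["positive_rate"]), 2)
    print(
        f"{name:12s} F1 reported={m['f1']:.2f} recomputed={recomputed_f1:.2f} | "
        f"accuracy={m['accuracy']:.2f} majority baseline={baseline:.2f}"
    )
```

The recomputed F1 values agree with the table to two decimals. Under the prevalence assumption, always predicting "no purchase" would score 0.82, 0.82, 0.84, and 0.89 on the four splits, so the August accuracy of 0.79 sits below that baseline. This is one reason to weight precision, recall, and AUC-ROC more heavily than accuracy here.
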
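
Performance degrades at every step from train to cross-validation to the July holdout to the August out-of-time test: AUC-ROC falls from 0.93 to 0.85, 0.82, and 0.76, and recall falls from 0.69 to 0.41. The positive rate also drops, from 0.18 in training to 0.11 in August, which may indicate that the traffic itself is changing. Marketing wants to launch the model next month, but the data science lead wants evidence that it will generalize to unseen data and remain reliable under changing traffic patterns.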
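
To make the size of that degradation concrete, the following sketch computes the drop in each metric between consecutive evaluation stages and confirms that every metric declines at every step. It uses only the figures from the table; the function names and stage labels are illustrative, not part of any ShopLens tooling.

```python
# Metric values in stage order: train, 5-fold CV mean, Jul holdout, Aug out-of-time.
stages = ["train", "cv_mean", "holdout_jul", "oot_aug"]
metrics = {
    "accuracy":  [0.91, 0.86, 0.84, 0.79],
    "precision": [0.74, 0.66, 0.61, 0.52],
    "recall":    [0.69, 0.58, 0.54, 0.41],
    "f1":        [0.71, 0.62, 0.57, 0.46],
    "auc_roc":   [0.93, 0.85, 0.82, 0.76],
}


def stage_drops(values):
    """Absolute drop between each stage and the next, rounded to 2 decimals."""
    return [round(a - b, 2) for a, b in zip(values, values[1:])]


def strictly_decreasing(values):
    """True if every stage scores lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def total_drop(values):
    """Absolute drop from the first stage (train) to the last (Aug out-of-time)."""
    return round(values[0] - values[-1], 2)


for name, values in metrics.items():
    print(
        f"{name:10s} drops={stage_drops(values)} "
        f"total={total_drop(values):.2f} monotone={strictly_decreasing(values)}"
    )
```

On these numbers, recall loses the most between training and August (0.28) and accuracy the least (0.12). The train-to-CV step reflects the usual fitting gap, while the CV-to-holdout and holdout-to-August steps are the ones that speak to behavior on later, unseen traffic.
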