BrightCart uses a binary classification model to predict which users are likely to respond to a paid email promotion for a premium subscription upgrade. The marketing team is encouraged by the model's AUC-ROC, but campaign ROI has been inconsistent because the final audience is selected using a fixed score threshold.
| Metric (validation set) | New Model | Previous Model |
|---|---|---|
| AUC-ROC | 0.84 | 0.76 |
| Precision @ threshold 0.50 | 0.22 | 0.19 |
| Recall @ threshold 0.50 | 0.61 | 0.54 |
| F1 Score @ threshold 0.50 | 0.32 | 0.28 |
| Log Loss | 0.49 | 0.58 |
| Conversion rate in population | 0.08 | 0.08 |
| Top-decile response rate | 0.24 | 0.18 |
| Expected campaign profit / 100k users | $18,000 | $11,000 |
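To make the table's threshold-dependent metrics concrete, here is a minimal sketch of how precision, recall, top-decile response rate, and campaign profit can be computed from model scores. Everything in it is hypothetical: the synthetic scores, the $5-per-conversion revenue, and the $0.10-per-email cost are illustrative assumptions, not BrightCart's actual numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scored audience with an ~8% base conversion rate, as in the table.
n = 100_000
y = rng.random(n) < 0.08                                  # true responders
# Toy scores: responders tend to score higher (illustrative only).
scores = np.clip(rng.normal(0.30 + 0.25 * y, 0.15), 0.0, 1.0)

def threshold_metrics(scores, y, t):
    """Precision and recall for the audience selected at score >= t."""
    selected = scores >= t
    tp = np.sum(selected & y)
    precision = tp / max(selected.sum(), 1)
    recall = tp / y.sum()
    return precision, recall

def top_decile_response(scores, y):
    """Response rate among the top 10% of users ranked by score."""
    k = len(scores) // 10
    top = np.argsort(scores)[::-1][:k]
    return y[top].mean()

prec, rec = threshold_metrics(scores, y, 0.50)
decile = top_decile_response(scores, y)

# Hypothetical unit economics: $5 revenue per conversion, $0.10 per email sent.
revenue, cost = 5.0, 0.10
selected = scores >= 0.50
profit = revenue * np.sum(selected & y) - cost * selected.sum()
print(f"precision={prec:.2f} recall={rec:.2f} "
      f"top-decile={decile:.2f} profit=${profit:,.0f}")
```

Note that precision, recall, and profit all move as the threshold moves, while the top-decile response rate depends only on ranking; that distinction is at the heart of the VP's question below.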
The VP of Marketing asks whether an AUC-ROC of 0.84 means the model is "good enough" for rollout across all campaigns. The team wants to understand what this score actually says about ranking quality, what it does not guarantee, and whether threshold or calibration issues could still hurt business outcomes.
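One way to answer the VP: AUC-ROC measures ranking quality only. It is invariant to any strictly monotone transform of the scores, so two models with identical AUC can produce wildly different audiences at a fixed 0.50 cutoff if their calibration differs. The sketch below demonstrates this on synthetic data (the data and the rank-based AUC helper are illustrative assumptions, not BrightCart's pipeline):

```python
import numpy as np

def auc(scores, y):
    """Rank-based AUC-ROC: probability a random positive outranks a random negative."""
    order = np.argsort(scores, kind="stable")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(1)
y = rng.random(5_000) < 0.08                      # ~8% responders, as in the table
z = rng.normal(-1.0 + 2.0 * y, 1.0)               # responders get higher raw scores
scores = 1.0 / (1.0 + np.exp(-z))                 # map to (0, 1) "probabilities"

# A strictly monotone transform preserves the ranking, hence the AUC,
# but destroys calibration: every score shrinks toward zero.
squashed = scores ** 4

auc_orig, auc_squash = auc(scores, y), auc(squashed, y)
n_orig = int((scores >= 0.50).sum())              # audience at fixed t = 0.50
n_squash = int((squashed >= 0.50).sum())          # same AUC, far smaller audience
print(f"AUC: {auc_orig:.3f} vs {auc_squash:.3f}; "
      f"audience at 0.50: {n_orig} vs {n_squash}")
```

Both score vectors rank users identically, so top-decile lift is unchanged, yet the fixed-threshold audience collapses after the transform. This is why a strong AUC guarantees neither sensible behavior at threshold 0.50 nor calibrated probabilities for profit estimates; a calibration check and a threshold chosen against unit economics are still needed before rollout.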