StreamWave uses a binary classification model to predict whether a subscriber will churn in the next 30 days so the retention team can send discount offers. A logistic regression model was recently deployed, but leadership is concerned that the model looks strong on overall accuracy while still missing too many churners.

| Metric | Current Model (Validation Set) | Previous Baseline | Change |
|---|---|---|---|
| Accuracy | 0.91 | 0.88 | +0.03 |
| Precision | 0.68 | 0.55 | +0.13 |
| Recall | 0.42 | 0.61 | -0.19 |
| F1 Score | 0.52 | 0.58 | -0.06 |
| AUC-ROC | 0.84 | 0.79 | +0.05 |
| Churn rate | 0.12 | 0.12 | 0.00 |
| Customers flagged for retention | 7,400 | 11,900 | -4,500 |
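
The summary metrics in the table can be cross-checked by backing out the confusion-matrix counts they imply. The sketch below does this for the current model. It assumes that the flagged count, precision, recall, and churn rate were all measured on the same set of customers, which the table does not state outright; the function name `implied_counts` is hypothetical, and the results are approximate because the inputs are rounded to two decimals.

```python
def implied_counts(precision, recall, flagged, churn_rate):
    """Back out approximate confusion-matrix counts from summary metrics.

    Assumes all four inputs describe the same evaluation set.
    """
    tp = precision * flagged       # flagged customers who really churned
    fp = flagged - tp              # flagged customers who would have stayed
    churners = tp / recall         # all customers who actually churned
    fn = churners - tp             # churners the model failed to flag
    total = churners / churn_rate  # implied size of the evaluation set
    tn = total - tp - fp - fn      # non-churners correctly left alone
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn, "total": total}


# Values copied from the current-model column of the table.
current = implied_counts(precision=0.68, recall=0.42, flagged=7_400, churn_rate=0.12)

# Recompute the headline metrics from the implied counts as a sanity check.
accuracy = (current["tp"] + current["tn"]) / current["total"]
f1 = 2 * 0.68 * 0.42 / (0.68 + 0.42)

print({name: round(value) for name, value in current.items()})
print(f"implied accuracy={accuracy:.2f}, F1={f1:.2f}")
```

Under these assumptions the implied accuracy and F1 round to the 0.91 and 0.52 reported in the table, and the implied number of missed churners is roughly 6,900, compared with about 5,000 who were correctly flagged.
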
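
Compared with the previous baseline, the model is more selective: it flags 4,500 fewer customers and its precision is higher, so fewer retention offers go to customers who would have stayed. However, recall fell from 0.61 to 0.42, which means the model now misses about 58% of the customers who actually churn. The retention team wants to know whether this model is truly better and how to evaluate it beyond a single metric.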
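
One way to look beyond a single metric is to put accuracy in context and to score the two models with a measure that weights recall more heavily. The sketch below compares the reported accuracy with that of a trivial "never flag anyone" rule, then computes the F-beta score for both models. The choice of `beta=2.0` (recall counted as more important than precision) is an illustrative assumption, not something the retention team has specified, and the helper `f_beta` is written out by hand rather than taken from a library.

```python
def f_beta(precision, recall, beta):
    """F-beta score: beta > 1 weights recall more heavily than precision."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values copied from the table.
churn_rate = 0.12
models = {
    "current": {"precision": 0.68, "recall": 0.42},
    "baseline": {"precision": 0.55, "recall": 0.61},
}

# Accuracy of a rule that predicts "no churn" for every customer.
majority_accuracy = 1 - churn_rate

# beta=2.0 is an assumed weighting; beta=1.0 reproduces the ordinary F1 score.
f2 = {name: f_beta(m["precision"], m["recall"], beta=2.0) for name, m in models.items()}
f1 = {name: f_beta(m["precision"], m["recall"], beta=1.0) for name, m in models.items()}

print(f"majority-class accuracy: {majority_accuracy:.2f}")
for name in models:
    print(f"{name}: F1={f1[name]:.2f}, F2={f2[name]:.2f}")
```

With a 12% churn rate, the trivial rule already reaches 0.88 accuracy, so the current model's 0.91 is a much smaller gain than it first appears. Both F1 and the recall-weighted F2 rank the previous baseline above the current model, which supports leadership's concern. Which weighting is appropriate depends on how the cost of a missed churner compares with the cost of an unnecessary discount.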