AcmeCloud uses a binary classification model to route incoming support tickets as urgent or non-urgent. The model, a logistic regression classifier, has been in production for 6 weeks, but support managers report that too many urgent tickets are still reaching the standard queue.

| Metric | Validation Set | Current Production Holdout |
|---|---|---|
| Accuracy | 0.91 | 0.89 |
| Precision | 0.78 | 0.81 |
| Recall | 0.74 | 0.58 |
| F1 Score | 0.76 | 0.68 |
| AUC-ROC | 0.87 | 0.84 |
| Urgent ticket rate | 18% | 17% |

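
To make the production column concrete, the following sketch converts its precision, recall, and urgent ticket rate into ticket counts. The batch size of 1,000 tickets is a hypothetical figure chosen only for readability, and the variable names are illustrative; nothing here comes from AcmeCloud's actual pipeline.

```python
# Figures copied from the "Current Production Holdout" column of the table.
precision = 0.81
recall = 0.58
urgent_rate = 0.17

# Hypothetical batch size, chosen only to make the counts easy to read.
tickets = 1000

urgent = urgent_rate * tickets   # truly urgent tickets in the batch
caught = recall * urgent         # true positives: urgent and escalated
missed = urgent - caught         # false negatives: urgent but sent to the standard queue

# False positives implied by the precision: precision = TP / (TP + FP).
false_alarms = caught * (1 - precision) / precision

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of a trivial rule that labels every ticket non-urgent.
baseline_accuracy = 1 - urgent_rate

print(f"urgent tickets:        {urgent:.1f}")
print(f"caught (escalated):    {caught:.1f}")
print(f"missed (not escalated): {missed:.1f}")
print(f"false alarms:          {false_alarms:.1f}")
print(f"F1:                    {f1:.3f}")
print(f"all-non-urgent accuracy: {baseline_accuracy:.2f}")
```

Under these assumptions, roughly 71 of every 1,000 tickets are urgent ones that land in the standard queue, and a rule that never escalates anything would still reach 0.83 accuracy. That puts the reported accuracy of 0.89 in context.
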
Leadership sees the high overall accuracy and assumes the model is performing well, but the operations team is concerned because many truly urgent tickets are not being escalated: production recall is 0.58, so roughly four in ten urgent tickets are missed. You need to assess whether the model is actually effective for the business goal and recommend how to improve both its evaluation and its performance.
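
One evaluation change worth considering is to judge the model at a decision threshold chosen for a recall target rather than at a default cutoff. The sketch below assumes the classifier outputs a score per ticket and that tickets scoring at or above a threshold are escalated; the source does not state the threshold in use. The function names, the target recall of 0.75, and the scores and labels are all made up for illustration.

```python
def recall_precision_at(threshold, scores, labels):
    """Return (recall, precision) when tickets scoring >= threshold are escalated."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def pick_threshold(scores, labels, target_recall):
    """Return (threshold, recall, precision) for the highest observed score
    that meets the recall target, or None if no threshold does."""
    for threshold in sorted(set(scores), reverse=True):
        recall, precision = recall_precision_at(threshold, scores, labels)
        if recall >= target_recall:
            return threshold, recall, precision
    return None


# Made-up scores and labels (1 = urgent, 0 = non-urgent), for illustration only.
scores = [0.95, 0.80, 0.62, 0.45, 0.40, 0.30, 0.22, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Hypothetical business target: catch at least 75% of urgent tickets.
result = pick_threshold(scores, labels, target_recall=0.75)
default = recall_precision_at(0.5, scores, labels)

print("at threshold 0.5 (recall, precision):", default)
print("chosen (threshold, recall, precision):", result)
```

Scanning from the highest score downward returns the strictest threshold that still meets the recall target, which keeps false escalations as low as the target allows. In practice the threshold would be chosen on a validation set and then confirmed on a separate holdout.
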