HelpHive uses a multiclass text classifier to route incoming customer support tickets into four queues: Billing, Technical, Shipping, and Account Access. The team launched the model to reduce manual triage time, but operations leaders noticed that some urgent Account Access tickets are still being misrouted, creating customer delays.

| Metric | Overall Value |
|---|---|
| Accuracy | 0.86 |
| Macro Precision | 0.78 |
| Macro Recall | 0.69 |
| Macro F1 | 0.72 |
| Weighted F1 | 0.84 |
| Avg. confidence on wrong predictions | 0.81 |

| Class | Support Share | Precision | Recall | F1 |
|---|---|---|---|---|
| Billing | 45% | 0.91 | 0.94 | 0.92 |
| Technical | 30% | 0.84 | 0.80 | 0.82 |
| Shipping | 15% | 0.76 | 0.71 | 0.73 |
| Account Access | 10% | 0.60 | 0.31 | 0.41 |

The VP of Support is concerned that overall accuracy looks strong, but the model may be failing on the most operationally sensitive class. You need to determine which metrics best reflect system quality and whether the model is acceptable for production use.
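
To make the gap between the aggregate and per-class numbers concrete, the sketch below recomputes the macro and support-weighted averages from the per-class table and estimates how many tickets of a class are routed elsewhere. It is an illustration only: the names `per_class`, `macro_average`, `weighted_average`, and `missed_per_1000` are hypothetical, and it assumes the rounded support shares in the table are the true class proportions.

```python
# Per-class figures copied from the table: (support share, precision, recall, F1).
# Assumption: the rounded support shares are treated as exact class proportions.
per_class = {
    "Billing":        (0.45, 0.91, 0.94, 0.92),
    "Technical":      (0.30, 0.84, 0.80, 0.82),
    "Shipping":       (0.15, 0.76, 0.71, 0.73),
    "Account Access": (0.10, 0.60, 0.31, 0.41),
}

def macro_average(column):
    """Unweighted mean of one metric column, so every class counts equally."""
    values = [row[column] for row in per_class.values()]
    return sum(values) / len(values)

def weighted_average(column):
    """Mean of one metric column weighted by each class's support share."""
    return sum(row[0] * row[column] for row in per_class.values())

def missed_per_1000(class_name, tickets=1000):
    """Expected tickets of this class sent to another queue: share * (1 - recall)."""
    share, _, recall, _ = per_class[class_name]
    return tickets * share * (1 - recall)

macro_precision = macro_average(1)
macro_recall = macro_average(2)
macro_f1 = macro_average(3)
weighted_f1 = weighted_average(3)

print(f"macro precision: {macro_precision:.4f}")
print(f"macro recall:    {macro_recall:.4f}")
print(f"macro F1:        {macro_f1:.4f}")
print(f"weighted F1:     {weighted_f1:.4f}")
print(f"Account Access tickets misrouted per 1,000: {missed_per_1000('Account Access'):.0f}")
```

Under these assumptions, the macro averages agree with the reported Macro Precision, Macro Recall, and Macro F1, and roughly 69 of every 100 Account Access tickets (per 1,000 incoming) land in the wrong queue. The support-weighted F1 computed this way comes out near 0.81 rather than the reported 0.84, which may reflect rounding in the table or a different evaluation split. The weighted figure is dominated by Billing and Technical, which is why it stays high while Account Access recall is 0.31.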