FinShield uses a binary classification model to score card transactions for fraud and sends high-risk transactions to a manual review queue. Over the last month, risk leaders have been concerned that either too many legitimate transactions are being reviewed or too many fraudulent transactions are slipping through, and they want to know whether the current alert threshold is set too high or too low.
The model outputs a fraud probability, and the production threshold is currently 0.80. The table below compares the current threshold with two lower candidates, 0.65 and 0.50.

| Metric | Threshold = 0.80 | Threshold = 0.65 | Threshold = 0.50 |
|---|---|---|---|
| Precision | 0.92 | 0.78 | 0.61 |
| Recall | 0.41 | 0.68 | 0.84 |
| F1 Score | 0.57 | 0.73 | 0.71 |
| False Positive Rate | 0.003 | 0.009 | 0.021 |
| Alerts/day | 1,900 | 4,600 | 8,900 |
| True fraud caught/day | 820 | 1,360 | 1,680 |
| Missed fraud/day | 1,180 | 640 | 320 |
| Review capacity/day | 5,000 | 5,000 | 5,000 |

You need to determine whether the current threshold of 0.80 is too conservative or too permissive, quantify the tradeoff, and recommend a better operating point.
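
One way to quantify the tradeoff is to work directly from the figures in the table. The sketch below is illustrative only: the `OperatingPoint` class, the helper names, and the selection rule (maximize recall among thresholds whose alert volume fits within review capacity) are assumptions made for this example, not a method prescribed by FinShield.

```python
from dataclasses import dataclass

# Values copied from the table; the class and field names are hypothetical.
@dataclass(frozen=True)
class OperatingPoint:
    threshold: float
    precision: float
    recall: float
    alerts_per_day: int
    fraud_caught_per_day: int
    fraud_missed_per_day: int

REVIEW_CAPACITY_PER_DAY = 5_000

POINTS = [
    OperatingPoint(0.80, 0.92, 0.41, 1_900, 820, 1_180),
    OperatingPoint(0.65, 0.78, 0.68, 4_600, 1_360, 640),
    OperatingPoint(0.50, 0.61, 0.84, 8_900, 1_680, 320),
]

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def recall_from_counts(p: OperatingPoint) -> float:
    """Recall recomputed from the daily caught and missed fraud counts."""
    return p.fraud_caught_per_day / (p.fraud_caught_per_day + p.fraud_missed_per_day)

def within_capacity(p: OperatingPoint) -> bool:
    """True if the daily alert volume fits the manual review capacity."""
    return p.alerts_per_day <= REVIEW_CAPACITY_PER_DAY

def alerts_per_extra_catch(current: OperatingPoint, candidate: OperatingPoint) -> float:
    """Additional alerts reviewed per additional fraud case caught."""
    extra_alerts = candidate.alerts_per_day - current.alerts_per_day
    extra_caught = candidate.fraud_caught_per_day - current.fraud_caught_per_day
    return extra_alerts / extra_caught

def best_feasible(points: list[OperatingPoint]) -> OperatingPoint:
    """Assumed rule: highest recall among points that fit review capacity."""
    feasible = [p for p in points if within_capacity(p)]
    return max(feasible, key=lambda p: p.recall)

if __name__ == "__main__":
    for p in POINTS:
        print(
            f"threshold={p.threshold:.2f} "
            f"F1={f1(p.precision, p.recall):.2f} "
            f"recall_from_counts={recall_from_counts(p):.2f} "
            f"within_capacity={within_capacity(p)}"
        )
    current, mid, low = POINTS
    print("0.80 -> 0.65:", alerts_per_extra_catch(current, mid), "alerts per extra catch")
    print("0.65 -> 0.50:", alerts_per_extra_catch(mid, low), "alerts per extra catch")
    print("selected threshold:", best_feasible(POINTS).threshold)
```

Under these assumptions the sketch selects 0.65. Moving from 0.80 to 0.65 catches 540 more fraud cases per day for 2,700 more alerts (5 alerts per extra catch) and stays within the 5,000-review capacity, while moving on to 0.50 costs about 13 alerts per extra catch and needs 8,900 reviews per day. The sketch takes the reported precision as given; dividing true fraud caught by alerts yields roughly 0.43, 0.30, and 0.19, which does not match the reported values, so those rows may be measured on different populations and are worth rechecking before relying on either figure.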