ShieldPay runs a card-not-present fraud detection model that scores transactions in real time and sends high-risk cases to a manual review queue. A large enterprise customer says the platform is "missing the fraud we expected it to catch," even though the model still looks strong on aggregate ranking metrics.

| Metric | Last Quarter | Current | Change |
|---|---|---|---|
| Precision | 0.76 | 0.88 | +0.12 |
| Recall | 0.81 | 0.58 | -0.23 |
| F1 Score | 0.78 | 0.70 | -0.08 |
| AUC-ROC | 0.93 | 0.92 | -0.01 |
| PR-AUC | 0.41 | 0.36 | -0.05 |
| Fraud review rate | 1.9% | 1.1% | -0.8 pts |
| Monthly fraud loss at customer | $420K | $690K | +64% |

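
As a quick sanity check on the table, the sketch below recomputes F1 from the reported precision and recall and turns recall into the share of fraud that goes unflagged. The figures are copied from the table; the dictionary and function names are illustrative, not part of any ShieldPay code.

```python
# Figures copied from the metrics table; names here are illustrative only.
last_quarter = {"precision": 0.76, "recall": 0.81}
current = {"precision": 0.88, "recall": 0.58}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def miss_rate(recall: float) -> float:
    """Share of actual fraud that is not flagged (false negative rate)."""
    return 1.0 - recall


for name, m in (("last quarter", last_quarter), ("current", current)):
    print(f"{name}: F1={f1(**m):.2f}, missed fraud={miss_rate(m['recall']):.0%}")
# last quarter: F1=0.78, missed fraud=19%
# current: F1=0.70, missed fraud=42%
```
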
Six weeks ago, the model's decision threshold was raised from 0.42 to 0.63 to reduce analyst workload. The customer’s fraud base rate also rose from 0.35% to 0.52% after the customer expanded into cross-border transactions.
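
To separate the effect of the threshold change from the effect of the base-rate shift, one option is to back out the implied operating point (true and false positive rates) from precision, recall, and prevalence. The sketch below does this under a strong assumption that the source does not confirm: that the table's precision and recall and the customer's base rates describe the same population. The function names are hypothetical.

```python
def implied_fpr(precision: float, recall: float, base_rate: float) -> float:
    """Back out the false positive rate from precision, recall and prevalence."""
    true_pos = recall * base_rate                        # flagged fraud per transaction
    false_pos = true_pos * (1 - precision) / precision   # flagged legitimate per transaction
    return false_pos / (1 - base_rate)


def precision_at(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision implied by a fixed operating point (TPR, FPR) at a given prevalence."""
    true_pos = tpr * base_rate
    false_pos = fpr * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


def flag_rate(tpr: float, fpr: float, base_rate: float) -> float:
    """Share of all transactions scored above the threshold."""
    return tpr * base_rate + fpr * (1 - base_rate)


# ASSUMPTION: table metrics and customer base rates refer to the same traffic.
old = {"precision": 0.76, "recall": 0.81, "base_rate": 0.0035}
new = {"precision": 0.88, "recall": 0.58, "base_rate": 0.0052}

old_fpr = implied_fpr(**old)
new_fpr = implied_fpr(**new)

# Counterfactual: keep the old operating point, apply only the new base rate.
counterfactual_precision = precision_at(old["recall"], old_fpr, new["base_rate"])

print(f"implied FPR: {old_fpr:.4%} -> {new_fpr:.4%}")
print(f"precision at old operating point, new base rate: {counterfactual_precision:.2f}")
print(f"implied flag rate: {flag_rate(old['recall'], old_fpr, old['base_rate']):.2%}"
      f" -> {flag_rate(new['recall'], new_fpr, new['base_rate']):.2%}")
```

Under these assumptions the implied false positive rate falls from roughly 0.09% to roughly 0.04% while recall falls from 0.81 to 0.58, which is consistent with the operating point sliding along the ROC curve toward the origin; that kind of move changes recall without changing AUC-ROC. The counterfactual precision of about 0.82 suggests that part of the precision gain may come from the higher base rate rather than from the threshold. The implied flag rates (about 0.37% and 0.34%) do not match the reported fraud review rates, which hints that the table and the base rates may describe different populations and is worth confirming before relying on these numbers.
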
You need to determine whether the issue is caused by the threshold change, calibration drift, segment-specific underperformance, or a broader model quality problem. The customer wants a clear explanation of why fewer fraud cases are being caught even though AUC-ROC is nearly unchanged.
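
A minimal sketch of two of the diagnostics named above: recall per segment at the old and new thresholds, and a binned comparison of predicted score against observed fraud rate as a rough calibration check. The row format `(score, is_fraud, segment)`, the segment labels, and the toy rows are all hypothetical; only the thresholds 0.42 and 0.63 come from the case.

```python
from collections import defaultdict


def recall_by_segment(rows, threshold):
    """rows: iterable of (score, is_fraud, segment). Returns {segment: recall}."""
    caught = defaultdict(int)
    total = defaultdict(int)
    for score, is_fraud, segment in rows:
        if is_fraud:
            total[segment] += 1
            if score >= threshold:
                caught[segment] += 1
    return {seg: caught[seg] / total[seg] for seg in total}


def calibration_table(rows, n_bins=5):
    """Per score bin: (mean predicted score, observed fraud rate, row count)."""
    bins = defaultdict(list)
    for score, is_fraud, _segment in rows:
        idx = min(int(score * n_bins), n_bins - 1)  # clamp score == 1.0 into last bin
        bins[idx].append((score, is_fraud))
    table = {}
    for idx in sorted(bins):
        items = bins[idx]
        mean_score = sum(s for s, _ in items) / len(items)
        fraud_rate = sum(y for _, y in items) / len(items)
        table[idx] = (mean_score, fraud_rate, len(items))
    return table


# HYPOTHETICAL toy rows for illustration only; not ShieldPay data.
toy_rows = [
    (0.70, 1, "domestic"), (0.66, 1, "domestic"), (0.50, 1, "domestic"),
    (0.10, 0, "domestic"), (0.45, 0, "domestic"),
    (0.55, 1, "cross_border"), (0.48, 1, "cross_border"), (0.65, 1, "cross_border"),
    (0.30, 0, "cross_border"),
]

for threshold in (0.42, 0.63):  # old and new thresholds from the case
    print(threshold, recall_by_segment(toy_rows, threshold))
print(calibration_table(toy_rows))
```

On real scored transactions, a recall drop that is similar across segments would point toward the threshold change, a drop concentrated in cross-border traffic would point toward segment-specific underperformance, and bins where the observed fraud rate sits well above the mean score would point toward calibration drift.
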