LendWise has built a binary classification model to predict whether a small-business loan applicant will default within 12 months. The model is used to support underwriting decisions, but the risk team is concerned that stakeholders are focusing only on overall accuracy.
The current model was evaluated on a holdout set of 10,000 applications with a 12% default rate. The classification threshold is 0.50, and the model is being compared against the previous scorecard used in production.

| Metric | Previous Scorecard | Current Model |
|---|---:|---:|
| Accuracy | 0.84 | 0.89 |
| Precision | 0.58 | 0.74 |
| Recall | 0.42 | 0.50 |
| F1 Score | 0.49 | 0.60 |
| AUC-ROC | 0.76 | 0.85 |
| Log Loss | 0.41 | 0.29 |
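
One way to put the accuracy figures in context is to compare them with a trivial baseline. The sketch below uses only the 12% default rate and the table values; the variable and function names are illustrative and not part of the case. It computes the accuracy of a classifier that predicts "no default" for every applicant, and it checks that the reported F1 scores are consistent with the reported precision and recall.

```python
# Values taken from the holdout description and the metrics table.
default_rate = 0.12

# A trivial classifier that predicts "no default" for everyone is correct
# for every applicant who does not default.
baseline_accuracy = 1.0 - default_rate


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


reported = {
    "previous_scorecard": {"accuracy": 0.84, "precision": 0.58, "recall": 0.42},
    "current_model": {"accuracy": 0.89, "precision": 0.74, "recall": 0.50},
}

print(f"baseline accuracy = {baseline_accuracy:.2f}")
for name, m in reported.items():
    lift = m["accuracy"] - baseline_accuracy
    f1 = f1_from(m["precision"], m["recall"])
    print(f"{name}: accuracy lift over baseline = {lift:+.2f}, F1 = {f1:.2f}")

# baseline accuracy = 0.88
# previous_scorecard: accuracy lift over baseline = -0.04, F1 = 0.49
# current_model: accuracy lift over baseline = +0.01, F1 = 0.60
```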

| Confusion Matrix Count | Current Model |
|---|---:|
| True Positives | 600 |
| False Positives | 210 |
| False Negatives | 600 |
| True Negatives | 8,590 |

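
The headline metrics can be recomputed directly from the four confusion matrix counts, which is a useful sanity check before presenting them. This is a minimal sketch using only the counts given above; the dictionary keys are illustrative names. Precision, recall, and F1 reproduce the reported values, and the counts imply that half of the 1,200 actual defaults are missed at the 0.50 threshold. The accuracy implied by these counts comes out near 0.92 rather than the reported 0.89, so it may be worth confirming that both figures come from the same holdout set and threshold.

```python
# Confusion matrix counts for the current model at threshold 0.50.
tp, fp, fn, tn = 600, 210, 600, 8_590

total = tp + fp + fn + tn
precision = tp / (tp + fp)
recall = tp / (tp + fn)

metrics = {
    "default_rate": (tp + fn) / total,       # share of actual defaults
    "accuracy": (tp + tn) / total,           # all correct predictions
    "precision": precision,                  # flagged applicants who default
    "recall": recall,                        # defaults that are caught
    "false_positive_rate": fp / (fp + tn),   # good applicants wrongly flagged
    "f1": 2 * precision * recall / (precision + recall),
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# default_rate: 0.120
# accuracy: 0.919
# precision: 0.741
# recall: 0.500
# false_positive_rate: 0.024
# f1: 0.597
```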
The VP of Credit asks, "If accuracy is 89%, is the model good enough to launch?" You need to explain which metrics matter most, what the current results imply, and whether the threshold or evaluation approach should change.
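
Whether 0.50 is the right threshold depends on how costly a missed default is relative to a wrongly declined good applicant. The sketch below shows one possible way to frame that choice: pick the threshold that minimizes total misclassification cost. The unit costs, the toy labels and scores, and the function names are all hypothetical and are not taken from the case; real values would need to come from LendWise's loss and margin data.

```python
from typing import Sequence

# Hypothetical unit costs, NOT from the case: a missed default (false negative)
# is assumed to cost 10x a wrongly declined good applicant (false positive).
COST_FN = 10.0
COST_FP = 1.0


def expected_cost(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float,
    cost_fn: float = COST_FN,
    cost_fp: float = COST_FP,
) -> float:
    """Total misclassification cost when scores >= threshold are flagged as default."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        predicted_default = score >= threshold
        if label == 1 and not predicted_default:
            cost += cost_fn  # approved an applicant who defaults
        elif label == 0 and predicted_default:
            cost += cost_fp  # declined an applicant who would have repaid
    return cost


def best_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    candidates: Sequence[float],
) -> float:
    """Candidate threshold with the lowest total cost (first one wins on ties)."""
    return min(candidates, key=lambda t: expected_cost(y_true, scores, t))


# Toy data for illustration only: 1 = default, 0 = repaid.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.45, 0.55, 0.80]
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

print(expected_cost(y_true, scores, 0.50))          # 10.0
print(best_threshold(y_true, scores, candidates))   # 0.4

# Applying the same hypothetical costs to the reported confusion matrix:
cost_at_current_threshold = 600 * COST_FN + 210 * COST_FP
print(cost_at_current_threshold)                    # 6210.0
```

Under these assumed costs, the 600 false negatives account for almost all of the total cost, which is why a lower threshold that trades some precision for recall could be preferable. The conclusion is only as good as the cost ratio, so that ratio should be agreed with the risk team before any threshold is changed.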