LendWise built a binary classification model to predict whether a personal loan applicant will default within 12 months. The model is scheduled for deployment to support underwriting decisions, but the risk team wants a clear validation plan before it goes live.
The model was trained on 180,000 historical applications and evaluated on a holdout test set of 20,000 recent applications. Default prevalence in the test set is 10%, and false negatives are more costly than false positives because missed defaulters create direct credit losses.
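
To put the 10% prevalence in context, the sketch below works out what a trivial classifier would achieve on the test set if it predicted "no default" for every applicant. The function name `majority_baseline` is illustrative and not part of LendWise's tooling; the only inputs are the test-set size and prevalence stated above.

```python
def majority_baseline(n_total: int, prevalence: float) -> dict:
    """Metrics for a trivial classifier that predicts 'no default' for everyone."""
    n_defaults = round(n_total * prevalence)
    n_non_defaults = n_total - n_defaults
    return {
        "defaults": n_defaults,
        # Every non-defaulter is classified correctly, every defaulter is missed.
        "accuracy": n_non_defaults / n_total,
        "recall": 0.0,
        "missed_defaults": n_defaults,
    }


baseline = majority_baseline(n_total=20_000, prevalence=0.10)
print(baseline)
```

A baseline like this reaches 90% accuracy while catching none of the 2,000 defaulters, which is why accuracy alone is a poor guide when false negatives carry the larger cost.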
| Metric | Validation Set | Test Set |
|---|---|---|
| Accuracy | 0.91 | 0.89 |
| Precision | 0.68 | 0.61 |
| Recall | 0.74 | 0.55 |
| F1 Score | 0.71 | 0.58 |
| AUC-ROC | 0.87 | 0.84 |
| Log Loss | 0.29 | 0.36 |
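
The reported test-set figures can be translated into approximate applicant counts, which makes the cost of missed defaulters more concrete. The following is a minimal sketch, assuming the 10% prevalence and the test-set recall and precision from the table all refer to the same 20,000 applications and the same decision threshold. The helper names are illustrative, and the counts are back-calculated estimates, not an actual confusion matrix.

```python
def implied_confusion_counts(n_total, prevalence, recall, precision):
    """Back out approximate confusion-matrix counts from summary metrics."""
    positives = n_total * prevalence            # actual defaulters
    negatives = n_total - positives             # actual non-defaulters
    tp = recall * positives                     # defaulters correctly flagged
    fn = positives - tp                         # defaulters missed
    fp = tp * (1.0 - precision) / precision     # from precision = tp / (tp + fp)
    tn = negatives - fp
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn}


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


test_counts = implied_confusion_counts(20_000, 0.10, recall=0.55, precision=0.61)
implied_accuracy = (test_counts["tp"] + test_counts["tn"]) / 20_000

print({name: round(count) for name, count in test_counts.items()})
print("F1 (test):", round(f1_score(0.61, 0.55), 2))
print("Implied accuracy (test):", round(implied_accuracy, 2))
```

Under these assumptions, roughly 900 of the 2,000 defaulters in the test set would be missed, and the recomputed F1 values match the table. The implied accuracy comes out near 0.92 rather than the reported 0.89. A gap like this can arise if the metrics were computed at different thresholds or on slightly different samples, so it may be worth confirming how each figure was produced.
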
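Accuracy looks strong, but with only 10% of applicants defaulting it says little about how well defaulters are caught. Recall drops from 0.74 on the validation set to 0.55 on the test set, which means the model misses roughly 45% of defaulters in unseen data. The underwriting team is concerned that the validation results overstate the model's readiness for production and that it would miss too many risky borrowers.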
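
One way to make the validation-to-test comparison systematic is to compute the relative degradation of each metric and flag those that exceed a tolerance. The sketch below uses the values from the table. The 10% tolerance and the names `degradation` and `flag_degraded` are assumptions chosen for illustration, not thresholds set by the risk team.

```python
# (validation, test) pairs copied from the metrics table.
REPORTED = {
    "Accuracy": (0.91, 0.89),
    "Precision": (0.68, 0.61),
    "Recall": (0.74, 0.55),
    "F1 Score": (0.71, 0.58),
    "AUC-ROC": (0.87, 0.84),
    "Log Loss": (0.29, 0.36),
}
LOWER_IS_BETTER = {"Log Loss"}


def degradation(validation, test, lower_is_better=False):
    """Relative change in the unfavorable direction (positive means worse on test)."""
    change = (test - validation) / validation
    return change if lower_is_better else -change


def flag_degraded(metrics, tolerance):
    """Return the metrics whose relative degradation exceeds the tolerance."""
    flagged = {}
    for name, (validation, test) in metrics.items():
        d = degradation(validation, test, name in LOWER_IS_BETTER)
        if d > tolerance:
            flagged[name] = round(d, 3)
    return flagged


flagged = flag_degraded(REPORTED, tolerance=0.10)  # hypothetical 10% tolerance
print(flagged)
```

With this tolerance, precision, recall, F1, and log loss are flagged, while accuracy and AUC-ROC move by only a few percent. Recall shows the largest relative decline, at about 26%, which is consistent with the underwriting team's concern.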