LendWise, a mid-size digital lender processing ~120K personal loan applications per month, wants a binary classifier to predict whether an approved applicant will default within 12 months. The risk team currently uses a Logistic Regression scorecard, but recent analysis suggests it misses nonlinear interactions between credit behavior, income stability, and prior delinquencies.

The 42 input features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_type, region, dependents |
| Credit bureau features | 14 | credit_score, utilization_ratio, delinquency_count_12m, oldest_trade_age |
| Banking & income | 9 | monthly_income, salary_variance_6m, avg_balance_90d |
| Loan application | 7 | loan_amount, tenure_months, interest_rate, purpose |
| Derived behavior | 6 | dti_ratio, inquiries_per_open_trade, utilization_trend_3m |

default_12m: 1 if the borrower defaults within 12 months, else 0.

A solution is considered good enough if it improves minority-class detection over Logistic Regression while maintaining operational precision: ROC-AUC >= 0.82, PR-AUC >= 0.42, and recall >= 0.70 at precision >= 0.45 on a holdout test set.
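
The acceptance criteria can be turned into an explicit check. The following is a minimal pure-Python sketch, not LendWise's actual evaluation code. The function names are hypothetical, and it assumes PR-AUC is computed as average precision (the step-wise area under the precision-recall curve), since the brief does not say which convention is intended. "Recall at precision >= 0.45" is read as the highest recall reachable at any score threshold whose precision is at least 0.45.

```python
from typing import Dict, List, Sequence, Tuple

# Thresholds taken from the stated acceptance criteria.
MIN_ROC_AUC = 0.82
MIN_PR_AUC = 0.42
MIN_RECALL = 0.70
MIN_PRECISION = 0.45


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random defaulter outscores a random non-defaulter.

    Ties count as 0.5. Quadratic in sample size, so only suitable for small
    samples; a rank-based implementation would be needed at full scale.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """(precision, recall) at each distinct score threshold, highest first."""
    order = sorted(zip(scores, y_true), key=lambda t: -t[0])
    total_pos = sum(y_true)
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(order):
        tp += label
        fp += 1 - label
        # Emit a point only after all tied scores have been counted.
        if i + 1 == len(order) or order[i + 1][0] != score:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points: List[Tuple[float, float]]) -> float:
    """Step-wise area under the PR curve (assumed PR-AUC convention)."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def recall_at_precision(points: List[Tuple[float, float]], min_precision: float) -> float:
    """Highest recall among thresholds that satisfy the precision floor."""
    return max((r for p, r in points if p >= min_precision), default=0.0)


def meets_criteria(y_true: Sequence[int], scores: Sequence[float]) -> Dict[str, object]:
    """Compute the three metrics and whether all thresholds are met."""
    points = pr_points(y_true, scores)
    report: Dict[str, object] = {
        "roc_auc": roc_auc(y_true, scores),
        "pr_auc": average_precision(points),
        "recall_at_precision": recall_at_precision(points, MIN_PRECISION),
    }
    report["passed"] = (
        report["roc_auc"] >= MIN_ROC_AUC
        and report["pr_auc"] >= MIN_PR_AUC
        and report["recall_at_precision"] >= MIN_RECALL
    )
    return report
```

Here `y_true` would hold the holdout `default_12m` labels and `scores` the model's predicted default probabilities. In practice, scikit-learn's `roc_auc_score`, `average_precision_score`, and `precision_recall_curve` compute the same quantities and scale far better. The comparison against the Logistic Regression baseline is a separate step: run the same check on both models' holdout scores.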