LendWise, a mid-size digital lender processing about 250K personal loan applications per year, wants a predictive model to estimate whether an applicant will default within 12 months of origination. The model will support underwriting decisions and risk-based pricing, so it must be accurate, explainable, and stable in production.
You are given a historical loan-origination dataset with one row per funded loan. Its 38 input features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_status, residence_type |
| Financial attributes | 10 | annual_income, debt_to_income, revolving_utilization |
| Credit history | 9 | fico_band, delinquencies_2y, inquiries_6m |
| Loan attributes | 7 | loan_amount, term_months, interest_rate, purpose |
| Behavioral / derived fields | 6 | income_to_loan_ratio, recent_credit_velocity |

The target is default_12m: whether the borrower defaulted within 12 months of origination.

A good solution should achieve strong ranking performance and usable recall for the risk team. Target performance on a held-out test set is ROC-AUC >= 0.82, PR-AUC >= 0.42, and recall >= 0.70 at precision >= 0.35.
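
The operating-point target (recall >= 0.70 at precision >= 0.35) is less standard than the two AUC metrics, so a minimal sketch of one way to compute it may help. It is written in plain Python under stated assumptions: the function name `recall_at_precision`, the constants, and the toy labels and scores are all hypothetical and are not part of the LendWise dataset. In practice, library routines such as scikit-learn's `precision_recall_curve` can serve the same purpose.

```python
from typing import Sequence, Tuple

# Thresholds taken from the brief; the constant names are illustrative.
MIN_PRECISION = 0.35
MIN_RECALL = 0.70


def recall_at_precision(
    y_true: Sequence[int],
    scores: Sequence[float],
    min_precision: float = MIN_PRECISION,
) -> Tuple[float, float]:
    """Return (best_recall, threshold) over all score cutoffs whose
    precision is at least min_precision.

    An applicant is flagged as a predicted default when score >= threshold.
    Returns (0.0, inf) if no cutoff reaches the precision floor.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("y_true contains no positive (default) examples")

    # Walk the applicants from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    best_recall, best_threshold = 0.0, float("inf")
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Tied scores cannot be separated by a cutoff, so only evaluate
        # once the whole run of equal scores has been counted.
        is_last = rank == len(order) - 1
        if not is_last and scores[order[rank + 1]] == scores[i]:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, scores[i]
    return best_recall, best_threshold


# Toy example (fabricated for illustration only): 3 defaults among 10 loans.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

recall, threshold = recall_at_precision(toy_labels, toy_scores)
meets_target = recall >= MIN_RECALL
print(f"recall={recall:.2f} at threshold={threshold}, meets target: {meets_target}")
```

On real data, the cutoff should be chosen on a validation split and then reported on the held-out test set, so that the test-set recall is not optimistically biased by threshold selection.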