LendPro, a digital consumer lending platform processing roughly 300K applications per quarter, wants to reduce default losses without materially lowering approval volume. Build a model that predicts whether a newly issued loan will default within 12 months so the risk team can improve underwriting decisions.
You are given a historical loan-origination dataset with one row per funded loan. Its 50 features fall into five groups:
| Feature Group | Count | Examples |
|---|---|---|
| Applicant profile | 12 | age, employment_length, annual_income, housing_status |
| Credit bureau | 14 | fico_score, delinquencies_2y, revolving_utilization, inquiries_6m |
| Loan terms | 8 | loan_amount, interest_rate, term_months, purpose |
| Behavioral / banking | 10 | avg_balance_90d, overdraft_count_6m, payroll_detected |
| Derived temporal | 6 | days_since_last_delinquency, income_to_loan_ratio, recent_inquiry_trend |
A strong solution should outperform a logistic regression baseline and rank loans by default risk well enough to support underwriting decisions. Targets on the held-out test set: ROC-AUC >= 0.82, PR-AUC >= 0.38, and recall >= 0.70 at a decision threshold where precision >= 0.45.
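The operating-point target (recall at a precision floor) is the least standard of the three metrics, so a minimal sketch of one way to compute it may help. The function name `recall_at_precision` and the toy labels and scores are illustrative assumptions, not part of the dataset or of any library. ROC-AUC and PR-AUC can be computed separately, for example with scikit-learn's `roc_auc_score` and `average_precision_score`.

```python
from typing import Optional, Sequence, Tuple


def recall_at_precision(
    y_true: Sequence[int],
    y_score: Sequence[float],
    min_precision: float = 0.45,
) -> Tuple[float, Optional[float]]:
    """Best recall over all score thresholds whose precision >= min_precision.

    Hypothetical helper. A loan is flagged as a predicted default when its
    score is >= the threshold. Returns (recall, threshold), or (0.0, None)
    if no threshold satisfies the precision floor.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("y_true contains no defaults (positive labels)")

    # Sweep thresholds from the highest score downward.
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    best_recall, best_threshold = 0.0, None
    tp = fp = 0
    i, n = 0, len(pairs)
    while i < n:
        threshold = pairs[i][0]
        # Consume every loan tied at this score before evaluating.
        while i < n and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, threshold
    return best_recall, best_threshold


# Toy data for illustration only (1 = defaulted within 12 months).
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

recall, threshold = recall_at_precision(toy_labels, toy_scores, min_precision=0.45)
meets_target = recall >= 0.70
print(recall, threshold, meets_target)
```

In practice the threshold would be chosen on a validation split and then applied unchanged to the held-out test set, so the reported recall and precision are not tuned on test data.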