NorthBridge Finance issues unsecured personal loans to roughly 120,000 applicants per month. The risk team wants a production-ready model that predicts whether an applicant will default within 12 months so underwriting can reduce losses without rejecting too many good customers.
You are given a historical loan-origination dataset that contains only information available at application time. Do not use post-loan repayment behavior as features.
| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, region, housing_status, dependents |
| Credit bureau variables | 14 | bureau_score, delinquencies_12m, credit_utilization, inquiries_6m |
| Income & employment | 9 | annual_income, employment_length, employer_type, income_to_debt_ratio |
| Loan application details | 8 | loan_amount, term_months, interest_rate, purpose, channel |
| Engineered temporal signals | 5 | applications_last_30d, days_since_last_inquiry, bureau_file_age |
default_12m = 1 if the borrower becomes 90+ days past due within 12 months, else 0.

A good solution should achieve strong ranking performance and support threshold-based underwriting decisions. Target ROC-AUC >= 0.84, PR-AUC >= 0.42, and recall >= 0.70 at precision >= 0.35 on a held-out out-of-time test set.
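The recall-at-precision criterion can be checked by sweeping candidate score thresholds and keeping the highest recall among those that meet the precision floor. The following is a minimal pure-Python sketch of that check. The function name `recall_at_precision` and the toy labels and scores are illustrative assumptions, not part of the dataset. In practice, scikit-learn's `precision_recall_curve`, `roc_auc_score`, and `average_precision_score` would likely be used for the full set of metrics.

```python
from typing import Optional, Sequence, Tuple


def recall_at_precision(
    y_true: Sequence[int],
    scores: Sequence[float],
    min_precision: float = 0.35,
) -> Tuple[float, Optional[float]]:
    """Return (best_recall, threshold) over thresholds with precision >= min_precision.

    Applicants with score >= threshold are flagged as likely defaulters.
    Returns (0.0, None) if no threshold meets the precision floor.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("y_true contains no positive (default) cases")

    # Sort by descending score so each prefix is the set flagged at some threshold.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    best_recall, best_threshold = 0.0, None
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate only at the end of a group of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = true_pos / (i + 1)
        recall = true_pos / total_pos
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, score
    return best_recall, best_threshold


# Toy example (hypothetical labels and model scores).
labels = [1, 0, 1, 0, 0, 0, 1, 0]
model_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(recall_at_precision(labels, model_scores, min_precision=0.35))  # (1.0, 0.3)
```

The returned threshold is the lowest score still flagged, which is the cut-off underwriting would apply. The success criterion is met when the returned recall is at least 0.70.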
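An out-of-time test set holds out the most recent applications rather than a random sample, so the evaluation reflects how the model would behave on future applicants. The sketch below assumes each record carries an application date. The field name `application_date`, the function `out_of_time_split`, and the example rows are hypothetical, since the brief does not name a date column.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def out_of_time_split(
    rows: List[Row],
    cutoff: date,
    date_key: str = "application_date",  # hypothetical column name
) -> Tuple[List[Row], List[Row]]:
    """Rows dated before `cutoff` go to train; rows on or after it go to test."""
    train = [r for r in rows if r[date_key] < cutoff]
    test = [r for r in rows if r[date_key] >= cutoff]
    return train, test


# Illustrative rows only; real records would carry the full feature set.
rows = [
    {"application_date": date(2021, 3, 1), "default_12m": 0},
    {"application_date": date(2021, 9, 15), "default_12m": 1},
    {"application_date": date(2022, 1, 10), "default_12m": 0},
]
train_rows, test_rows = out_of_time_split(rows, cutoff=date(2022, 1, 1))
print(len(train_rows), len(test_rows))  # 2 1
```

Because default_12m needs 12 months of observed performance, the test window should end at least 12 months before the data extraction date so that every held-out loan has a complete label.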