NorthBridge Capital is a mid-sized consumer lender processing roughly 120,000 personal loan applications per month. The credit risk team wants a production-ready default prediction model to support underwriting decisions and expected loss forecasting.
You are given an offline training dataset of historical funded loans. Each row represents one originated loan with borrower attributes captured at approval time and repayment outcomes observed over 12 months. The 40 features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Applicant financials | 12 | annual_income, debt_to_income, revolving_utilization, existing_trade_count |
| Credit bureau signals | 10 | fico_band, delinquencies_2y, inquiries_6m, public_records |
| Loan attributes | 8 | loan_amount, term_months, interest_rate, purpose, channel |
| Behavioral / banking | 6 | avg_balance_90d, nsf_events_6m, payroll_deposit_flag, cashflow_volatility |
| Derived temporal features | 4 | months_since_last_delinquency, credit_history_length, recent_balance_trend |

default_12m = 1 if the loan becomes 90+ days past due or charged off within 12 months.

A good solution should outperform a regularized logistic regression baseline and be suitable for underwriting review. Target ROC-AUC >= 0.82, PR-AUC >= 0.38, and recall >= 0.70 at precision >= 0.30 on the holdout period.
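
The "recall at precision >= 0.30" target is the least standard of the three metrics, so a minimal sketch of one way to read it may help: sweep every score threshold and report the highest recall among thresholds whose precision meets the floor. The function names `roc_auc` and `recall_at_precision` and the toy labels and scores below are illustrative assumptions, not part of the dataset or any required interface. The code is plain Python for clarity; in practice a library routine such as scikit-learn's `roc_auc_score`, `average_precision_score`, and `precision_recall_curve` would likely be used instead, with average precision as one common estimate of PR-AUC.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half. This is O(n_pos * n_neg), fine for a small check.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall_at_precision(
    y_true: Sequence[int], scores: Sequence[float], min_precision: float
) -> float:
    """Highest recall over all thresholds whose precision >= min_precision.

    Returns 0.0 if no threshold reaches the precision floor.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank loans from highest to lowest predicted default score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Flag every loan tied at this score together, as a real cutoff would.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / (tp + fp) >= min_precision:
            best = max(best, tp / total_pos)
    return best


# Toy holdout sample (made-up values): 1 = default_12m, score = predicted risk.
y_holdout = [1, 0, 1, 0, 0, 1, 0, 0]
p_holdout = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print("ROC-AUC:", round(roc_auc(y_holdout, p_holdout), 4))
print("Recall at precision >= 0.30:", recall_at_precision(y_holdout, p_holdout, 0.30))
```

Under this reading, a model passes the third target when `recall_at_precision(y, p, 0.30)` is at least 0.70 on the holdout period. Evaluating on a later time window than the training data, rather than a random split, is assumed here because the targets refer to a holdout period.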