Business Context
NorthStar Bank processes roughly 120,000 personal loan applications per month. The risk team wants a binary classification model to predict whether an applicant will default within 12 months, and they need a clear explanation of why a specific model was chosen over simpler and more complex alternatives.
Dataset
| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_status, region, dependents |
| Credit history | 9 | credit_score, delinquencies_24m, open_accounts, utilization_rate |
| Financials | 8 | annual_income, debt_to_income, monthly_obligations, savings_balance |
| Loan attributes | 5 | loan_amount, term_months, interest_rate, purpose, secured_flag |
| Behavioral / derived | 4 | recent_hard_inquiries, income_to_loan_ratio, credit_age_years, payment_burden |
- Size: 240,000 historical applications, 32 features
- Target: Binary — default within 12 months (1) vs no default (0)
- Class balance: 14% positive, 86% negative
- Missing data: ~9% missing in savings_balance, ~6% in employment_status, <2% elsewhere
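
The summary above implies a few working numbers that are useful when configuring class weights or imputers. The following is a small arithmetic sketch; the variable names are illustrative, and the missing-row counts treat the approximate percentages in the brief as exact.

```python
# Working numbers implied by the dataset summary.
# Percentages in the brief are approximate, so the row counts are estimates.
N_ROWS = 240_000
POSITIVE_RATE = 0.14

# Feature counts per group, copied from the table.
group_counts = {
    "Applicant demographics": 6,
    "Credit history": 9,
    "Financials": 8,
    "Loan attributes": 5,
    "Behavioral / derived": 4,
}
n_features = sum(group_counts.values())      # 32, matches the stated total

n_pos = round(N_ROWS * POSITIVE_RATE)        # 33,600 defaults
n_neg = N_ROWS - n_pos                       # 206,400 non-defaults
imbalance_ratio = n_neg / n_pos              # about 6.14 negatives per positive

# Estimated rows needing imputation for the two sparsest features.
missing_rows = {
    "savings_balance": round(N_ROWS * 0.09),     # about 21,600
    "employment_status": round(N_ROWS * 0.06),   # about 14,400
}
```

The negative-to-positive ratio of roughly 6.1 is one common starting heuristic for a positive-class weight; whether reweighting helps here is something to verify on validation data rather than assume.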
Success Criteria
A good solution should achieve ROC-AUC >= 0.82 and PR-AUC >= 0.45, and should provide a defensible explanation for the model choice that balances predictive performance with interpretability for credit risk review.
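
To make the two metric thresholds concrete, here is a minimal pure-Python sketch of both metrics. It assumes PR-AUC is computed as step-wise average precision, which is one common definition and not something the brief specifies. The function names, including `meets_criteria`, are illustrative. As context for the 0.45 target, a model that ranks applicants at random has an expected PR-AUC near the positive rate (0.14 here).

```python
def roc_auc(y_true, scores):
    """ROC-AUC via rank sums: the chance a random positive outranks a random negative."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(y_true, scores):
    """Step-wise PR-AUC: precision weighted by the recall gained at each threshold."""
    pairs = sorted(zip(scores, y_true), reverse=True)
    n_pos = sum(y_true)
    tp = fp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        # Tied scores share one threshold, so process them as a group.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        group_pos = sum(label for _, label in pairs[i:j])
        tp += group_pos
        fp += (j - i) - group_pos
        ap += (group_pos / n_pos) * (tp / (tp + fp))
        i = j
    return ap


def meets_criteria(roc, pr, roc_min=0.82, pr_min=0.45):
    """Check both thresholds from the success criteria."""
    return roc >= roc_min and pr >= pr_min


# Toy example with labels in {0, 1}; not data from the brief.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
roc, pr = roc_auc(y, s), average_precision(y, s)
```

In practice a maintained library implementation would normally be used; the sketch is only meant to show what each number measures.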
Constraints
- Predictions must be generated in <50 ms per application in an online decisioning service.
- The risk team requires global and local feature explanations.
- The model should be retrained monthly and remain stable under moderate drift.
- Regulatory review favors models that can be justified clearly to non-technical stakeholders.
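
The brief does not say how drift should be measured. One widely used option in credit risk is the Population Stability Index (PSI), computed per feature or on the model score between the training sample and each new month. The sketch below is an assumption about how that check might look; the function name `psi`, the ten-bin default, and the `eps` floor for empty bins are illustrative choices.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the baseline `expected`."""
    # Bin edges are quantiles of the baseline sample.
    ref = sorted(expected)
    edges = [ref[int(len(ref) * k / n_bins)] for k in range(1, n_bins)]

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = sum(v >= e for e in edges)  # number of edges at or below v
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(expected), shares(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Toy data: a baseline and a copy shifted upward by 300 units.
baseline = list(range(1000))
shifted = [v + 300 for v in baseline]
psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

A common rule of thumb reads PSI below 0.1 as stable, 0.1 to 0.25 as moderate shift, and above 0.25 as a significant shift; these cut-offs are industry convention, not requirements stated in the brief.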
Deliverables
- Train and compare at least two candidate models for default prediction.
- Select one final model and explain why it was chosen using performance, interpretability, and operational constraints.
- Describe preprocessing and feature engineering decisions.
- Evaluate the model on a held-out test set with appropriate classification metrics.
- Provide a concise explanation suitable for a risk manager or compliance reviewer.
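
For the held-out evaluation, the 14% positive rate makes a stratified split a sensible default so that train and test sets keep the same class mix. This is a minimal standard-library sketch; the function name, the 20% test fraction, and the seed are assumptions, not values from the brief.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with the class mix preserved in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut the same fraction from each.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels at the brief's 14% positive rate.
labels = [1] * 140 + [0] * 860
train_idx, test_idx = stratified_split(labels)
test_positive_rate = sum(labels[i] for i in test_idx) / len(test_idx)
```

Because the model is to be retrained monthly, an out-of-time split (training on earlier applications and testing on later ones) may give a more realistic estimate than a random split. That requires an application date column, which the feature table does not list.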