Business Context
NorthStar Bank wants a model to predict whether a personal loan applicant will default within 12 months. The risk team needs a solution that balances predictive performance with strict interpretability and low-latency batch scoring for daily underwriting.
Dataset
| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, region, employment_status |
| Financial profile | 10 | annual_income, debt_to_income, existing_loans, credit_utilization |
| Credit history | 8 | delinquency_count_12m, credit_age_months, inquiries_6m |
| Loan application | 5 | loan_amount, term_months, interest_rate, purpose |
| Behavioral / derived | 7 | income_to_loan_ratio, revolving_balance_trend, recent_missed_payment_flag |
- Size: 120K historical loan applications, 36 features
- Target: Binary — default within 12 months (1) vs no default (0)
- Class balance: 14% positive, 86% negative
- Missing data: 9% missing in employment and income fields, 4% missing in credit bureau attributes
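The class balance above translates into concrete counts and, if a weighting scheme is used, into concrete class weights. The sketch below works through that arithmetic. The "balanced" weighting formula (total rows divided by number of classes times class count) is one common heuristic and an assumption here, not something the brief prescribes.

```python
# Class counts implied by the dataset summary (120K rows, 14% positive).
n_total = 120_000
positive_rate = 0.14

n_pos = round(n_total * positive_rate)  # applications that defaulted
n_neg = n_total - n_pos                 # applications that did not default

# "Balanced" weighting heuristic (an assumption, not a requirement of the brief):
# weight_c = n_total / (n_classes * n_c)
n_classes = 2
weight_pos = n_total / (n_classes * n_pos)
weight_neg = n_total / (n_classes * n_neg)

# A model that scores at random has precision equal to the base rate,
# which is the no-skill reference point for precision-based metrics.
no_skill_precision = n_pos / n_total

print(f"positives={n_pos}, negatives={n_neg}")
print(f"weight_pos={weight_pos:.3f}, weight_neg={weight_neg:.3f}")
print(f"no-skill precision={no_skill_precision:.2f}")
```

Under this heuristic each default counts roughly six times as much as a non-default during training, which is worth keeping in mind when reading predicted probabilities afterwards, since reweighting shifts them away from the raw 14% base rate.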
Success Criteria
A good solution should achieve strong ranking performance while remaining explainable enough for compliance review. Targets: AUC-ROC >= 0.82 (a threshold-independent ranking metric), plus F1 >= 0.58 and recall >= 0.70 measured at an operating threshold chosen with business costs in mind.
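One way to choose a threshold "with business costs in mind" is to sweep candidate thresholds, discard those that miss the recall floor, and keep the one with the lowest expected cost. The sketch below illustrates that idea in plain Python. The function names, the toy labels and scores, and the cost ratio (a missed default costing five times a wrongly declined applicant) are all hypothetical; real costs would come from the risk team.

```python
def confusion_at(y_true, scores, threshold):
    """Count TP, FP, FN, TN when predicting default for scores >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pick_threshold(y_true, scores, cost_fn=5.0, cost_fp=1.0, min_recall=0.70):
    """Return the lowest-cost threshold whose recall meets min_recall.

    cost_fn and cost_fp are hypothetical per-case costs of a missed default
    and of a wrongly flagged good applicant. Returns None if no threshold
    satisfies the recall floor.
    """
    best = None
    for t in sorted(set(scores)):
        tp, fp, fn, _ = confusion_at(y_true, scores, t)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if recall < min_recall:
            continue  # violates the recall requirement
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        cost = cost_fn * fn + cost_fp * fp
        if best is None or cost < best["cost"]:
            best = {"threshold": t, "recall": recall,
                    "precision": precision, "f1": f1, "cost": cost}
    return best


# Toy validation set (made up for illustration only).
y_true = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]

best = pick_threshold(y_true, scores)
print(best)
```

On real data the sweep would run on a held-out validation set, and the chosen threshold would then be checked against the F1 target as well, since minimizing cost under a recall floor does not guarantee it.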
Constraints
- Predictions must be explainable to risk analysts and auditors
- Batch inference for 20K applications/day should complete in under 5 minutes
- Training must fit on a standard CPU machine; no large-scale GPU infrastructure
- The bank prefers stable models over small gains from highly complex methods
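The batch constraint can be turned into a per-record budget: 20K applications in 5 minutes leaves 15 ms per application, or roughly 67 applications per second. The sketch below computes that budget and shows one rough way to check a scorer against it by timing a sample and extrapolating linearly. The `meets_batch_budget` helper and the dummy scorer are hypothetical, and linear extrapolation is itself an assumption that ignores fixed startup and I/O costs.

```python
import time

DAILY_VOLUME = 20_000   # applications per day (from the constraints)
TIME_LIMIT_S = 5 * 60   # 5-minute batch window, in seconds

per_record_budget_ms = TIME_LIMIT_S / DAILY_VOLUME * 1000
required_throughput = DAILY_VOLUME / TIME_LIMIT_S  # applications per second


def meets_batch_budget(score_batch, sample, volume=DAILY_VOLUME, limit_s=TIME_LIMIT_S):
    """Time score_batch on a sample and project the runtime to the full volume.

    Returns (projected_seconds, within_budget). Assumes runtime scales
    linearly with the number of rows.
    """
    start = time.perf_counter()
    score_batch(sample)
    elapsed = time.perf_counter() - start
    projected = elapsed * volume / len(sample)
    return projected, projected < limit_s


def dummy_scorer(rows):
    """Hypothetical stand-in for a fitted model's batch scoring call."""
    return [0.5 for _ in rows]


projected_s, ok = meets_batch_budget(dummy_scorer, sample=list(range(1_000)))
print(f"budget per record: {per_record_budget_ms:.1f} ms")
print(f"required throughput: {required_throughput:.1f} applications/s")
print(f"projected batch time: {projected_s:.4f} s, within budget: {ok}")
```

In practice the timing should include preprocessing and data loading, not just the model's predict call, because those steps often dominate for simple models.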
Deliverables
- Compare at least three candidate models for this classification problem
- Justify which model you would choose under the stated constraints
- Build a reproducible training and evaluation pipeline
- Show how you handle missing values, categorical variables, and class imbalance
- Recommend an operating threshold and explain the tradeoff between performance and interpretability
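As a starting point for the pipeline deliverable, the sketch below shows one possible baseline: median imputation with missing-value indicators for numeric fields, an explicit "missing" category plus one-hot encoding for categoricals, and a class-weighted logistic regression. It assumes scikit-learn, pandas, and NumPy are available. The data is synthetic and only borrows a few column names from the feature table, so the resulting score says nothing about the real problem, and logistic regression is just one of the candidate models the comparison should cover.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(42)
n = 5_000

# Synthetic stand-in for the real applications table (illustrative only).
df = pd.DataFrame({
    "annual_income": rng.lognormal(mean=11, sigma=0.5, size=n),
    "debt_to_income": rng.uniform(0, 0.8, size=n),
    "delinquency_count_12m": rng.poisson(0.3, size=n).astype(float),
    "employment_status": rng.choice(
        ["employed", "self_employed", "unemployed"], size=n, p=[0.7, 0.2, 0.1]),
    "purpose": rng.choice(["car", "home", "debt_consolidation"], size=n),
})

# Made-up default mechanism so the model has some signal to find.
logit = -3.0 + 3.0 * df["debt_to_income"] + 0.8 * df["delinquency_count_12m"]
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

# Inject roughly 9% missingness into income and employment fields.
missing_mask = rng.uniform(size=n) < 0.09
df.loc[missing_mask, ["annual_income", "employment_status"]] = np.nan

numeric_cols = ["annual_income", "debt_to_income", "delinquency_count_12m"]
categorical_cols = ["employment_status", "purpose"]

numeric_steps = Pipeline([
    # add_indicator appends a 0/1 column flagging which values were imputed
    ("impute", SimpleImputer(strategy="median", add_indicator=True)),
    ("scale", StandardScaler()),
])
categorical_steps = Pipeline([
    # Treat missing as its own category so it stays visible to reviewers
    ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    # class_weight="balanced" is one way to address the class imbalance
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=42)

model.fit(X_train, y_train)
test_scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, test_scores)
print(f"AUC-ROC on synthetic hold-out: {auc:.3f}")
```

Keeping preprocessing inside the pipeline means imputation statistics and category lists are learned from the training split only, which supports the reproducibility requirement. The same preprocessing block can be reused with other candidate models by swapping the final step.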