NorthStar Bank wants a credit risk model to predict whether a personal loan applicant will default within 12 months. The risk team needs a feature selection process that improves model performance, reduces overfitting, and keeps the final model explainable for regulatory review.

| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_status, region, years_at_address |
| Financial profile | 10 | annual_income, debt_to_income, revolving_utilization, existing_loans |
| Credit history | 8 | fico_band, delinquencies_12m, inquiries_6m, oldest_trade_age |
| Application details | 6 | loan_amount, loan_purpose, term_months, channel |
| Derived behavioral features | 10 | income_to_loan_ratio, utilization_trend, payment_to_income |

A good solution should generalize better than a no-selection baseline, show stable validation performance, and produce a reduced feature set that risk analysts can review. The final model should use no more than 15 of the 40 candidate features while maintaining strong recall on defaulters.
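
The acceptance criteria above can be sketched as a small check on per-fold validation results. This is a minimal illustration, not the risk team's actual procedure: the `CandidateResult` container, the `evaluate` function, the `max_std` stability tolerance of 0.03, and all recall values are hypothetical. It also assumes that "generalization" is compared using mean recall on defaulters, which the brief does not specify. Only the 15-feature cap comes from the brief itself.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List

MAX_FEATURES = 15  # feature budget stated in the brief


@dataclass
class CandidateResult:
    """Per-fold validation recall on defaulters for one feature set (hypothetical)."""
    name: str
    features: List[str]
    fold_recalls: List[float]


def evaluate(candidate: CandidateResult,
             baseline: CandidateResult,
             max_std: float = 0.03) -> Dict[str, bool]:
    """Check a candidate feature set against the three criteria in the brief.

    max_std is an assumed tolerance for "stable" validation performance;
    the brief gives no numeric threshold.
    """
    checks = {
        # No more than 15 features in the final model.
        "within_feature_budget": len(candidate.features) <= MAX_FEATURES,
        # Mean recall on defaulters is at least as good as with no selection.
        "recall_not_worse_than_baseline":
            mean(candidate.fold_recalls) >= mean(baseline.fold_recalls),
        # Recall varies little from fold to fold.
        "stable_across_folds": pstdev(candidate.fold_recalls) <= max_std,
    }
    checks["accepted"] = all(checks.values())
    return checks


# Illustrative numbers only; these are not results from any real model.
baseline = CandidateResult(
    name="no_selection",
    features=[f"feature_{i}" for i in range(40)],
    fold_recalls=[0.70, 0.62, 0.75, 0.66, 0.72],
)
selected = CandidateResult(
    name="reduced_set",
    features=[
        "age", "employment_status", "annual_income", "debt_to_income",
        "revolving_utilization", "existing_loans", "fico_band",
        "delinquencies_12m", "inquiries_6m", "loan_amount",
        "term_months", "payment_to_income",
    ],
    fold_recalls=[0.71, 0.70, 0.73, 0.69, 0.72],
)

result = evaluate(selected, baseline)
print(result)
```

Keeping the check as a dictionary of named booleans means a reviewer can see which criterion failed rather than a single pass/fail flag. The tolerance and the choice of metric would need to be agreed with the risk team.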