NorthStar Lending is a mid-sized digital lender that processes roughly 250,000 personal loan applications per year. The risk team wants a machine learning model to predict whether an approved applicant will default within 12 months so underwriting rules can be improved without materially slowing application review.
You are given a historical loan dataset built from approved applications and their repayment outcomes. Its 38 features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_status, residential_status, years_at_address |
| Financial profile | 11 | annual_income, debt_to_income, existing_loans, revolving_utilization |
| Credit history | 9 | credit_score, delinquencies_12m, inquiries_6m, oldest_trade_line_months |
| Loan attributes | 7 | loan_amount, term_months, interest_rate, purpose, channel |
| Behavioral / derived | 5 | income_to_loan_ratio, recent_credit_velocity, utilization_trend |

Target: default_12m (whether the borrower defaulted within 12 months of origination).

A good solution should outperform a simple logistic regression baseline, achieve strong ranking quality for risk-based decisioning, and provide enough interpretability for the credit policy team to review the main drivers of risk.
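The brief does not name a metric for "ranking quality". One common choice, assumed here, is ROC AUC: the probability that a randomly chosen defaulter receives a higher risk score than a randomly chosen non-defaulter. The sketch below computes it with the standard library only, using the rank-sum formulation. The function name `roc_auc` and the toy labels and scores are illustrative and are not drawn from the NorthStar dataset.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive their average rank, so ties count as half a win.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one defaulter and one non-defaulter")

    # Sort indices by score, then assign 1-based average ranks to tie groups.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Toy data (made up for illustration): 1 = defaulted within 12 months.
y = [0, 0, 1, 0, 1, 1]
baseline_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30]   # e.g. logistic regression
candidate_scores = [0.05, 0.30, 0.45, 0.10, 0.90, 0.60]  # e.g. a stronger model

print(f"baseline AUC:  {roc_auc(y, baseline_scores):.3f}")   # 0.778
print(f"candidate AUC: {roc_auc(y, candidate_scores):.3f}")  # 1.000
```

Comparing the candidate against the baseline on the same held-out applicants is what "outperform" would mean under this assumption. In practice a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-rolled function.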