BrightLend, a mid-size consumer lending platform, wants a baseline predictive model to estimate whether a loan applicant will default within 12 months of origination. The goal is to help analysts demonstrate sound predictive modeling fundamentals: problem framing, feature preparation, model training, and evaluation.
You are given a historical loan application dataset with one row per funded loan.
| Feature Group | Count | Examples |
|---|---|---|
| Applicant demographics | 6 | age, employment_status, residence_type |
| Financial attributes | 9 | annual_income, debt_to_income, credit_utilization, existing_loans |
| Loan details | 7 | loan_amount, interest_rate, term_months, purpose |
| Credit history | 8 | credit_score, delinquencies_12m, inquiries_6m, bankruptcies |
| Behavioral / derived | 5 | income_to_loan_ratio, recent_balance_change, payment_burden |

default_12m: whether the borrower defaulted within 12 months of origination. This is the prediction target.

A good solution should outperform a naive majority-class baseline and produce a model that is understandable enough for risk analysts to review. Target performance is ROC-AUC >= 0.78 and F1 >= 0.45 on a held-out test set, with a clear discussion of threshold selection.
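To make the evaluation criteria concrete, here is a minimal sketch of how ROC-AUC, F1, and a threshold sweep could be computed for default_12m predictions. It uses only the Python standard library. The labels and scores are made-up toy values, not BrightLend data, and the helper names (`roc_auc`, `f1_at`, `best_threshold`) are hypothetical. In practice a library such as scikit-learn would normally be used instead, and the threshold would be chosen on a validation split rather than on the held-out test set.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (defaulter, non-defaulter) pairs in which
    the defaulter gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """F1 for the positive (default) class when score >= threshold means 'default'."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Try every distinct score as a cut-off; return (threshold, F1) with the highest F1."""
    candidates = sorted(set(scores))
    return max(((t, f1_at(y_true, scores, t)) for t in candidates),
               key=lambda pair: pair[1])


# Toy, imbalanced example (assumed values for illustration only): 3 defaults in 10 loans.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.60, 0.35, 0.70, 0.90]

# Majority-class baseline: always predict "no default" (score 0 for everyone).
baseline_f1 = f1_at(y_true, [0.0] * len(y_true), threshold=0.5)

print("baseline F1:", baseline_f1)
print("model ROC-AUC:", round(roc_auc(y_true, scores), 3))
print("F1 at 0.5:", round(f1_at(y_true, scores, 0.5), 3))
print("best (threshold, F1):", best_threshold(y_true, scores))
```

On this toy data the majority-class baseline reaches 70% accuracy but an F1 of 0 for the default class, which is why the brief asks for ROC-AUC and F1 rather than accuracy. The sweep also shows that the default 0.5 cut-off is not necessarily the one that maximizes F1, so the chosen threshold should be reported and justified.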