Fix Underfitting in Loan Default

Business Context

LendWise, a digital consumer lending platform processing about 120K loan applications per month, has deployed a simple baseline model to predict 90-day loan default. The current model performs poorly on both training and validation data, and the risk team suspects underfitting. Your task is to diagnose the issue and improve model capacity without creating an overfitted solution.

Dataset

The training data contains one row per funded loan application.

Feature Group	Count	Examples
Applicant demographics	6	age, employment_length, home_ownership, region
Credit and bureau signals	9	fico_score, delinquencies_2y, revolving_utilization, inquiries_6m
Financials	8	annual_income, debt_to_income, monthly_obligations, loan_amount
Loan attributes	5	term_months, interest_rate, purpose, channel
Behavioral aggregates	4	prior_loans_count, prior_default_flag, avg_days_late, autopay_enrolled

Size: 240K loans, 32 input features
Target: Whether the borrower defaults within 90 days of origination
Class balance: 14% default, 86% non-default
Missing data: ~7% missing in employment and income-related fields; ~3% missing in bureau variables for thin-file applicants
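With a 14% default rate, accuracy alone says little about model quality. The short sketch below works through the arithmetic for two trivial predictors, using only the class balance stated above. The helper name `f1_from` is illustrative, and treating an undefined precision (no positive predictions) as 0 is a convention assumed here.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


DEFAULT_RATE = 0.14  # class balance from the dataset description

# Predictor A: always predict "non-default".
# It is right on every non-default loan, but never catches a default.
always_negative_accuracy = 1 - DEFAULT_RATE
always_negative_f1 = f1_from(precision=0.0, recall=0.0)

# Predictor B: always predict "default".
# Recall is 1.0, and precision equals the default rate.
always_positive_f1 = f1_from(precision=DEFAULT_RATE, recall=1.0)

print(f"always non-default: accuracy={always_negative_accuracy:.2f}, F1={always_negative_f1:.2f}")
print(f"always default:     F1={always_positive_f1:.3f}")
```

Neither trivial predictor uses any feature, so a baseline whose F1 sits near these values is a first hint that the model is not extracting signal from the data.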

Success Criteria

A good solution should clearly identify the signs of underfitting and materially improve on the weak baseline. Target AUC-ROC >= 0.78 and F1 >= 0.48 on the holdout set, with train and validation metrics staying close to each other.
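The two target metrics can be computed from labels and predicted scores without any library, which helps make clear what each one measures. The following is a minimal sketch, not a reference implementation: the function names are illustrative, and the 0.5 default threshold for F1 is an assumption (in practice the threshold would be tuned on validation data).

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via the rank-sum (Mann-Whitney U) identity, averaging ranks over ties."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover every item tied with item i on score.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs both classes present")
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 for the positive (default) class after thresholding the scores."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Tiny made-up example, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, scores))               # 0.75
print(f1_at_threshold(labels, scores, 0.3))  # 0.8
```

AUC-ROC depends only on how the scores rank defaults against non-defaults, while F1 depends on the chosen threshold, so the two targets should be checked separately.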

Constraints

Predictions are used in a credit policy workflow, so the model must remain reasonably interpretable.
Batch scoring must complete in under 10 minutes for 120K applications.
Retraining happens monthly, so the pipeline should be simple to maintain.
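The batch-scoring constraint translates into a minimum throughput of 120,000 / 600 = 200 applications per second. One rough way to check a candidate model against that budget is to time it on a sample and extrapolate, as sketched below. The helper names are hypothetical, `score_fn` stands in for whatever scoring call the pipeline exposes, and the linear extrapolation is an assumption that ignores fixed costs such as model loading.

```python
import time
from typing import Callable, Sequence

BATCH_SIZE = 120_000          # applications per scoring run (from the constraints)
BUDGET_SECONDS = 10 * 60      # 10-minute scoring budget (from the constraints)


def required_throughput(batch_size: int = BATCH_SIZE,
                        budget_seconds: float = BUDGET_SECONDS) -> float:
    """Minimum rows per second needed to finish the batch within budget."""
    return batch_size / budget_seconds


def projected_batch_seconds(score_fn: Callable[[Sequence], object],
                            sample_rows: Sequence,
                            batch_size: int = BATCH_SIZE) -> float:
    """Time score_fn on a sample and scale linearly to the full batch size."""
    start = time.perf_counter()
    score_fn(sample_rows)
    elapsed = time.perf_counter() - start
    return elapsed * batch_size / len(sample_rows)


if __name__ == "__main__":
    # Placeholder scorer used only to exercise the helpers.
    def dummy_score(rows):
        return [0.14 for _ in rows]

    sample = list(range(1_000))
    projected = projected_batch_seconds(dummy_score, sample)
    print(f"need >= {required_throughput():.0f} rows/s")
    print(f"projected full-batch time: {projected:.3f}s, "
          f"within budget: {projected < BUDGET_SECONDS}")
```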

Deliverables

Diagnose whether the baseline model is underfitting using train vs. validation metrics.
Build an improved classification pipeline and explain why it addresses underfitting.
Compare at least two model families with different capacity levels.
Show feature engineering and hyperparameter tuning choices.
Report final holdout performance and discuss production tradeoffs.
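For the first deliverable, the train vs. validation comparison can be reduced to a simple rule: low scores on both splits with a small gap suggest underfitting, while a large gap suggests overfitting. The sketch below encodes that rule. The 0.78 target comes from the success criteria; the 0.03 gap tolerance, the `FitReport` class, and the label strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FitReport:
    """AUC-ROC of one model on the training and validation splits."""
    train_auc: float
    val_auc: float


def diagnose_fit(report: FitReport,
                 target_auc: float = 0.78,
                 max_gap: float = 0.03) -> str:
    """Classify a model's fit from its train/validation AUC.

    max_gap is an assumed tolerance, not a value given in the problem.
    """
    gap = report.train_auc - report.val_auc
    if gap > max_gap:
        # Train is much better than validation: the model memorizes.
        return "overfitting"
    if report.train_auc < target_auc:
        # The model cannot even fit the training data well.
        return "underfitting"
    if report.val_auc < target_auc:
        return "below target"
    return "ok"


# Hypothetical numbers, for illustration only.
print(diagnose_fit(FitReport(train_auc=0.66, val_auc=0.65)))  # underfitting
print(diagnose_fit(FitReport(train_auc=0.95, val_auc=0.74)))  # overfitting
print(diagnose_fit(FitReport(train_auc=0.81, val_auc=0.79)))  # ok
```

Running the same check on each candidate model family gives a consistent basis for the capacity comparison the deliverables ask for.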
