Compare Credit Risk Model Limits

Business Context

NorthStar Bank wants a default-risk model for unsecured personal loans. The risk team does not want to pick the highest-scoring model automatically; they want a solution that is accurate, stable, and explicit about where each model fails.

Dataset

You are given a historical loan-origination dataset for model comparison and selection.

Feature Group	Count	Examples
Applicant demographics	6	age, employment_status, residential_status
Financial attributes	9	annual_income, debt_to_income, revolving_utilization, delinquencies_12m
Credit history	7	credit_score, oldest_trade_age, inquiries_6m, open_accounts
Loan attributes	5	loan_amount, term_months, interest_rate, channel
Derived flags	4	thin_file_flag, recent_missed_payment, high_utilization_flag

Size: 120,000 loans, 31 input features
Target: Binary label indicating whether the customer defaulted within 12 months of origination
Class balance: 14% default, 86% non-default
Missing data: 8% missing in employment-related fields, 5% missing in bureau variables for thin-file applicants
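
The missing-value pattern and class imbalance described above suggest a preprocessing pipeline with imputation, encoding, and a stratified split. The sketch below is one possible setup, not a prescribed solution. It uses scikit-learn and a small synthetic table: the column names come from the feature table, but the chosen subset, the synthetic distributions, and the helper names make_synthetic_loans and build_pipeline are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative subset of the 31 features (names taken from the feature table).
NUMERIC = ["annual_income", "debt_to_income", "credit_score"]
CATEGORICAL = ["employment_status", "channel"]


def make_synthetic_loans(n=2000, seed=0):
    """Hypothetical toy data: about 14% defaults, with injected missing values."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "annual_income": rng.lognormal(11, 0.5, n),
        "debt_to_income": rng.uniform(0, 0.6, n),
        "credit_score": rng.normal(680, 60, n),
        "employment_status": rng.choice(
            ["employed", "self_employed", "unemployed"], n),
        "channel": rng.choice(["branch", "online", "broker"], n),
    })
    # Mimic the stated missingness: ~8% employment fields, ~5% bureau fields.
    df.loc[rng.random(n) < 0.08, "employment_status"] = np.nan
    df.loc[rng.random(n) < 0.05, "credit_score"] = np.nan
    y = (rng.random(n) < 0.14).astype(int)  # labels are random noise here
    return df, y


def build_pipeline():
    """Impute, encode, and scale, then fit a class-weighted logistic regression."""
    numeric = Pipeline([
        # add_indicator keeps a "was missing" column, which may carry signal.
        ("impute", SimpleImputer(strategy="median", add_indicator=True)),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
    ])


X, y = make_synthetic_loans()
# Stratify so the default rate is preserved in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
model = build_pipeline().fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]
```

Keeping imputation and scaling inside the pipeline means their statistics are learned from the training split only, which avoids leaking test information. The same preprocessing step could be reused with a tree-based classifier, where scaling is unnecessary but harmless.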

Success Criteria

A good solution should achieve ROC-AUC >= 0.82, PR-AUC >= 0.45, and recall >= 0.70 at precision >= 0.40 on the held-out test set. The candidate must also explain why a simpler model may be preferred over a more complex one in production.
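
The three thresholds above can be checked mechanically once a model produces scores on the held-out test set. The following is a minimal sketch using scikit-learn. It assumes that PR-AUC is estimated with average precision, which is one common convention but not the only one, and the function name check_success_criteria is hypothetical.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    precision_recall_curve,
    roc_auc_score,
)


def check_success_criteria(y_true, y_score, min_precision=0.40):
    """Compare test-set scores against the stated success criteria."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # The final (precision=1, recall=0) point has no threshold, so drop it.
    precision, recall = precision[:-1], recall[:-1]
    ok = np.flatnonzero(precision >= min_precision)
    if ok.size:
        best = ok[np.argmax(recall[ok])]  # highest recall among valid points
        best_recall, best_threshold = float(recall[best]), float(thresholds[best])
    else:
        best_recall, best_threshold = 0.0, None
    roc_auc = float(roc_auc_score(y_true, y_score))
    pr_auc = float(average_precision_score(y_true, y_score))  # assumption
    return {
        "roc_auc": roc_auc,
        "pr_auc": pr_auc,
        "recall_at_min_precision": best_recall,
        "threshold": best_threshold,
        "passes": bool(roc_auc >= 0.82 and pr_auc >= 0.45
                       and best_recall >= 0.70),
    }


# Tiny hand-made example: scores that separate the classes perfectly.
result = check_success_criteria([0, 0, 0, 1, 1], [0.1, 0.2, 0.3, 0.8, 0.9])
```

Reporting the threshold alongside the metrics matters here because the recall criterion is conditional on precision: the same model can pass or fail depending on where the cutoff is set.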

Constraints

Decisions must be explainable to the bank's risk analysts and compliance reviewers
Batch scoring must finish in under 10 minutes for 500,000 applications
Retraining is allowed monthly, not daily
The final recommendation must discuss model limitations, not just performance
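
The batch-scoring constraint works out to roughly 833 applications per second (500,000 rows in 600 seconds). One rough way to test a candidate against it is to time scoring on a sample and extrapolate. The sketch below assumes scoring time grows linearly with row count, which may not hold once I/O or memory limits come into play. The helper projected_batch_seconds and the stand-in scorer are hypothetical.

```python
import time

import numpy as np

TARGET_ROWS = 500_000
BUDGET_SECONDS = 10 * 60
REQUIRED_ROWS_PER_SECOND = TARGET_ROWS / BUDGET_SECONDS  # about 833.3


def projected_batch_seconds(score_fn, sample, target_rows=TARGET_ROWS, repeats=3):
    """Time score_fn on a sample and linearly extrapolate to target_rows."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        score_fn(sample)
        best = min(best, time.perf_counter() - start)  # keep the fastest run
    return best * target_rows / len(sample)


# Stand-in scorer: a logistic model over 31 features with random weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=31)


def toy_scorer(features):
    return 1.0 / (1.0 + np.exp(-(features @ weights)))


sample = rng.normal(size=(20_000, 31))
projected = projected_batch_seconds(toy_scorer, sample)
meets_budget = bool(projected < BUDGET_SECONDS)
```

In practice score_fn would wrap the full pipeline, including preprocessing, since imputation and encoding can cost as much as the model itself.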

Deliverables

Train and compare at least three models: logistic regression, decision tree, and random forest or gradient boosting.
Build a preprocessing pipeline for missing values, categorical encoding, and scaling where needed.
Evaluate models with appropriate classification metrics and threshold analysis.
Explain the limitations of each model and recommend one for production.
Describe how you would monitor model drift and degradation after deployment.
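
For the drift-monitoring deliverable, one widely used check in credit risk is the Population Stability Index (PSI), computed on model scores or on individual features. The source does not name a specific drift metric, so PSI is an assumption here, as is the function name population_stability_index. Bins are taken from the quantiles of the reference sample.

```python
import numpy as np


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a recent sample of one variable."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Quantile bin edges from the reference data; unique() guards against ties.
    edges = np.unique(np.quantile(expected, np.linspace(0, 1, n_bins + 1)))
    inner = edges[1:-1]  # values outside the reference range go to end bins
    bins = len(edges) - 1
    e_frac = np.bincount(np.searchsorted(inner, expected, side="right"),
                         minlength=bins) / len(expected)
    a_frac = np.bincount(np.searchsorted(inner, actual, side="right"),
                         minlength=bins) / len(actual)
    # Floor the fractions so empty bins do not produce log(0).
    e_frac = np.clip(e_frac, eps, None)
    a_frac = np.clip(a_frac, eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))


rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 10_000)  # e.g. scores at training time
shifted = rng.normal(1.0, 1.0, 10_000)    # e.g. scores from a later month
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cutoffs are conventions rather than guarantees. Because default labels only mature after 12 months, distribution checks like this one can flag problems well before ROC-AUC or recall can be recomputed on fresh outcomes.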
