NorthStar Bank deployed a gradient boosting model to predict whether a small-business loan applicant will default within 12 months. The model replaced a logistic regression scorecard and is now used to auto-approve, reject, or route applications for manual underwriting.
After two quarters in production, leadership sees mixed results: default losses are slightly lower, but approval rates fell and underwriters report too many borderline cases being escalated. You need to assess whether the model is actually performing well and how its accuracy should be verified beyond a single headline metric.
| Metric | Validation Set | Production (last 60 days) | Baseline Scorecard |
|---|---|---|---|
| Accuracy | 0.842 | 0.801 | 0.776 |
| Precision | 0.691 | 0.648 | 0.571 |
| Recall | 0.583 | 0.472 | 0.514 |
| F1 Score | 0.632 | 0.545 | 0.541 |
| AUC-ROC | 0.861 | 0.823 | 0.781 |
| Default Rate | 18.2% | 21.4% | 21.4% |
| Manual Review Rate | 18.0% | 27.4% | 16.1% |
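A low-cost first verification step is to check that the reported metrics are internally consistent: F1 is the harmonic mean of precision and recall, so it can be recomputed directly from the table. A minimal sketch using the values above (a small tolerance absorbs the three-decimal rounding in the reported figures):

```python
# Check that each reported F1 equals the harmonic mean of the reported
# precision and recall, up to rounding of the published table values.
rows = {
    "validation": {"precision": 0.691, "recall": 0.583, "f1": 0.632},
    "production": {"precision": 0.648, "recall": 0.472, "f1": 0.545},
    "baseline":   {"precision": 0.571, "recall": 0.514, "f1": 0.541},
}

def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

for name, m in rows.items():
    implied = f1_from(m["precision"], m["recall"])
    # Reported values are rounded to 3 decimals, so allow ~0.005 slack.
    assert abs(implied - m["f1"]) < 0.005, f"{name}: implied F1 {implied:.3f}"
    print(f"{name}: implied F1 = {implied:.3f}, reported = {m['f1']:.3f}")
```

All three rows check out within rounding here; a mismatch would suggest the metrics were computed on different samples or at different decision thresholds, which would undermine any cross-column comparison.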
On AUC and accuracy the model beats the baseline, but production recall (0.472) has fallen below even the old scorecard's (0.514), meaning the model now misses a larger share of true defaulters than the system it replaced, while the manual review rate has climbed from 16.1% to 27.4%. The bank wants to know whether the model is truly better, which verification steps are missing, and what should be improved before expanding auto-decisioning.
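The table itself motivates one missing verification step: the production default rate (0.214) differs from the validation set's (0.182), which points to population drift that could explain the recall drop. A standard way to quantify drift is the Population Stability Index (PSI) computed over binned model scores or input features. The sketch below uses invented score-decile proportions purely for illustration (not NorthStar data); a common rule of thumb reads PSI < 0.1 as stable, 0.1 to 0.25 as moderate shift, and above 0.25 as significant shift:

```python
import math

def psi(expected: list[float], actual: list[float]) -> float:
    """Population Stability Index between two binned distributions.

    Each list holds the proportion of observations per bin; both must
    sum to 1 and contain no zero bins.
    """
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))

# Illustrative score-decile proportions (invented for demonstration).
validation_bins = [0.10] * 10                      # uniform by construction
production_bins = [0.04, 0.05, 0.07, 0.08, 0.09,
                   0.10, 0.11, 0.13, 0.15, 0.18]   # mass shifted toward high scores

print(f"PSI = {psi(validation_bins, production_bins):.3f}")
```

In practice the bins would come from the validation score deciles, and the same check would be repeated per input feature to locate the source of the shift; a materially nonzero PSI means validation-set metrics no longer describe the population the model actually scores.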