Ship Readiness for Loan Approval Model

Context

LendWise has trained a binary classification model to pre-approve personal loan applications. The model is intended to reduce manual underwriting volume while keeping default risk within policy limits. A pilot on recent applications shows strong aggregate accuracy, but risk and operations teams disagree on whether the model is ready to ship.

Current Performance

Metric	Validation Set	Pilot Holdout	Ship Target
Accuracy	0.91	0.89	>= 0.88
Precision (approved loans that stay current)	0.93	0.90	>= 0.92
Recall (good borrowers approved)	0.78	0.72	>= 0.75
F1 Score	0.85	0.80	>= 0.83
AUC-ROC	0.94	0.90	>= 0.91
Log Loss	0.21	0.29	<= 0.25
Calibration error	0.03	0.08	<= 0.05
Manual review rate	18%	27%	<= 20%
90-day default rate on approved loans	3.1%	4.8%	<= 4.0%
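
One way to read the table is to check each pilot holdout value against its ship target mechanically. The sketch below is only an illustration: the numbers are copied from the table above, while the dictionary keys and the helper names (meets_target, failing_metrics) are hypothetical and not part of any LendWise system.

```python
# Pilot holdout values and ship targets, copied from the table above.
# Each entry is (pilot_value, target, direction):
#   "min" means the metric must be >= target,
#   "max" means the metric must be <= target.
# The key names are illustrative assumptions, not names from the source.
METRICS = {
    "accuracy": (0.89, 0.88, "min"),
    "precision": (0.90, 0.92, "min"),
    "recall": (0.72, 0.75, "min"),
    "f1": (0.80, 0.83, "min"),
    "auc_roc": (0.90, 0.91, "min"),
    "log_loss": (0.29, 0.25, "max"),
    "calibration_error": (0.08, 0.05, "max"),
    "manual_review_rate": (0.27, 0.20, "max"),
    "default_rate_90d": (0.048, 0.040, "max"),
}


def meets_target(value, target, direction):
    """Return True if value satisfies the target in the given direction."""
    if direction == "min":
        return value >= target
    return value <= target


def failing_metrics(metrics):
    """Return the names of metrics whose pilot value misses the ship target."""
    return [
        name
        for name, (value, target, direction) in metrics.items()
        if not meets_target(value, target, direction)
    ]


failed = failing_metrics(METRICS)
print(f"{len(failed)} of {len(METRICS)} metrics miss target: {failed}")
```

Under these inputs, accuracy is the only metric that clears its target on the pilot holdout, which is one concrete way to see why accuracy alone is a weak ship criterion here.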

The Problem

The pilot suggests the model performs worse on recent applications than offline validation indicated, especially on recall, calibration, and downstream default rate. Leadership wants a recommendation on whether to ship the model now, ship it behind guardrails, or hold it for improvement.

Requirements

Assess whether the model is ready to ship using the metrics above.
Explain which metrics matter most for this decision and why accuracy alone is insufficient.
Diagnose the likely causes of the validation-to-pilot gap.
Recommend threshold, calibration, and validation changes before launch.
Propose a post-launch monitoring plan with clear rollback criteria.

Constraints

False approvals are costly: average loss per defaulted loan is $4,200.
False rejections reduce revenue and create fairness concerns.
Underwriting can manually review at most 20% of applications.
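
The cost of false approvals can be made concrete with a rough expected-loss calculation. This is a minimal sketch under stated assumptions: every default is taken to cost the average $4,200, the 90-day default rate is treated as the full default rate, and the volume of 1,000 approved loans is a hypothetical figure chosen only for illustration.

```python
AVG_LOSS_PER_DEFAULT = 4200      # dollars, from the constraints above
PILOT_DEFAULT_RATE = 0.048       # pilot holdout, from the metrics table
TARGET_DEFAULT_RATE = 0.040      # ship target, from the metrics table
APPROVED_LOANS = 1000            # hypothetical volume, not from the source


def expected_default_loss(default_rate, approved_loans, loss_per_default):
    """Expected dollar loss from defaults among approved loans."""
    return default_rate * approved_loans * loss_per_default


pilot_loss = expected_default_loss(
    PILOT_DEFAULT_RATE, APPROVED_LOANS, AVG_LOSS_PER_DEFAULT
)
target_loss = expected_default_loss(
    TARGET_DEFAULT_RATE, APPROVED_LOANS, AVG_LOSS_PER_DEFAULT
)
excess_loss = pilot_loss - target_loss

print(f"Expected loss at pilot rate:  ${pilot_loss:,.0f}")
print(f"Expected loss at target rate: ${target_loss:,.0f}")
print(f"Excess over policy limit:     ${excess_loss:,.0f}")
```

With these assumptions, the pilot rate implies about 48 defaults per 1,000 approved loans against 40 at the target, so roughly $33,600 of additional expected loss per 1,000 approvals. The same function can be rerun for any candidate threshold once its default rate is measured.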

