Debug Training to Production Gap

Scenario

You've shipped a model that looked strong in training and validation, but it is underperforming in production. The team wants you to figure out why the offline results did not hold up after deployment.

Question

How would you troubleshoot a model that appears to work in training but fails in production?

Problem

Scenario

Question

How would you troubleshoot a model that appears to work in training but fails in production?

Observed Signals

ECE·0.11Production Recall·0.51Production AUC-ROC·0.74Validation AUC-ROC·0.91Production Precision·0.67Features with PSI > 0.2·4 of 22

Problem

Scenario

Question

How would you troubleshoot a model that appears to work in training but fails in production?

Observed Signals

ECE·0.11Production Recall·0.51Production AUC-ROC·0.74Validation AUC-ROC·0.91Production Precision·0.67Features with PSI > 0.2·4 of 22

Problem

Scenario

Question

How would you troubleshoot a model that appears to work in training but fails in production?

Observed Signals

ECE·0.11Production Recall·0.51Production AUC-ROC·0.74Validation AUC-ROC·0.91Production Precision·0.67Features with PSI > 0.2·4 of 22

Interview Guides

Problem

Scenario

Question

Observed Signals

Problem

Scenario

Question

Observed Signals

Debug Training to Production Gap

Problem

Scenario

Question

Observed Signals

Problem

Scenario

Question

Observed Signals