Version Data and Models Reliably

Context

ShopLens runs a purchase propensity model that scores 4.2M users daily for email and in-app promotions. Over the last quarter, offline validation remained stable, but online campaign performance became inconsistent after multiple untracked data refreshes and model replacements.

The team currently stores model artifacts in object storage with ad hoc names and overwrites feature tables in place. Leadership wants a production-ready versioning strategy for both data and models that preserves reproducibility, supports rollback, and makes metric changes explainable.

Current Performance

Metric	Model v17 + Data snapshot 2024-05	Model v18 + overwritten data 2024-06	Change
AUC-ROC	0.81	0.80	-0.01
Log Loss	0.46	0.51	+0.05
Precision @ top 10%	0.29	0.24	-0.05
Recall @ top 10%	0.41	0.36	-0.05
Calibration error	0.03	0.09	+0.06
Campaign conversion rate	3.8%	3.1%	-0.7 pp

The Problem

The team cannot determine whether the drop came from model changes, training data changes, feature definition drift, or threshold updates. There is no reliable mapping between a prediction in production and the exact training dataset, feature code, hyperparameters, or evaluation report used to create it.
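To make the missing mapping concrete, here is a minimal sketch of a lineage record that ties one model run to the inputs named above. The class `RunManifest`, its field names, and the example values are hypothetical illustrations, not part of ShopLens's existing system; the only assumption is that each input can be identified by a stable string.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RunManifest:
    """Hypothetical record linking a model run to everything that produced it."""
    model_version: str
    training_dataset_id: str
    feature_code_commit: str
    hyperparameters: dict
    threshold: float
    evaluation_report_uri: str

    def fingerprint(self) -> str:
        # Serialize with sorted keys so the same content always hashes the same way.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Example values are placeholders for illustration only.
manifest = RunManifest(
    model_version="v18",
    training_dataset_id="train-snapshot-example",
    feature_code_commit="abc1234",
    hyperparameters={"max_depth": 6, "learning_rate": 0.1},
    threshold=0.5,
    evaluation_report_uri="reports/example.json",
)
run_id = manifest.fingerprint()
```

Stamping `run_id` on every prediction batch would let a production score be traced back to a single manifest, and any change to the data, code, hyperparameters, or threshold would produce a different identifier.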

Requirements

Diagnose what the metrics suggest about the likely failure modes.
Design a versioning approach for raw data, curated features, training datasets, model artifacts, and thresholds.
Specify what metadata must be logged to reproduce any model run and any production prediction batch.
Recommend validation checks before promotion and rollback criteria after deployment.
Explain how your design would support auditability, comparison across versions, and root-cause analysis.
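One possible building block for the versioning requirement is a content-derived version identifier, so that a feature table that is overwritten in place can no longer keep the same version by accident. The sketch below is an assumption-laden illustration: the function name `dataset_version` is hypothetical, rows are assumed to fit in memory as tuples with comparable values, and a real pipeline would more likely hash files or partitions.

```python
import hashlib


def dataset_version(rows: list[tuple]) -> str:
    """Return a short version ID derived only from the dataset's content.

    Rows are sorted first so that row order does not affect the ID.
    """
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    # Truncated for readability; keep the full digest if collisions are a concern.
    return digest.hexdigest()[:16]


# Illustrative rows: (user_id, feature_value). Values are made up.
snapshot_a = [(1, 0.25), (2, 0.75)]
snapshot_b = [(2, 0.75), (1, 0.25)]   # same content, different order
snapshot_c = [(1, 0.25), (2, 0.80)]   # one value changed

version_a = dataset_version(snapshot_a)
version_b = dataset_version(snapshot_b)
version_c = dataset_version(snapshot_c)
```

Here `version_a` and `version_b` match because the content is identical, while `version_c` differs because one feature value changed.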

Constraints

Retraining happens weekly and must finish within 6 hours.
Marketing can tolerate at most a 5% relative drop in conversion for one week.
Raw event data cannot be modified after ingestion, but feature tables are currently mutable.
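The conversion tolerance can be checked with simple arithmetic. The sketch below computes the relative drop from the table's conversion rates (3.8% to 3.1%) and compares it with the 5% limit. The function names are hypothetical, and the check ignores the one-week duration, which a real rollback rule would also need to track.

```python
def relative_drop(baseline: float, current: float) -> float:
    """Fractional drop of `current` relative to `baseline` (positive means worse)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline


def should_roll_back(baseline: float, current: float,
                     max_relative_drop: float = 0.05) -> bool:
    """True when the relative drop exceeds the tolerated limit."""
    return relative_drop(baseline, current) > max_relative_drop


# Conversion rates from the Current Performance table, in percent.
drop = relative_drop(3.8, 3.1)          # 0.7 / 3.8, roughly 0.184
breach = should_roll_back(3.8, 3.1)     # True: about 18.4% exceeds the 5% limit
```

Under these figures, the 0.7 pp decline corresponds to a relative drop of roughly 18.4%, well past the 5% tolerance.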
