Context
AcmeRisk, a fintech company, trains fraud detection models weekly but deploys them manually through ad hoc scripts and ticket-based handoffs. The result is inconsistent model artifacts, missing lineage, and slow rollbacks when a bad model reaches production.
You need to design a production-grade data pipeline that moves trained models from the ML training environment into the broader software ecosystem: batch scoring jobs, a low-latency online inference service, monitoring tables, and downstream analytics. The company already uses AWS, Airflow, S3, Docker, Kubernetes, and Snowflake.
Scale Requirements
- Training output: 20 model candidates/day, each artifact 200 MB–1.5 GB
- Online traffic: 8K predictions/sec average, 25K/sec peak
- Batch scoring: 120M transactions/day, SLA < 2 hours (see the sizing sketch after this list)
- Deployment latency: approved model available for online serving in < 15 minutes
- Retention: model artifacts and metadata retained for 1 year
- Availability target: 99.9% for inference endpoints
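A quick back-of-the-envelope pass over these numbers pins down most of the capacity math. The per-pod QPS figure and the 30% headroom factor below are assumptions to be replaced with load-tested values, not measurements:

```python
# Back-of-the-envelope sizing from the scale requirements above.
# ASSUMPTION: ~1,000 predictions/sec per inference pod and 30% headroom;
# load-test the real model before committing to a node count.

BATCH_TXNS_PER_DAY = 120_000_000
BATCH_SLA_SECONDS = 2 * 3600
sustained_batch_tps = BATCH_TXNS_PER_DAY / BATCH_SLA_SECONDS
print(f"Batch scoring must sustain ~{sustained_batch_tps:,.0f} txn/sec")  # ~16,667

PEAK_ONLINE_QPS = 25_000
ASSUMED_QPS_PER_POD = 1_000        # assumption -- confirm via load test
HEADROOM = 1.3                     # assumption -- buffer for failover/spikes
peak_pods = -(-int(PEAK_ONLINE_QPS * HEADROOM) // ASSUMED_QPS_PER_POD)  # ceil
print(f"Online serving needs ~{peak_pods} pods at peak (with headroom)")  # ~33

# Worst-case artifact storage over the 1-year retention window.
CANDIDATES_PER_DAY = 20
MAX_ARTIFACT_GB = 1.5
retention_tb = CANDIDATES_PER_DAY * MAX_ARTIFACT_GB * 365 / 1024
print(f"Upper bound on artifact storage: ~{retention_tb:.1f} TB")  # ~10.7 TB
```

Note that the required batch rate (~16.7K txn/sec sustained) is well above even peak online traffic, which argues for set-based scoring inside Snowflake or a parallel batch job rather than routing batch traffic through the online endpoint.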
Requirements
- Design an ETL/ELT-style deployment pipeline that ingests model artifacts, validation reports, and metadata from training jobs (see the DAG sketch after this list).
- Register versioned models and promote them through staging and production with reproducible lineage.
- Support both online serving on Kubernetes and batch scoring pipelines writing results to Snowflake.
- Include automated validation gates: schema checks, feature compatibility, performance thresholds, and canary deployment checks.
- Ensure idempotent deployments, rollback support, and auditability for compliance reviews (see the deploy/rollback sketch after this list).
- Expose deployment status and model health to engineering and data teams.
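To make the ingestion, promotion, and validation-gate requirements concrete, here is a minimal Airflow 2 TaskFlow sketch of the deployment DAG. Every task body, bucket name, version string, and threshold (acmerisk-models, AUC >= 0.92, the 5% canary slice) is an illustrative assumption, not existing AcmeRisk infrastructure:

```python
# deploy_model_dag.py -- skeleton of the model deployment pipeline.
# Names, buckets, and thresholds below are illustrative assumptions.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False,
     tags=["model-deployment"])
def deploy_model():

    @task
    def ingest_artifacts(run_id: str) -> dict:
        # Pull the model artifact, validation report, and metadata the
        # training job dropped in S3; verify checksums for integrity.
        return {"model_uri": f"s3://acmerisk-models/{run_id}/model.tar.gz",
                "report_uri": f"s3://acmerisk-models/{run_id}/report.json"}

    @task
    def validation_gates(candidate: dict) -> dict:
        # Gate 1: output schema matches the serving contract.
        # Gate 2: feature names/dtypes match the dbt-managed definitions.
        # Gate 3: offline metrics clear thresholds (e.g. AUC >= 0.92 -- assumed).
        # Raising here fails the run; Airflow keeps the failure for audit.
        return candidate

    @task
    def register_version(candidate: dict) -> str:
        # Write an immutable, versioned registry record (artifact URI,
        # git SHA, training-data snapshot, validation report) for lineage.
        return "fraud-model:v2024.01.15-a3f9"  # hypothetical version id

    @task
    def canary_deploy(version: str) -> str:
        # Route a small slice of live traffic (e.g. 5% -- assumed) to the
        # new version; compare error rate and score distribution to baseline.
        return version

    @task
    def promote_to_production(version: str) -> None:
        # Flip the production alias to the canary-verified version and
        # emit a deployment event to the monitoring tables.
        ...

    promote_to_production(canary_deploy(register_version(
        validation_gates(ingest_artifacts("{{ run_id }}")))))

deploy_model()
```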
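For the Kubernetes serving side, one way to get both idempotency and fast rollback is to treat a deployment as "point the Deployment's image at an immutable version tag": re-applying the same version is a no-op, and rollback is just re-pointing at the previous tag. A sketch using the official kubernetes Python client; the Deployment name, namespace, container name, and ECR path are assumptions:

```python
# Idempotent deploy / rollback for the online inference service.
# ASSUMPTIONS: a Deployment named "fraud-scorer" (container of the same
# name) in namespace "ml-serving"; images tagged with immutable versions.
from kubernetes import client, config

DEPLOYMENT = "fraud-scorer"
NAMESPACE = "ml-serving"
IMAGE_REPO = "123456789012.dkr.ecr.us-east-1.amazonaws.com/fraud-scorer"

def deploy_version(version: str) -> str:
    """Point the serving Deployment at `version`. Safe to re-run: if the
    Deployment already serves this version, nothing changes. Returns the
    previously deployed version so the caller can write it to the
    append-only audit log and use it for rollback."""
    config.load_kube_config()  # or load_incluster_config() in-cluster
    apps = client.AppsV1Api()
    dep = apps.read_namespaced_deployment(DEPLOYMENT, NAMESPACE)
    current = dep.spec.template.spec.containers[0].image
    target = f"{IMAGE_REPO}:{version}"
    if current == target:
        return version  # idempotent no-op: already serving this version
    patch = {"spec": {"template": {"spec": {"containers": [
        {"name": DEPLOYMENT, "image": target}]}}}}
    apps.patch_namespaced_deployment(DEPLOYMENT, NAMESPACE, patch)
    return current.rsplit(":", 1)[-1]  # previous tag, kept for rollback

def rollback(previous_version: str) -> None:
    """Rollback is just another idempotent deploy of the prior version."""
    deploy_version(previous_version)
```

Because tags are immutable model versions, the returned previous tag plus the Deployment's rollout history gives a replayable audit trail, and the <15-minute availability target is bounded by image pull and pod readiness rather than human handoffs.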
Constraints
- AWS-first environment; no multi-cloud design
- Small platform team: 3 data engineers, 2 ML engineers
- Monthly incremental infrastructure budget: $18K
- Must satisfy SOC 2 audit requirements with immutable deployment logs
- Feature definitions are maintained separately in a dbt/Snowflake analytics stack, so training-serving skew must be detected automatically (a detection sketch follows).
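Since feature definitions live in dbt/Snowflake while the model carries its own view of the features from training time, skew detection can compare a feature manifest frozen with the model artifact against a recent sample from the serving path. A minimal sketch, assuming the training job stores per-feature dtypes and a reference histogram; the PSI threshold of 0.2 is a common rule of thumb, not a tuned value:

```python
# Training-serving skew check: compare the feature manifest frozen with
# the model artifact against a recent sample from the serving path.
# ASSUMPTIONS: per-feature dtype + binned reference histogram are stored
# at training time; 0.2 is a conventional PSI alert threshold.
import math

def psi(expected: list[float], actual: list[float], eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions."""
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

def check_skew(train_manifest: dict, serving_stats: dict,
               threshold: float = 0.2) -> list[str]:
    """Return a list of skew findings; an empty list means no skew."""
    findings = []
    # Structural skew: features added, dropped, or retyped since training.
    for name in sorted(train_manifest.keys() - serving_stats.keys()):
        findings.append(f"{name}: present at training, missing in serving")
    for name in sorted(serving_stats.keys() - train_manifest.keys()):
        findings.append(f"{name}: computed in serving, unseen at training")
    # Distributional skew on the shared features.
    for name in train_manifest.keys() & serving_stats.keys():
        t, s = train_manifest[name], serving_stats[name]
        if t["dtype"] != s["dtype"]:
            findings.append(f"{name}: dtype {t['dtype']} -> {s['dtype']}")
        elif (score := psi(t["hist"], s["hist"])) > threshold:
            findings.append(f"{name}: PSI {score:.2f} exceeds {threshold}")
    return findings

# Example with hypothetical data: one drifted feature, one retyped.
train = {"txn_amount": {"dtype": "float", "hist": [0.5, 0.3, 0.2]},
         "merchant_risk": {"dtype": "float", "hist": [0.6, 0.3, 0.1]}}
serving = {"txn_amount": {"dtype": "float", "hist": [0.2, 0.3, 0.5]},
           "merchant_risk": {"dtype": "str", "hist": [0.6, 0.3, 0.1]}}
print(check_skew(train, serving))
```

Run as a scheduled Airflow task over a daily sample of logged serving features, a check like this closes the loop without coupling the dbt stack to the training code.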