Design Privacy-Constrained Multi-Tenant Recommender

Product Context

AtlasRec provides a recommendations platform for enterprise customers. Each tenant is a separate retailer or media app that wants personalized recommendations on its website and mobile app. Tenant data must not leak across customers, and some tenants require all data storage, training, and serving to stay within a specific region.

Scale

Signal	Value
Tenants	1,200 total; 150 large, 1,050 SMB
Total DAU	90M across all tenants
Peak recommendation QPS	220K global
Largest tenant peak QPS	18K
Catalog size	600M items total; largest tenant 80M
New interaction events/day	9B
Per-request latency budget (p99)	180ms
Residency regions	US, EU, India, Singapore
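
As a rough aid for reading the table, the figures can be turned into a few derived rates. This is a back-of-envelope sketch only: the variable names are illustrative, the event rate is a daily average (the table gives no peak ingest figure), and the "mean QPS per other tenant" line assumes an even spread that real tenants will not follow, since tenants also peak at different times.

```python
# Back-of-envelope rates derived from the scale table.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

events_per_day = 9_000_000_000
peak_qps_global = 220_000
largest_tenant_peak_qps = 18_000
total_tenants = 1_200
total_dau = 90_000_000

# Average event ingest rate; actual peak ingest will be higher.
avg_events_per_sec = events_per_day / SECONDS_PER_DAY

# Share of the global peak QPS attributable to the largest tenant.
largest_tenant_share = largest_tenant_peak_qps / peak_qps_global

# Average interaction events per daily active user.
events_per_dau = events_per_day / total_dau

# Illustrative only: mean peak QPS per remaining tenant under an even split.
mean_qps_other_tenants = (peak_qps_global - largest_tenant_peak_qps) / (total_tenants - 1)

print(f"avg events/sec:           {avg_events_per_sec:,.0f}")
print(f"largest tenant QPS share: {largest_tenant_share:.1%}")
print(f"events per DAU per day:   {events_per_dau:.0f}")
print(f"mean QPS, other tenants:  {mean_qps_other_tenants:.0f}")
```

The averages suggest that one tenant accounts for roughly 8% of peak traffic while the typical tenant sits orders of magnitude lower, which is the tension behind the cost constraint on small tenants.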

Task

Design an end-to-end multi-tenant recommendations service that satisfies strict privacy and data residency constraints. Address the following:

Clarify product requirements, tenant isolation guarantees, and what level of cross-tenant sharing is allowed, if any.
Propose a multi-stage recommendation architecture (retrieval → ranking → re-ranking) that works for both large tenants with rich data and small tenants with sparse data.
Design the offline and online data/feature architecture, including how training, feature computation, and model serving respect regional residency rules.
Explain how you would handle cold start, model updates, and feature freshness without introducing training-serving skew.
Define offline and online evaluation, including tenant-level metrics, privacy/compliance guardrails, and rollout strategy.
Identify major failure modes such as data leakage, stale regional models, feature drift, and regional outages, with detection and mitigation.

Constraints

No raw user-level data may move across tenant boundaries.
Some tenants prohibit even model parameter sharing across regions; assume the strictest tenants require fully region-local training and serving.
p99 latency must stay under 180ms, including policy filtering.
Cost matters: the platform must support many small tenants without dedicating a full GPU fleet per tenant.
Auditable compliance is required: every feature, model, and request path must be attributable to a tenant and region.
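
One way to make the isolation, residency, and auditability constraints concrete is a small policy check that every feature, model, or request path passes through. The sketch below is a hypothetical illustration, not a prescribed design: `TenantPolicy`, `Artifact`, `check_access`, and `audit_record` are invented names, and the treatment of non-strict tenants (no region restriction) is an assumption the prompt leaves open.

```python
from dataclasses import dataclass
from typing import Tuple

# Residency regions listed in the scale table.
RESIDENCY_REGIONS = frozenset({"US", "EU", "India", "Singapore"})


@dataclass(frozen=True)
class TenantPolicy:
    tenant_id: str
    home_region: str
    region_local_only: bool  # True for the strictest tenants (assumed flag)


@dataclass(frozen=True)
class Artifact:
    kind: str       # e.g. "feature", "model", or "request"
    tenant_id: str  # owning tenant
    region: str     # region where the artifact was produced or stored


def check_access(policy: TenantPolicy, artifact: Artifact,
                 serving_region: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for using an artifact in a serving region."""
    # Tenant isolation: an artifact may only be used by its owning tenant.
    if artifact.tenant_id != policy.tenant_id:
        return False, "cross-tenant access"
    # Reject anything not attributable to a known region.
    if artifact.region not in RESIDENCY_REGIONS or serving_region not in RESIDENCY_REGIONS:
        return False, "unknown region"
    # Strict tenants: artifact and serving must both stay in the home region.
    if policy.region_local_only and not (
        artifact.region == serving_region == policy.home_region
    ):
        return False, "residency violation"
    # Assumption: non-strict tenants carry no region restriction in this sketch.
    return True, "ok"


def audit_record(policy: TenantPolicy, artifact: Artifact,
                 serving_region: str) -> dict:
    """Build an audit entry attributing the decision to a tenant and region."""
    allowed, reason = check_access(policy, artifact, serving_region)
    return {
        "tenant_id": policy.tenant_id,
        "artifact_kind": artifact.kind,
        "artifact_region": artifact.region,
        "serving_region": serving_region,
        "allowed": allowed,
        "reason": reason,
    }


strict = TenantPolicy(tenant_id="tenant-a", home_region="EU", region_local_only=True)
eu_model = Artifact(kind="model", tenant_id="tenant-a", region="EU")
other_model = Artifact(kind="model", tenant_id="tenant-b", region="EU")

print(audit_record(strict, eu_model, "EU"))     # same tenant, home region
print(audit_record(strict, eu_model, "US"))     # strict tenant served outside home region
print(audit_record(strict, other_model, "EU"))  # artifact owned by another tenant
```

Because the check is a pure in-memory comparison, it adds negligible time to the 180ms budget; the real cost in a production system would sit in looking up the policy and writing the audit entry.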
