Product Context
ShopStream is a large e-commerce app whose home-feed and product-detail recommendations are currently served by a single monolithic service. The company wants to decompose it into microservices while preserving recommendation quality, latency, and operational reliability.
Scale
| Signal | Value |
|---|---|
| DAU | 35M |
| Peak recommendation QPS | 180K |
| Product catalog | 120M active SKUs |
| New/updated items per day | 4M |
| Avg candidates scored per request | 2K |
| End-to-end p99 latency budget | 180ms |
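For intuition, the figures above imply a rough scoring load and index footprint. The back-of-envelope below is a sketch only; the 128-dimensional float32 embedding size is an assumption for illustration, not part of the spec:

```python
# Back-of-envelope estimates derived from the Scale table.
# The embedding dimension is an assumption, not a given.

peak_qps = 180_000
candidates_per_request = 2_000
catalog_size = 120_000_000
embedding_dim = 128          # assumed for illustration
bytes_per_float = 4

# Candidate-scoring throughput the ranking tier must sustain at peak.
scores_per_second = peak_qps * candidates_per_request
print(f"peak scoring rate: {scores_per_second:,} candidates/sec")  # 360,000,000

# Approximate memory to hold one embedding per active SKU.
index_bytes = catalog_size * embedding_dim * bytes_per_float
print(f"embedding index size: {index_bytes / 1e9:.2f} GB")  # 61.44 GB
```

Numbers like these help size the ANN index tier and decide whether the full catalog index fits in memory per replica or must be sharded.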
Task
Design the target ML system and service boundaries for a personalized recommendation stack after decomposing the monolith. Your design should address both system architecture and ML lifecycle concerns.
- Define the microservices you would create and the APIs or communication patterns between them.
- Propose an end-to-end recommendation architecture, including candidate generation, ranking, and re-ranking.
- Explain what should run online vs batch, and how features, models, and indexes are produced and served.
- Describe how you would evaluate the system offline and online during migration from the monolith.
- Identify key failure modes introduced by service decomposition, including feature drift and training-serving skew, and how you would monitor and mitigate them.
- Discuss migration strategy, fallbacks, and how to keep the system available if one downstream service is slow or unavailable.
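As a starting point for the pipeline and fallback discussion, the retrieve → rank → re-rank flow with a non-personalized fallback could be sketched as follows. All service names, the timeout budget, and the popularity fallback are illustrative assumptions, not a prescribed design; in production each stub would be an RPC client to a separate deployment:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stubs for the decomposed services (hypothetical names).
def generate_candidates(user_id: str) -> list[str]:
    return [f"sku-{i}" for i in range(5)]      # ~2K candidates in production

def rank(user_id: str, candidates: list[str]) -> list[str]:
    return sorted(candidates)                  # model-scored in production

def rerank(user_id: str, ranked: list[str]) -> list[str]:
    return ranked                              # business rules, diversity, freshness

def popular_items() -> list[str]:
    return ["sku-pop-1", "sku-pop-2"]          # precomputed, non-personalized

def recommend(user_id: str, budget_s: float = 0.15) -> list[str]:
    """Orchestrate the pipeline; degrade to popularity on timeout or error."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(
            lambda: rerank(user_id, rank(user_id, generate_candidates(user_id)))
        )
        try:
            return fut.result(timeout=budget_s)
        except Exception:
            return popular_items()
```

The key design point the sketch illustrates: the orchestrator owns the latency budget, so a slow ranking service costs at most `budget_s` before the fallback path returns something renderable.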
Constraints
- 25% of traffic comes from anonymous or low-history users.
- Product freshness matters: inventory, price, and promotions change within minutes.
- The business requires graceful degradation: if personalization fails, the app must still return recommendations.
- Infra cost cannot increase by more than 30% versus the monolith during the first migration phase.
- User-level training data must remain in-region for compliance, so some features/models may be region-specific.
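The in-region constraint implies per-region model and feature-store routing at serving time. A minimal sketch, where the region names, registry layout, and the idea of a global fallback trained only on non-user-level (item/context) data are all assumptions for discussion:

```python
# Hypothetical per-region serving registry; names are illustrative.
REGION_MODELS = {
    "eu": {"model": "ranker-eu-v12", "feature_store": "fs.eu.internal"},
    "us": {"model": "ranker-us-v12", "feature_store": "fs.us.internal"},
}
# Fallback model assumed to be trained without user-level data,
# so it can legally serve any region.
GLOBAL_FALLBACK = {"model": "ranker-global-v3", "feature_store": None}

def resolve_serving_config(region: str) -> dict:
    """Route to the in-region model; fall back to the global,
    non-user-level model when the region has no deployment."""
    return REGION_MODELS.get(region, GLOBAL_FALLBACK)
```

This keeps the compliance boundary explicit in one place and gives new regions a working (if less personalized) path on day one.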