Product Context
ShopStream is a large e-commerce app whose home-feed and product-detail recommendations are currently served by a single monolithic service. The company wants to decompose it into microservices while preserving recommendation quality, latency, and operational reliability.
Scale
| Signal | Value |
|---|---|
| DAU | 35M |
| Peak recommendation QPS | 180K |
| Product catalog | 120M active SKUs |
| New/updated items per day | 4M |
| Avg candidates scored per request | 2K |
| End-to-end p99 latency budget | 180ms |
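For intuition, the figures above imply a rough scoring load and index footprint. The back-of-envelope below is a sketch only; the 128-dimensional float32 embedding size is an assumption for illustration, not part of the spec:

```python
# Back-of-envelope estimates derived from the Scale table.
# The embedding dimension is an assumption, not a given.

peak_qps = 180_000
candidates_per_request = 2_000
catalog_size = 120_000_000
embedding_dim = 128          # assumed for illustration
bytes_per_float = 4

# Candidate-scoring throughput the ranking tier must sustain at peak.
scores_per_second = peak_qps * candidates_per_request
print(f"peak scoring rate: {scores_per_second:,} candidates/sec")  # 360,000,000

# Approximate memory to hold one embedding per active SKU.
index_bytes = catalog_size * embedding_dim * bytes_per_float
print(f"embedding index size: {index_bytes / 1e9:.2f} GB")  # 61.44 GB
```

Numbers like these help size the ANN index tier and decide whether the full catalog index fits in memory per replica or must be sharded.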
Task
Design the target ML system and service boundaries for a personalized recommendation stack after decomposing the monolith. Your design should address both system architecture and ML lifecycle concerns.
- Define the microservices you would create and the APIs or communication patterns between them.
- Propose an end-to-end recommendation architecture, including candidate generation, ranking, and re-ranking.
- Explain what should run online vs batch, and how features, models, and indexes are produced and served.
- Describe how you would evaluate the system offline and online during migration from the monolith.
- Identify key failure modes introduced by service decomposition, including feature drift and training-serving skew, and how you would monitor and mitigate them.
- Discuss migration strategy, fallbacks, and how to keep the system available if one downstream service is slow or unavailable.
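As a starting point for the pipeline and fallback discussion, the retrieve → rank → re-rank flow with a non-personalized fallback could be sketched as follows. All service names, the timeout budget, and the popularity fallback are illustrative assumptions, not a prescribed design; in production each stub would be an RPC client to a separate deployment:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stubs for the decomposed services (hypothetical names).
def generate_candidates(user_id: str) -> list[str]:
    return [f"sku-{i}" for i in range(5)]      # ~2K candidates in production

def rank(user_id: str, candidates: list[str]) -> list[str]:
    return sorted(candidates)                  # model-scored in production

def rerank(user_id: str, ranked: list[str]) -> list[str]:
    return ranked                              # business rules, diversity, freshness

def popular_items() -> list[str]:
    return ["sku-pop-1", "sku-pop-2"]          # precomputed, non-personalized

def recommend(user_id: str, budget_s: float = 0.15) -> list[str]:
    """Orchestrate the pipeline; degrade to popularity on timeout or error."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(
            lambda: rerank(user_id, rank(user_id, generate_candidates(user_id)))
        )
        try:
            return fut.result(timeout=budget_s)
        except Exception:
            return popular_items()
```

The key design point the sketch illustrates: the orchestrator owns the latency budget, so a slow ranking service costs at most `budget_s` before the fallback path returns something renderable.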
Constraints
- 25% of traffic comes from anonymous or low-history users.
- Product freshness matters: inventory, price, and promotions change within minutes.
- The business requires graceful degradation: if personalization fails, the app must still return recommendations.
- Infra cost cannot increase by more than 30% versus the monolith during the first migration phase.
- User-level training data must remain in-region for compliance, so some features/models may be region-specific.
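The in-region constraint implies per-region model and feature-store routing at serving time. A minimal sketch, where the region names, registry layout, and the idea of a global fallback trained only on non-user-level (item/context) data are all assumptions for discussion:

```python
# Hypothetical per-region serving registry; names are illustrative.
REGION_MODELS = {
    "eu": {"model": "ranker-eu-v12", "feature_store": "fs.eu.internal"},
    "us": {"model": "ranker-us-v12", "feature_store": "fs.us.internal"},
}
# Fallback model assumed to be trained without user-level data,
# so it can legally serve any region.
GLOBAL_FALLBACK = {"model": "ranker-global-v3", "feature_store": None}

def resolve_serving_config(region: str) -> dict:
    """Route to the in-region model; fall back to the global,
    non-user-level model when the region has no deployment."""
    return REGION_MODELS.get(region, GLOBAL_FALLBACK)
```

This keeps the compliance boundary explicit in one place and gives new regions a working (if less personalized) path on day one.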