You are leading the design of a personalized recommendations system for a large e-commerce marketplace. The system powers multiple surfaces such as the home page, product detail page, and post-search recommendations, and it is expected to improve conversion while keeping experiences relevant and fresh. You need to choose the architectural patterns you prefer across retrieval, ranking, and re-ranking, and explain why they fit this product and scale. The business expects the system to handle fast-changing inventory, seasonal demand shifts, and cold-start users and items without degrading customer experience.
| Signal | Value |
|---|---|
| DAU | 65M |
| Peak recommendation QPS | 220K |
| Active catalog | 180M products |
| New or updated products/day | 9M |
| Per-request latency budget (p99) | 180ms |
| Recommendation slots/request | 20 |
How would you design this end-to-end ML system, and which architectural patterns would you choose across data, modeling, and serving layers to balance relevance, freshness, cost, and operational reliability at this scale?