Product Context
ShopStream is a mid-size e-commerce marketplace that wants a personalized product recommendation engine for its home page and product detail pages. The company has basic event logs and catalog data, but the data pipeline is inconsistent across teams and there is no trusted feature store.
Scale
| Signal | Value |
|---|---|
| DAU | 18M |
| Peak recommendation QPS | 45K |
| Product catalog | 35M active SKUs |
| New/updated items per day | 1.2M |
| Average recommendations per request | 20 |
| End-to-end p99 latency budget | 180 ms |
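
To make the scale figures easier to reason about, the following is a rough back-of-envelope sketch that derives a few rates from the table. The constant and function names are illustrative rather than part of the problem statement, the update rate is a daily average that assumes updates arrive uniformly, and the cost ceiling is taken from the Constraints section.

```python
# Back-of-envelope numbers derived from the Scale table.
# All names here are illustrative; only the values come from the document.
PEAK_QPS = 45_000                     # peak recommendation requests per second
RECS_PER_REQUEST = 20                 # average recommendations per request
UPDATES_PER_DAY = 1_200_000           # new/updated items per day
CATALOG_SKUS = 35_000_000             # active SKUs
COST_CEILING_PER_REQUEST_USD = 0.001  # from the Constraints section

SECONDS_PER_DAY = 24 * 60 * 60


def scale_summary() -> dict:
    """Derive per-second rates; assumes uniform arrival for the daily average."""
    return {
        # Items that must come out of the final ranking stage per second at peak.
        "items_returned_per_sec_at_peak": PEAK_QPS * RECS_PER_REQUEST,
        # Average catalog write rate the freshness path must absorb.
        "avg_catalog_updates_per_sec": UPDATES_PER_DAY / SECONDS_PER_DAY,
        # Share of the catalog that changes on a typical day.
        "daily_catalog_churn_fraction": UPDATES_PER_DAY / CATALOG_SKUS,
        # Upper bound on serving spend per second while at peak QPS.
        "peak_spend_ceiling_usd_per_sec": PEAK_QPS * COST_CEILING_PER_REQUEST_USD,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value:,.3f}")
```

The number of candidates scored per request is not given, so the sketch counts only the items returned; the ranking stage will score some multiple of that, which is a design choice left to the answer.
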
Task
Design an end-to-end recommendation system, but start from the reality that the client does not have a clean data pipeline today.
- Clarify the minimum product requirements, success metrics, and what you would build first versus defer.
- Propose how you would establish a reliable data foundation before or alongside model development, including logging, feature definitions, backfills, and data quality checks.
- Design the online recommendation architecture from candidate generation to ranking to optional re-ranking, with realistic latency and throughput assumptions.
- Choose models for each stage and explain why they fit the available data maturity, cold-start constraints, and catalog scale.
- Define offline and online evaluation, including how you would deal with delayed labels, feature drift, and training-serving skew.
- Identify major failure modes and how you would monitor and mitigate them in production.
Constraints
- Existing click and purchase logs have missing fields, duplicate events, and inconsistent user IDs across web and mobile.
- The business wants an MVP in 12 weeks, so the first version cannot depend on a perfect long-term data platform.
- User-level features must respect privacy policy; raw PII cannot be used directly in training or online serving.
- Fresh inventory matters: newly added products should become eligible for recommendation within 30 minutes.
- Serving cost must stay below roughly $0.001 per recommendation request, so expensive per-request deep models are discouraged for the first launch.
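
The first and third constraints can be illustrated with a minimal cleaning pass over raw events, sketched below. It drops records with missing fields, removes duplicates, maps web and mobile IDs to one canonical user, and replaces that ID with a keyed hash. The field names (`event_id`, `user_id`, `event_type`, `sku`, `ts`), the `id_map` lookup, and the choice of HMAC-SHA256 are all assumptions made for illustration; the source does not specify a log schema or a pseudonymization method, and whether a keyed hash satisfies the privacy policy would need to be confirmed.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Iterator

# Hypothetical schema: the real logs may use different field names.
REQUIRED_FIELDS = ("event_id", "user_id", "event_type", "sku", "ts")


def pseudonymize(user_id: str, secret: bytes) -> str:
    """Replace an ID with a keyed SHA-256 digest so raw IDs never leave this step."""
    return hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def clean_events(
    events: Iterable[dict],
    id_map: Dict[str, str],
    secret: bytes,
) -> Iterator[dict]:
    """Yield validated, de-duplicated events with a pseudonymous canonical user ID."""
    seen_event_ids = set()  # in-memory for the sketch; a real job would window this
    for ev in events:
        # Drop records where any required field is absent or empty.
        if any(ev.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Drop exact re-deliveries of the same event.
        if ev["event_id"] in seen_event_ids:
            continue
        seen_event_ids.add(ev["event_id"])
        # Map platform-specific IDs to one canonical ID; fall back to the raw ID.
        canonical = id_map.get(ev["user_id"], ev["user_id"])
        yield {**ev, "user_id": pseudonymize(canonical, secret)}
```

For example, given two events with user IDs `web-42` and `ios-9` and an `id_map` that sends both to the same canonical ID, the cleaned events carry an identical 64-character hex `user_id`. Counting how many records each rule drops would be a natural starting point for the data quality checks the task asks for.
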