Product Context
ShopNow is a large e-commerce marketplace. The team has proposed a single inference API that takes a user ID, fetches all candidate products, computes features inline, scores them with one model, and returns a ranked list for the home feed and search suggestions.
Proposed API Design to Review
POST /rank
- Input: user_id, page_context, optional query_text, num_results
- Service behavior: fetch up to 200,000 products matching coarse filters, join user and item features at request time, run one deep model over all candidates, apply business rules, return top 50
- Current assumption: one service owns retrieval, ranking, filtering, feature lookup, and logging
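To make the proposed contract concrete, the request might look roughly like the sketch below. The class name `RankRequest`, the field types, the example `page_context` values, and the cap of 50 on `num_results` are assumptions for illustration; the proposal itself only lists the field names and says the service returns the top 50.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RankRequest:
    """Hypothetical request body for POST /rank; field types are assumed."""

    user_id: str
    page_context: str  # assumed values such as "home_feed" or "search_suggestions"
    num_results: int
    query_text: Optional[str] = None  # optional, per the proposal

    def validate(self) -> None:
        # Assumption: the "top 50" in the proposal acts as an upper bound.
        if not 1 <= self.num_results <= 50:
            raise ValueError("num_results must be between 1 and 50")


request = RankRequest(user_id="u-123", page_context="home_feed", num_results=50)
request.validate()  # raises ValueError only if num_results is out of range
```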
Scale
| Signal | Value |
|---|---|
| DAU | 45M |
| Peak QPS | 120K requests/sec |
| Active catalog | 180M products |
| New/updated products per day | 9M |
| User feature freshness target | < 5 min |
| End-to-end p99 latency budget | 150 ms |
| Availability target | 99.95% |
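A quick back-of-envelope pass over these numbers, combined with the 200,000-candidate ceiling from the proposed API, gives a sense of the load involved. This is a rough sketch: it assumes every peak request scores the full 200,000 candidates, that one request scores them sequentially using the whole p99 budget, a 30-day month, and a uniform spread of catalog updates across the day.

```python
PEAK_QPS = 120_000
MAX_CANDIDATES = 200_000          # per-request ceiling from the proposed API
P99_BUDGET_MS = 150
DAILY_CATALOG_UPDATES = 9_000_000
UNAVAILABILITY_BP = 5             # 99.95% availability leaves 0.05% = 5 basis points

# Model evaluations per second if every peak request scores every candidate.
scorings_per_sec = PEAK_QPS * MAX_CANDIDATES

# Time available per candidate if one request spent its entire budget scoring serially.
ns_per_candidate = P99_BUDGET_MS * 1_000_000 / MAX_CANDIDATES

# Average catalog write rate, assuming updates are spread evenly over 24 hours.
updates_per_sec = DAILY_CATALOG_UPDATES / 86_400

# Allowed downtime over an assumed 30-day month.
downtime_min_per_30d = 30 * 24 * 60 * UNAVAILABILITY_BP / 10_000

print(f"{scorings_per_sec:,} scorings/sec")         # 24,000,000,000 scorings/sec
print(f"{ns_per_candidate:.0f} ns per candidate")   # 750 ns per candidate
print(f"{updates_per_sec:.1f} updates/sec")         # 104.2 updates/sec
print(f"{downtime_min_per_30d:.1f} min per 30 days")  # 21.6 min per 30 days
```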
Task
You are reviewing this API design. Explain where you expect it to break, then propose a production-ready ML system design.
- Identify the main bottlenecks and failure points in the proposed single-service API.
- Redesign the system as an end-to-end multi-stage architecture, including retrieval, ranking, and any re-ranking or policy layers.
- Define the online and offline data paths: feature computation, training data generation, model training cadence, and logging.
- Specify serving architecture choices, latency budget allocation, fallback behavior, and capacity planning.
- Describe how you would evaluate the system offline and online, and how you would monitor drift, skew, and regressions.
- Call out the top operational risks, including feature drift, training-serving skew, stale features, and partial outages.
Constraints
- Query-time joins against the full catalog are too expensive at peak.
- Some features are only available in batch; others must be near-real-time.
- The system must support both signed-in and anonymous users.
- Product availability and price can change within minutes; results must never show stale values for either.
- Compliance requires auditable business-rule enforcement for blocked sellers and restricted items.
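As an illustration of what "auditable" enforcement could mean for the compliance constraint, the sketch below filters candidates and records one audit entry per removal. All names here (`Candidate`, `AuditRecord`, `enforce_rules`, the rule labels) are hypothetical, and dropping every restricted item unconditionally is a simplifying assumption; a real policy may depend on region, user, or page.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    product_id: str
    seller_id: str
    restricted: bool


@dataclass(frozen=True)
class AuditRecord:
    request_id: str
    product_id: str
    rule: str  # which rule removed the product


def enforce_rules(
    request_id: str,
    candidates: Iterable[Candidate],
    blocked_sellers: Set[str],
) -> Tuple[List[Candidate], List[AuditRecord]]:
    """Drop blocked-seller and restricted items, logging each removal."""
    kept: List[Candidate] = []
    audit: List[AuditRecord] = []
    for c in candidates:
        if c.seller_id in blocked_sellers:
            audit.append(AuditRecord(request_id, c.product_id, "blocked_seller"))
        elif c.restricted:
            audit.append(AuditRecord(request_id, c.product_id, "restricted_item"))
        else:
            kept.append(c)
    return kept, audit


kept, audit = enforce_rules(
    "req-1",
    [
        Candidate("p1", "s1", False),
        Candidate("p2", "s2", False),
        Candidate("p3", "s1", True),
    ],
    blocked_sellers={"s2"},
)
# kept contains only p1; audit records p2 (blocked_seller) and p3 (restricted_item)
```

Returning the audit records alongside the surviving candidates keeps the filter a pure function, so the caller decides where the records are persisted.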