Product Context
Instagram Stories is a high-frequency surface where users quickly consume ephemeral content from accounts they follow and accounts Instagram may recommend. Design an end-to-end ML system that ranks Stories for each viewer when they open the Stories tray or advance to the next Story.
Scale
| Signal | Value |
|---|---|
| DAU | 600M Instagram users viewing Stories daily |
| Peak QPS | 2.0M Story ranking requests/sec globally |
| Active Story inventory | ~180M live Stories in a 24-hour window |
| New Stories/day | ~1.2B |
| Candidate set before ranking | 5K-20K per request |
| End-to-end p99 latency budget | 120ms |
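As a starting point for sizing, the figures in the table can be turned into a few derived rates. The sketch below is back-of-envelope arithmetic only: every input comes from the table, the function and key names are illustrative, and the candidate-throughput range assumes every peak request carries a full pre-ranking candidate set, which makes it an upper bound rather than a measurement.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Inputs copied from the Scale table.
DAU = 600_000_000
PEAK_QPS = 2_000_000
NEW_STORIES_PER_DAY = 1_200_000_000
CANDIDATES_MIN, CANDIDATES_MAX = 5_000, 20_000


def derive_sizing() -> dict:
    """Derive average and peak rates from the table (illustrative names)."""
    return {
        # Average ingest rate; real traffic will peak above this.
        "avg_new_stories_per_sec": NEW_STORIES_PER_DAY / SECONDS_PER_DAY,
        # Average Stories posted per daily active viewer.
        "new_stories_per_dau": NEW_STORIES_PER_DAY / DAU,
        # Candidates entering the funnel per second at peak, assuming
        # every request carries the full pre-ranking candidate set.
        "peak_candidates_per_sec_low": PEAK_QPS * CANDIDATES_MIN,
        "peak_candidates_per_sec_high": PEAK_QPS * CANDIDATES_MAX,
    }


if __name__ == "__main__":
    for name, value in derive_sizing().items():
        print(f"{name}: {value:,.0f}")
```

The peak candidate range (on the order of 10^10 per second) is what rules out running a heavy model on every candidate and motivates a staged funnel.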
Task
Design the recommendation system and explain the major tradeoffs. Address the following:
- Define the product objective, request types, and key functional/non-functional requirements.
- Size the system and propose a multi-stage architecture for retrieval, ranking, and re-ranking.
- Choose models and features for each stage, including how you handle follow-graph signals, freshness, and cold start.
- Describe the online serving path versus offline training/data pipelines, including Meta-specific infrastructure assumptions where relevant.
- Define offline and online evaluation, experiment design, and rollout strategy.
- Identify major failure modes such as feature drift, training-serving skew, stale content, and monitoring gaps.
Constraints
- Stories expire after 24 hours, so freshness matters more than coverage of long-tail historical content.
- The system must respect privacy, user blocks/mutes, integrity filters, and age/region policies before final ranking.
- Most requests happen in bursts during app opens, so tail latency and cache effectiveness are critical.
- Cost matters: the heaviest model cannot run over the full live Story inventory on every request.
- New users and new creators may have little interaction history, but the system must still produce a high-quality tray.
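One way to read the constraints together is as a funnel: drop ineligible Stories first, score the survivors cheaply, and pass only a short list to the expensive model. The sketch below illustrates that ordering under stated assumptions. The `Story` fields, the `affinity` score, the linear freshness decay, and the function names are all hypothetical and chosen for illustration. Age and region policies are omitted for brevity but would sit in the same eligibility step.

```python
import heapq
from dataclasses import dataclass

STORY_TTL_SECONDS = 24 * 60 * 60  # Stories expire after 24 hours.


@dataclass(frozen=True)
class Story:
    story_id: str
    author_id: str
    created_at: float       # epoch seconds
    passed_integrity: bool  # hypothetical result of upstream integrity filters
    affinity: float         # hypothetical cheap viewer-author score in [0, 1]


def is_eligible(story: Story, now: float, blocked_or_muted: set) -> bool:
    """Hard filters applied before any ranking score is computed."""
    age = now - story.created_at
    return (
        0 <= age < STORY_TTL_SECONDS
        and story.passed_integrity
        and story.author_id not in blocked_or_muted
    )


def light_score(story: Story, now: float) -> float:
    """Cheap score: affinity scaled by remaining lifetime (assumed linear)."""
    remaining = 1.0 - (now - story.created_at) / STORY_TTL_SECONDS
    return story.affinity * remaining


def shortlist(stories, now: float, blocked_or_muted: set, k: int) -> list:
    """Filter first, then keep only the top-k for the heavy ranker."""
    eligible = [s for s in stories if is_eligible(s, now, blocked_or_muted)]
    return heapq.nlargest(k, eligible, key=lambda s: light_score(s, now))
```

Filtering before scoring means blocked, expired, or policy-violating Stories never consume ranking budget, and the top-k cut bounds how many candidates the heaviest model sees per request.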