Product Context
Design the ML system behind ranking listings in Facebook Marketplace search and browse surfaces. Buyers expect relevant, fresh, and trustworthy listings, while sellers expect new inventory to become discoverable quickly.
Scale
| Signal | Value |
|---|---|
| DAU touching Marketplace | 120M |
| Peak query/feed QPS | 900K |
| Active listings | 350M |
| New or updated listings/day | 18M |
| Candidate pool before ranking | ~20K listings/request |
| Final results returned | Top 50 |
| End-to-end p99 latency budget | 180ms |
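The scale numbers above imply a hard cost constraint on the ranking funnel. A quick back-of-envelope sketch (the 500-candidate heavy-ranker cut is an illustrative assumption, not a figure from the table):

```python
# Back-of-envelope funnel math from the Scale table.
PEAK_QPS = 900_000
CANDIDATES = 20_000   # pool entering ranking, per request
HEAVY_CUT = 500       # assumed survivors of a lightweight first-stage ranker
FINAL_K = 50

# A cheap first stage (dot products, GBDT-lite) must touch every candidate;
# the expensive model only sees the survivors.
light_scores_per_sec = PEAK_QPS * CANDIDATES
heavy_scores_per_sec = PEAK_QPS * HEAVY_CUT

print(f"lightweight scores/sec: {light_scores_per_sec:,}")  # 18,000,000,000
print(f"heavy ranker scores/sec: {heavy_scores_per_sec:,}")  # 450,000,000
```

Even at a 40x reduction before the heavy model, peak load is hundreds of millions of heavy scorings per second, which is why the constraints below rule out deep models on the full candidate set.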
Task
Design an end-to-end ML system that uses a scalable serving architecture with load balancing, caching, database sharding, and consistency-aware data design, while still following a standard ML retrieval → ranking → re-ranking flow.
Address the following:
- Define the functional and non-functional requirements, including freshness, latency, and relevance goals.
- Propose the full architecture: offline training, online serving, feature storage, candidate retrieval, ranking, re-ranking, and feedback logging.
- Explain how you would use caching, load balancing, and sharded storage for user features, listing features, embeddings, and interaction logs at Meta scale.
- Discuss consistency choices across systems: for example, when eventual consistency is acceptable versus when stronger consistency is needed for listing availability, price changes, or policy removals.
- Define offline and online evaluation, including how you would detect feature drift, training-serving skew, and regressions in freshness or trust signals.
- Identify major failure modes and mitigation plans, especially around stale cache entries, hot shards, feature store outages, and delayed model updates.
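One way to make the expected retrieval → ranking → re-ranking flow concrete is a single funnel function. This is a minimal sketch, not a prescribed implementation: the 500-candidate cut, the dict-shaped listing records, and the pluggable scorer signatures are all illustrative assumptions.

```python
def rank_pipeline(query, retrievers, light_score, heavy_score, rerank,
                  k=50, light_cut=500):
    """Retrieval -> lightweight ranking -> heavy ranking -> re-ranking."""
    # 1. Union candidate sets from several retrieval sources
    #    (ANN embedding search, term match, category browse), deduped by id.
    seen = {}
    for retrieve in retrievers:
        for listing in retrieve(query):
            seen[listing["id"]] = listing
    candidates = list(seen.values())

    # 2. A cheap model trims the ~20K-listing pool to a few hundred.
    candidates.sort(key=lambda l: light_score(query, l), reverse=True)
    pool = candidates[:light_cut]

    # 3. The expensive model scores only the survivors.
    for listing in pool:
        listing["score"] = heavy_score(query, listing)
    pool.sort(key=lambda l: l["score"], reverse=True)

    # 4. Re-rank for diversity / trust / freshness, then truncate to top k.
    return rerank(pool)[:k]


# Toy usage with stand-in retrievers and scorers.
retrievers = [
    lambda q: [{"id": "a"}, {"id": "b"}],
    lambda q: [{"id": "b"}, {"id": "c"}],  # "b" is deduped across sources
]
light = lambda q, l: ord(l["id"])    # pretend cheap relevance score
heavy = lambda q, l: -ord(l["id"])   # pretend expensive model disagrees
top = rank_pipeline("bike", retrievers, light, heavy, rerank=lambda p: p, k=2)
print([l["id"] for l in top])  # ['a', 'b']
```

Feedback logging would hook in after step 4, emitting the served list plus feature snapshots to the interaction log stream so training data matches what was actually shown.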
Constraints
- Fresh listings should become retrievable within 5 minutes of creation.
- Removed, sold, or policy-violating listings must stop being served within 1 minute.
- Serving should assume Meta-native infrastructure such as TAO, Memcache, Kafka/PubSub-style log streams, and regionally distributed inference services.
- Cost matters: the design should avoid expensive per-request deep models on the full candidate set.
- The system must support graceful degradation during partial outages without returning unsafe or obviously stale results.
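The 1-minute takedown constraint is stricter than typical cache TTLs, so cached result pages need a serve-time guard. A minimal sketch of that idea, assuming removal events arrive on a pub/sub stream and that tombstones only need to outlive the longest result-cache TTL (the class name and TTL value are hypothetical):

```python
import time

class RemovalFilter:
    """Serve-time guard for the 1-minute takedown constraint.

    Removal events (sold, deleted, policy takedown) populate a small
    tombstone set; every response, cached or not, is filtered against it.
    """

    def __init__(self, ttl_s=3600):
        self.removed = {}  # listing_id -> removal timestamp
        self.ttl_s = ttl_s  # assumed max result-cache TTL

    def mark_removed(self, listing_id, now=None):
        # In practice this would be driven by a Kafka/PubSub consumer.
        self.removed[listing_id] = time.time() if now is None else now

    def filter_results(self, listings, now=None):
        now = time.time() if now is None else now
        # Prune tombstones older than the cache TTL: no cached page can
        # still contain those listings, so the set stays small.
        self.removed = {i: t for i, t in self.removed.items()
                        if now - t < self.ttl_s}
        return [l for l in listings if l["id"] not in self.removed]


guard = RemovalFilter(ttl_s=100)
guard.mark_removed("sold_bike", now=0)
page = [{"id": "sold_bike"}, {"id": "fresh_sofa"}]
print([l["id"] for l in guard.filter_results(page, now=10)])  # ['fresh_sofa']
```

This keeps removals at strong, immediate consistency on the serving path while listing features, embeddings, and counters stay eventually consistent.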