Design Graph API Rate Limiter

Problem

Design a rate limiter for the Meta Graph API that enforces limits at millions of requests per second across multiple regions. Because the system is part of Meta's security infrastructure, the goal is not only fairness and abuse prevention but also resilience during attacks, partial outages, and sudden traffic spikes.

Requirements

Enforce limits for multiple dimensions, such as app ID, user access token, IP / subnet, and API endpoint.
Support different policies: global quotas, per-second burst limits, and rolling-window limits.
Make allow/deny decisions with very low latency on the request path.
Stay within acceptable error bounds during regional failover, cache loss, clock skew, and backend degradation.
Prevent common bypasses such as key rotation, distributed abuse across many IPs, and hot-key concentration.
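
To make the multi-dimension requirement concrete, here is a minimal sketch of how one request might map to several limit keys, with the request admitted only if every dimension is under its limit. The `Request` fields, the key format, the /24 IPv4 subnet grouping, and the in-memory `counters` dict are all assumptions for illustration, not part of the problem statement.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical request shape; field names are illustrative only.
    app_id: str
    token: str
    ip: str
    endpoint: str


def subnet_of(ip: str, prefix: int = 24) -> str:
    """Collapse an IPv4 address to its enclosing subnet (assumed /24)."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def limit_keys(req: Request) -> list[str]:
    """One counter key per enforced dimension."""
    return [
        f"app:{req.app_id}",
        f"token:{req.token}",
        f"subnet:{subnet_of(req.ip)}",
        f"endpoint:{req.endpoint}",
    ]


def allow(req: Request, counters: dict[str, int], limits: dict[str, int]) -> bool:
    """Admit only if every dimension is under its limit.

    Counters are incremented only after all checks pass, so a denied
    request does not consume quota in any dimension.
    """
    keys = limit_keys(req)
    for key in keys:
        dimension = key.split(":", 1)[0]
        if counters.get(key, 0) >= limits[dimension]:
            return False
    for key in keys:
        counters[key] = counters.get(key, 0) + 1
    return True
```

Grouping by subnet rather than by single IP is one way to blunt abuse spread across many nearby addresses. Rotating tokens does not help the caller here either, because the app-level key still accumulates.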

What to Cover

Explain your design for:

request-path architecture and where enforcement happens
choice of algorithm (for example token bucket, leaky bucket, sliding window log/counter)
state storage strategy for counters at very high QPS
sharding, replication, and multi-region behavior
consistency model and acceptable error bounds
handling of retries, idempotency, and race conditions
observability, alerting, and operational controls
security concerns, including abuse detection hooks and fail-open vs fail-closed decisions
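
As a reference point for the algorithm choice, the following is a minimal single-process token bucket sketch. It assumes the caller passes in the current time in seconds, and it clamps backward clock jumps to zero elapsed time. The class and parameter names are illustrative. A production design would also need atomic updates and shared state, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float      # tokens refilled per second (sustained rate)
    capacity: float  # maximum tokens held (burst size)
    tokens: float    # tokens currently available
    last: float      # timestamp of the last refill, in seconds

    @classmethod
    def full(cls, rate: float, capacity: float, now: float) -> "TokenBucket":
        """Start with a full bucket so an idle client can burst immediately."""
        return cls(rate, capacity, capacity, now)

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Clamp elapsed time so a clock that moves backwards never
        # refills tokens or yields a negative refill.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `cost` parameter is one way to charge write-heavy endpoints more than reads without maintaining a separate bucket per endpoint class.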

Example

A single app suddenly sends 8M requests/sec to a write-heavy Graph API endpoint from many edge locations. Describe how your system detects the surge, applies the correct per-app and per-endpoint limits, avoids overloading shared infrastructure, and still protects legitimate traffic.
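
One common way to keep a single hot app from concentrating all of its writes on one counter is to spread increments across several sub-keys and sum them on read. The sketch below assumes a plain dict standing in for a distributed key-value store. The shard count and the `key#shard` naming are hypothetical choices, not something the problem prescribes.

```python
import random
from typing import Optional


class ShardedCounter:
    """Counter whose writes are spread over several sub-keys."""

    def __init__(self, key: str, shards: int, rng: Optional[random.Random] = None):
        self.key = key
        self.shards = shards
        self.store: dict[str, int] = {}  # stand-in for a shared KV store
        self.rng = rng or random.Random()

    def incr(self, n: int = 1) -> None:
        # Pick a random shard so no single sub-key takes every write.
        shard_key = f"{self.key}#{self.rng.randrange(self.shards)}"
        self.store[shard_key] = self.store.get(shard_key, 0) + n

    def total(self) -> int:
        # Reads cost one lookup per shard, which is the price of cheaper writes.
        return sum(self.store.get(f"{self.key}#{i}", 0) for i in range(self.shards))
```

The trade-off is that reads become more expensive and slightly stale if shards are read at different moments, so this suits limits where a small overshoot is tolerable.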

Be explicit about trade-offs. A strong answer should separate the fast path from the control plane and justify where approximate counting is acceptable versus where strict enforcement is required.
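
As one illustration of approximate counting, a sliding-window counter estimates a rolling window from two fixed-window counts instead of storing every request timestamp. The sketch below assumes non-decreasing timestamps in seconds starting at or after zero, and the class name is illustrative. The estimate weights the previous window by how much of it still overlaps the rolling window, so it is inexact when traffic within a window is uneven.

```python
class SlidingWindowCounter:
    """Approximate rolling-window limiter using two fixed-window counts."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.idx = 0    # index of the current fixed window
        self.curr = 0   # requests admitted in the current window
        self.prev = 0   # requests admitted in the previous window

    def allow(self, now: float) -> bool:
        idx = int(now // self.window)
        if idx != self.idx:
            # Carry the count over only if we moved exactly one window forward;
            # after a longer gap the previous window is empty.
            self.prev = self.curr if idx == self.idx + 1 else 0
            self.curr = 0
            self.idx = idx
        # Fraction of the current fixed window that has already elapsed.
        frac = (now % self.window) / self.window
        estimate = self.prev * (1.0 - frac) + self.curr
        if estimate + 1 > self.limit:
            return False
        self.curr += 1
        return True
```

This keeps two integers per key rather than a log of timestamps. That trade may be acceptable for fairness limits, while limits that must be strict could still use an exact log or a strongly consistent counter.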
