Context
Discord is considering a new Server Discovery ranking and highlighting treatment that boosts recently active communities and surfaces richer social proof. Because the feature changes which servers many users join and interact with, it may alter shared community behavior and create interference across users.
Hypothesis Seed
The team believes the new Discovery treatment will increase the rate at which eligible users join a server from Discovery, and improve downstream 7-day engagement in joined servers. However, if many users are simultaneously exposed, server activity itself may change, contaminating both arms of a standard user-level A/B test. You need to decide whether a switchback test is the right design, and if so, how to run it.
Constraints
- Eligible traffic: 1.2M Discovery sessions/day globally
- About 180K unique servers/day receive meaningful Discovery impressions
- Maximum experiment window: 21 days including ramp
- Product wants a decision within this window for a quarterly launch review
- False positives are costly because the ranking could shift traffic toward low-quality or poorly moderated servers; false negatives are more acceptable, since a missed launch is preferable to ecosystem harm
- Weekly seasonality is strong, and server activity varies by hour
Tasks
- Decide whether to use a switchback test or a standard randomized A/B test. Explicitly discuss network effects, SUTVA, and the unit of interference.
- Define the primary metric, 2–4 guardrails, and at least one secondary metric. State the baseline and an explicit minimum detectable effect (MDE).
- Compute the required sample size / number of switchback periods with concrete numbers, then translate that into a feasible duration under the traffic and window constraints.
- Specify the experiment design: unit of randomization, switchback cadence, allocation, stratification, and how you will monitor for sample ratio mismatch (SRM) and assignment bugs.
- Pre-register the analysis plan and a clear ship / don’t-ship / iterate rule that respects guardrails, peeking policy, and multiple comparisons.
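For the sample-size task, a minimal sketch of the arithmetic. All numbers here — the 4% baseline join rate, the +0.2pp absolute MDE, alpha = 0.01, 80% power, and the between-period SD — are illustrative assumptions, not figures given in the brief:

```python
import math
from statistics import NormalDist

def sessions_per_arm(p1: float, p2: float, alpha: float = 0.01, power: float = 0.80) -> int:
    """Two-proportion z-test sample size per arm (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p2 - p1) ** 2)

def switchback_periods_per_arm(delta: float, period_sd: float,
                               alpha: float = 0.01, power: float = 0.80) -> int:
    """Periods per condition when the switchback period is the analysis unit."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * z ** 2 * period_sd ** 2 / delta ** 2)

p1, p2 = 0.040, 0.042        # assumed baseline join rate and MDE (+0.2pp absolute)
n_sessions = sessions_per_arm(p1, p2)            # ~230K per arm
period_sd = 0.004            # assumed between-period SD of the period-level join rate
n_periods = switchback_periods_per_arm(p2 - p1, period_sd)
days = 2 * n_periods / 12    # both conditions, 2-hour periods (12 per day)
print(n_sessions, n_periods, round(days, 1))
```

Under these assumptions the user-level sample is trivially cheap (~230K sessions per arm against 1.2M/day), so the binding constraint is the number of switchback periods, not traffic: roughly 94 periods per condition, which a 2-hour cadence fits into about 16 of the 21 available days.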
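For the SRM-monitoring task, a sketch of a chi-square goodness-of-fit check on assignment counts (the counts and the 0.001 alert threshold are illustrative; stdlib only, using the identity that a chi-square with 1 df is a squared standard normal):

```python
import math

def srm_pvalue(treatment_n: int, control_n: int, expected_ratio: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 df) for a two-way assignment split."""
    total = treatment_n + control_n
    exp_t = total * expected_ratio
    exp_c = total - exp_t
    chi2 = (treatment_n - exp_t) ** 2 / exp_t + (control_n - exp_c) ** 2 / exp_c
    # For 1 df, P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2))
    return math.erfc(math.sqrt(chi2 / 2))

# Illustrative daily counts under a 50/50 allocation
p = srm_pvalue(601_200, 598_800)
alert = p < 0.001   # conventional SRM alarm threshold
print(round(p, 4), alert)
```

A 0.2% imbalance on 1.2M sessions yields p around 0.03: suspicious, but above a 0.001 alarm threshold. Running this check at every period boundary lets an assignment bug surface mid-experiment rather than at the readout.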