Context
The Facebook Feed team is testing a new ranking tweak that surfaces more creator content near the top of Feed. Early readouts look strong, but the observed treatment effect appears to decay over time.
Hypothesis Seed
Product believes the new ranking will increase meaningful engagement in Facebook Feed by improving content relevance. In a 50/50 experiment, the team sees a relative lift of roughly +5% on the primary engagement metric in week 1, but the cumulative relative lift through week 4 is only +1%. Leadership asks whether this is a real win, a novelty effect, or evidence that the treatment should not ship.
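To make the size of the decay concrete, the two headline numbers can be turned into an implied average lift for weeks 2 to 4. The sketch below is a back-of-the-envelope reading, not part of the experiment's analysis plan: it assumes the +1% figure is a cumulative average over four equally weighted weeks and that the control baseline is roughly constant, so relative lifts average approximately linearly. The helper name `implied_later_lift` is hypothetical.

```python
def implied_later_lift(week1_lift, cumulative_lift, n_weeks=4):
    """Average relative lift over weeks 2..n_weeks implied by a cumulative lift.

    Assumes equal weekly weights and a roughly constant control baseline,
    so that cumulative_lift ~= mean of the weekly lifts.
    """
    return (n_weeks * cumulative_lift - week1_lift) / (n_weeks - 1)


# Figures from the scenario: +5% in week 1, +1% cumulative through week 4.
later = implied_later_lift(week1_lift=0.05, cumulative_lift=0.01)
print(f"Implied average lift in weeks 2-4: {later:+.2%}")
```

Under these assumptions the implied average lift in weeks 2 to 4 is slightly negative (about -0.33%), which is why the week-by-week effect, and not only the cumulative number, matters for the decision.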
Constraints
- Eligible traffic: 12M Facebook Feed DAU globally
- Max experiment runtime: 28 days; a decision is needed at the end of week 4
- Allocation: 50/50 after a 1-day ramp
- Baseline primary metric: 0.240 daily probability that a user has at least one meaningful social interaction (MSI) from Feed
- False positives are costly, even for small effects, because ranking launches affect a large share of Feed impressions; false negatives are also costly because ranking improvements compound over time
- You must pre-register the analysis plan before launch and cannot change the primary metric after seeing week-1 results
Deliverables
- Define the hypothesis, primary metric, guardrails, and a realistic MDE for this Facebook Feed experiment.
- Calculate the required sample size per arm and confirm whether 28 days of traffic is sufficient.
- Choose the unit of randomization and explain how you would analyze the experiment given repeated observations per user over 4 weeks.
- Specify a pre-registered analysis plan covering the week-1 vs week-4 discrepancy, peeking policy, multiple comparisons, and the exact statistical test.
- State what you would conclude from the observed pattern (+5% in week 1, +1% cumulative through week 4), including when you would ship, not ship, or run a follow-up holdout.
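For the sample-size deliverable, one possible starting point is the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only. The two-sided alpha of 0.05, the power of 0.80, and the 1% relative MDE are assumptions, not values given in the constraints; the baseline of 0.240 and the 12M eligible DAU split 50/50 come from the scenario. It also treats each user as a single Bernoulli observation, which is a simplification. The function name `sample_size_per_arm` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_mde, alpha=0.05, power=0.80):
    """Users per arm for a two-sided two-proportion z-test (normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)             # treatment rate at the MDE
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)   # unpooled Bernoulli variances
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


# Baseline from the constraints; the 1% relative MDE is an assumed value.
n_per_arm = sample_size_per_arm(baseline=0.240, relative_mde=0.01)
users_per_arm_available = 12_000_000 // 2          # 50/50 split of eligible DAU
print(n_per_arm, n_per_arm <= users_per_arm_available)
```

Under these assumptions the requirement comes out at roughly half a million users per arm, well below the traffic available. Repeated daily observations from the same user are correlated, so user-day counts should not be plugged into this formula directly; aggregating to one value per user or using user-clustered standard errors keeps the calculation honest.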