Context
The Facebook Reels team wants to test a new interactive sticker shown only in a narrow eligibility slice: about 5% of daily Reels viewers are eligible to ever see it. Leadership wants a launch decision this month.
Hypothesis Seed
The team believes the sticker will increase Reels engagement among eligible viewers by making creation and response behavior more interactive. However, because only 5% of users are eligible, you are concerned about whether the experiment will have enough power, whether spillovers could contaminate control, and whether a null result would be interpretable.
Constraints
- Eligible traffic: 250,000 Facebook users per day globally
- Only eligible users can be randomized; the other 95% never see the feature
- Maximum experiment duration: 14 days
- Desired significance level: 0.05, power: 80%
- Baseline primary metric for eligible users: 18.0% of eligible viewers share at least one Reel within 7 days
- Smallest business-relevant lift: 3% relative (from 18.0% to 18.54%, i.e. 0.54 percentage points absolute)
- False positives are costly because the feature adds engineering and moderation overhead; false negatives are acceptable only if the test is clearly underpowered
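As a rough feasibility check on the constraints above, the sketch below applies the standard normal-approximation sample-size formula for comparing two proportions. It is one possible way to frame the calculation, not a prescribed answer. Several inputs are assumptions not stated in the constraints: a two-sided test, a 50/50 split between control and treatment, the 250,000 eligible users per day being distinct newly enrolled users (repeat visitors would slow enrollment), and each enrolled user needing a full 7-day observation window for the metric. The function name `n_per_arm` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_base, rel_lift, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect a relative lift in a proportion.

    Assumes a two-sided z-test on two independent proportions with equal arm sizes.
    """
    p_treat = p_base * (1 + rel_lift)           # 0.18 -> 0.1854 for a 3% relative lift
    delta = p_treat - p_base                    # absolute difference to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_sum = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * var_sum / delta ** 2)


n = n_per_arm(p_base=0.18, rel_lift=0.03)
total = 2 * n

# Assumption: 250,000 distinct eligible users can be enrolled per day.
enroll_days = ceil(total / 250_000)
# Assumption: every enrolled user is observed for the full 7-day metric window.
runtime_days = enroll_days + 7

print(f"per arm: {n:,}  total: {total:,}")
print(f"enrollment days: {enroll_days}  runtime incl. 7-day window: {runtime_days}")
```

Under these assumptions the requirement comes out on the order of 80,000 users per arm, which fits comfortably inside the 14-day limit. The conclusion is sensitive to the unique-user assumption: if the daily 250,000 are mostly the same people returning, cumulative enrollment grows far more slowly and the runtime estimate should be redone on deduplicated counts.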
Deliverables
- Define the hypothesis, primary metric, secondary metrics, and guardrails for this Facebook Reels experiment.
- Calculate whether the experiment is adequately powered given that only 5% of users are eligible, and translate the required sample size into runtime.
- Choose the unit of randomization and explain how you would handle interference risk if treated users share Reels with control users.
- Pre-register the analysis plan: statistical test, peeking policy, multiple-comparison policy, and SRM checks.
- State a clear ship / don't-ship / iterate rule that respects both the primary metric and guardrails.
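For the SRM check named in the deliverables, one common approach is a chi-square goodness-of-fit test on the observed arm counts against the intended split. The sketch below is a minimal version using only the standard library. The 50/50 expected split, the 0.001 alert threshold, and the example counts are all assumptions chosen for illustration, not values from this experiment.

```python
from math import erfc, sqrt


def srm_p_value(n_control, n_treatment, expected_treat_share=0.5):
    """P-value of a 1-degree-of-freedom chi-square test for sample ratio mismatch."""
    total = n_control + n_treatment
    exp_treat = total * expected_treat_share
    exp_control = total - exp_treat
    stat = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treat) ** 2 / exp_treat)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return erfc(sqrt(stat / 2))


SRM_THRESHOLD = 0.001  # assumed alert level; stricter than 0.05 to limit false alarms

# Hypothetical arm counts, for illustration only.
for control, treatment in [(80_500, 80_300), (80_000, 82_000)]:
    p = srm_p_value(control, treatment)
    status = "SRM suspected, do not read metrics" if p < SRM_THRESHOLD else "ok"
    print(control, treatment, status)
```

A failed SRM check usually points to a bug in assignment or logging, so the pre-registered plan would typically treat it as a blocker on interpreting the primary metric rather than as one more statistic to report.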