Choose Randomization for Reels Save Test

Context

The Instagram Reels team at Meta wants to test a new ranking feature that surfaces more “save-worthy” Reels by upweighting predicted long-term value. The launch decision depends on whether the new ranker improves downstream engagement without harming the broader AARRR funnel (acquisition, activation, retention, referral, revenue).

Hypothesis Seed

The team believes the new ranker will increase IG Save rate on Reels because users will see more collectible or revisit-worthy content. However, the change may also alter creator distribution, session depth, and social interactions, so the choice of randomization unit is critical.

Constraints

Eligible traffic: 18M daily Reels viewers globally
Average eligible exposure: 2.4 Reels sessions per viewer per day
Current baseline IG Save rate per viewer-day: 8.0%
Maximum experiment duration: 14 days, including a 1-day ramp
Decision deadline: must recommend ship / don’t ship / iterate by day 15
False positives are costly because a bad ranker can degrade Reels retention and creator ecosystem health; false negatives are also costly because Reels is a strategic surface
Meta experimentation platform supports user-level assignment, creator-level assignment, and geo holdouts; CUPED is available using 14-day pre-experiment viewer behavior
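
The constraints above supply most inputs of a power calculation: the 8.0% baseline, the 18M daily viewers, and the availability of CUPED. The following is a minimal sketch, not a prescribed method, of a two-proportion sample size under viewer-level randomization. The 1% relative MDE, alpha of 0.05, power of 0.80, and the pre/post correlation `rho` are illustrative assumptions that do not come from the prompt, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm(p_base, rel_mde, alpha=0.05, power=0.80):
    """Viewers needed per arm for a two-sided two-proportion z-test.

    rel_mde is a relative lift, e.g. 0.01 for +1% on the baseline rate.
    alpha, power, and rel_mde are assumptions, not values from the prompt.
    """
    p_treat = p_base * (1 + rel_mde)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    var_sum = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return math.ceil(z ** 2 * var_sum / (p_treat - p_base) ** 2)


def cuped_n(n, rho):
    """Sample size after CUPED, which scales variance by (1 - rho^2).

    rho is the assumed correlation between the 14-day pre-experiment
    covariate and the in-experiment metric; it must be estimated from data.
    """
    return math.ceil(n * (1 - rho ** 2))


def arm_traffic_share(n, daily_viewers):
    """Share of one day's eligible viewers needed to fill a single arm."""
    return n / daily_viewers


if __name__ == "__main__":
    raw = n_per_arm(p_base=0.08, rel_mde=0.01)
    adjusted = cuped_n(raw, rho=0.5)  # rho = 0.5 is purely illustrative
    print("per-arm n without CUPED:", raw)
    print("per-arm n with CUPED:   ", adjusted)
    print("share of daily viewers per arm:", arm_traffic_share(adjusted, 18_000_000))
```

Under these illustrative inputs the requirement is about 1.8 million viewers per arm before CUPED, a small fraction of daily eligible traffic. That suggests the 14-day window would be spent covering weekly cycles and novelty effects rather than accumulating power, though a smaller MDE or a coarser randomization unit (creator or geo) would change this conclusion.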

Deliverables

Choose and justify the unit of randomization for this Reels experiment. Explicitly compare at least two alternatives (for example: viewer_id, creator_id, or geo) and discuss SUTVA / network interference trade-offs.
Define the primary metric, 2-4 guardrails, and at least one secondary metric using Meta vocabulary (for example IG Save, Reels session depth, retention, creator distribution).
Calculate the required sample size for a pre-registered MDE and convert it into expected runtime given the traffic constraints. State how CUPED changes variance and effective runtime.
Write the analysis plan: test choice, SRM checks, peeking policy, novelty effect handling, and a ship / don’t-ship rule that respects guardrails.
Call out the top pitfalls specific to this design, especially when the unit of analysis differs from the unit of randomization.
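
For the SRM check named in the analysis-plan deliverable, one common approach is a chi-square goodness-of-fit test on the assigned unit counts. The sketch below assumes a two-arm test with a known target split. The 0.001 alert threshold is a widely used convention rather than something stated in the prompt, and the function names are hypothetical.

```python
import math


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """P-value of a chi-square goodness-of-fit test with 1 degree of freedom.

    Compares observed arm counts with the counts implied by the target split.
    For 1 degree of freedom the chi-square survival function equals
    erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    """
    total = n_control + n_treatment
    exp_treatment = total * expected_treatment_share
    exp_control = total - exp_treatment
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(n_control, n_treatment, expected_treatment_share=0.5, threshold=0.001):
    """Flag a sample ratio mismatch when the p-value falls below the threshold.

    threshold=0.001 is an assumed convention, not a value from the prompt.
    """
    return srm_p_value(n_control, n_treatment, expected_treatment_share) < threshold


if __name__ == "__main__":
    # Counts are made up for illustration only.
    print(has_srm(9_000_000, 9_002_000))  # small imbalance
    print(has_srm(9_000_000, 9_030_000))  # larger imbalance
```

The counts passed in should be units of randomization (for example distinct viewer_ids), not sessions or viewer-days, because assignment happens at that level.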
