Context
ConnectHub, a professional networking app, is testing a redesigned home feed ranking that surfaces more posts from second-degree connections. Leadership wants a decision within three weeks because the ranking team must either ship, iterate, or revert before the next quarterly planning cycle.
Hypothesis Seed
The team believes the new ranking will increase meaningful engagement by showing users more relevant discovery content. However, they are worried that short-term engagement gains could be misleading if they come from novelty, cannibalize messaging, or create spillovers through social interactions.
Constraints
- Total traffic: 1.2 million daily active users (DAU)
- Only 60% of DAU are feed-eligible on a given day, so usable traffic is about 720,000 users/day
- Maximum experiment duration: 21 days
- Proposed allocation: 50/50 after a 1-day 5% ramp for instrumentation checks
- Baseline feed engagement rate: 28% of eligible users have at least one meaningful feed interaction per day
- Business wants to detect at least a 2.0% relative lift on the primary metric (against the 28% baseline, that is a move from 28.0% to about 28.56%, or 0.56 percentage points)
- A false positive is costly because a bad ranking launch can degrade creator ecosystem health; a false negative is also costly but less severe
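The arithmetic implied by these constraints can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes a two-sided two-proportion z-test with one observation per user, and the alpha and power values passed in at the bottom are placeholders, since the brief does not fix them. The function name `sample_size_per_arm` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

DAU = 1_200_000          # total daily active users (from the brief)
ELIGIBLE_SHARE = 0.60    # share of DAU that is feed-eligible (from the brief)
BASELINE = 0.28          # baseline daily feed engagement rate (from the brief)
REL_MDE = 0.02           # minimum detectable relative lift (from the brief)

usable_per_day = round(DAU * ELIGIBLE_SHARE)   # feed-eligible users per day
per_arm_per_day = usable_per_day // 2          # 50/50 allocation
target_rate = BASELINE * (1 + REL_MDE)         # rate under the alternative


def sample_size_per_arm(p_base, rel_lift, alpha, power):
    """Users per arm for a two-sided two-proportion z-test.

    Standard normal-approximation formula with a pooled variance under the
    null and unpooled variance under the alternative.
    """
    p_treat = p_base * (1 + rel_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_treat) / 2
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p_base * (1 - p_base) + p_treat * (1 - p_treat))
    return ceil((null_term + alt_term) ** 2 / (p_treat - p_base) ** 2)


# ASSUMPTION: alpha=0.05 and power=0.80 are illustrative defaults only.
n_default = sample_size_per_arm(BASELINE, REL_MDE, alpha=0.05, power=0.80)
# ASSUMPTION: a stricter pair, reflecting the stated cost of false positives.
n_strict = sample_size_per_arm(BASELINE, REL_MDE, alpha=0.01, power=0.90)

print(f"usable users/day: {usable_per_day:,}; per arm: {per_arm_per_day:,}")
print(f"target rate: {target_rate:.4f}")
print(f"n per arm (alpha=0.05, power=0.80): {n_default:,}")
print(f"n per arm (alpha=0.01, power=0.90): {n_strict:,}")
```

Because the 720,000 daily eligible users are largely the same people returning each day, dividing the required sample by daily per-arm traffic gives only a rough lower bound on duration; the count of unique users enrolled grows more slowly than user-days.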
Deliverables
- Define the null and alternative hypotheses, the primary metric, 2-4 guardrail metrics, and at least one secondary metric. Be explicit about the unit of randomization and unit of analysis.
- Calculate the required sample size per arm using the stated baseline and MDE together with an alpha and power that you choose and justify in light of the error costs above. State whether the experiment can finish within 21 days.
- Propose a pre-registered experiment design and analysis plan, including the statistical test, peeking policy, and how you will handle multiple comparisons.
- Explain the main pitfalls you would watch for when interpreting results, including at least peeking, novelty effects, network interference/SUTVA concerns, and sample ratio mismatch.
- Provide a clear ship / do-not-ship / iterate rule that respects guardrails even if the primary metric is statistically significant.
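As one possible shape for the last two deliverables, the sketch below pairs a sample ratio mismatch check with a guardrail-first decision rule. Everything here is an assumption made for illustration: the function names `srm_p_value` and `decide`, the 0.001 SRM threshold, and the ordering of the checks are not specified by the brief. The SRM test is a chi-square goodness-of-fit test with one degree of freedom, whose p-value can be computed with the standard library as `erfc(sqrt(chi2 / 2))`.

```python
from math import erfc, sqrt


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """P-value of a 1-dof chi-square test that the observed split matches
    the planned allocation."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


def decide(srm_p, guardrails_ok, primary_significant, primary_lift_positive,
           srm_threshold=0.001):
    """Hypothetical decision rule: validity first, guardrails second,
    primary metric last."""
    if srm_p < srm_threshold:
        # Assignment looks broken, so no metric comparison is trustworthy.
        return "iterate"
    if not guardrails_ok:
        # A guardrail breach blocks launch even if the primary metric wins.
        return "do-not-ship"
    if primary_significant and primary_lift_positive:
        return "ship"
    return "iterate"


# Illustrative counts only; these are not experiment results.
balanced = srm_p_value(360_000, 360_000)
skewed = srm_p_value(360_000, 357_000)
print(decide(balanced, guardrails_ok=True,
             primary_significant=True, primary_lift_positive=True))
print(decide(balanced, guardrails_ok=False,
             primary_significant=True, primary_lift_positive=True))
print(decide(skewed, guardrails_ok=True,
             primary_significant=True, primary_lift_positive=True))
```

Checking validity and guardrails before the primary metric keeps a statistically significant win from overriding either one, which is the property the final deliverable asks for.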