Context
ShopSquare, a large e-commerce marketplace, is launching a new sponsored listing ad product that inserts one paid placement near the top of category pages. The ads team believes this will increase monetization, but the product team is concerned about harming user engagement and downstream conversion.
Hypothesis Seed
The proposed ad unit is more visually prominent than the current lightweight sponsored badge. The team expects it to increase ad revenue per session, but only if it does not materially reduce shopping quality. You need to design the experiment, define the primary metric and guardrails, and decide whether the test is feasible within the available traffic.
Constraints
- Eligible traffic: 240,000 category-page sessions per day
- Randomization can be done at the user level or session level
- Maximum experiment duration: 14 days, including a 1-day ramp
- Current baseline revenue per session from this surface: $0.120
- Historical standard deviation of revenue per session: $1.10
- Smallest business-relevant lift: +3% relative in revenue per session
- False positives are costly because a bad ad experience can reduce repeat shopping; false negatives are acceptable up to a moderate level
- The team wants one primary metric, 2-4 guardrails, and a clear ship/no-ship rule
Deliverables
- State the null and alternative hypotheses, and define one primary metric plus appropriate guardrails for this ad product.
- Calculate the required sample size using the stated baseline, variance, alpha, power, and MDE; then translate it into expected runtime under the traffic constraint.
- Choose the unit of randomization, allocation, and duration, and explain any mismatch between unit of randomization and unit of analysis.
- Pre-register an analysis plan: statistical test, peeking policy, multiple-comparison treatment, and SRM checks.
- Explain the main risks in interpreting results for this ad experiment, including novelty effects and possible interference across marketplace participants.