Context
ShopNow, a mid-sized e-commerce app, wants to test a simplified mobile checkout flow that removes one confirmation step. The product manager believes this will increase completed purchases, but engineering wants a clear answer within a fixed launch window.
Hypothesis Seed
The proposed change reduces friction in checkout, so the team expects a modest lift in purchase conversion among users who start checkout. Because the change touches payment UX, the team is also concerned about accidental purchases, payment failures, and support contacts.
Constraints
- Eligible traffic: 120,000 mobile users per day who start checkout
- Randomization can only be done at the user_id level
- Maximum experiment duration: 14 days, including ramp
- Planned allocation after ramp: 50/50
- Baseline checkout completion rate: 24%
- Business wants to detect at least a 5% relative lift in checkout completion (the minimum detectable effect, MDE: 24% to 25.2% in absolute terms)
- False positives are costly because a bad checkout experience can harm trust and payment success; false negatives are acceptable if the effect is too small to matter operationally
- You may assume a two-sided test with α = 0.05 and power = 80%
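The constraints above are enough to size the test. The following is a minimal sketch, assuming the standard normal-approximation formula for comparing two independent proportions and assuming each eligible user is counted once; the function names sample_size_per_arm and days_at_full_allocation are illustrative and not part of the brief.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate users per arm for a two-sided two-proportion z-test."""
    p1 = baseline                        # control completion rate
    p2 = baseline * (1 + relative_lift)  # treatment rate at the MDE
    p_bar = (p1 + p2) / 2                # average rate under the null
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_at_full_allocation(n_per_arm, daily_eligible, share_per_arm=0.5):
    """Days of traffic needed per arm, ignoring ramp and repeat visitors."""
    return math.ceil(n_per_arm / (daily_eligible * share_per_arm))


n = sample_size_per_arm(baseline=0.24, relative_lift=0.05)
days = days_at_full_allocation(n, daily_eligible=120_000)
print(f"users per arm: {n}, days at 50/50: {days}")
```

Under these assumptions the requirement is on the order of 20,000 users per arm, which a 50/50 split of 120,000 daily users covers in well under one day. The 14-day limit is therefore not binding on statistical power; the duration decision is more likely driven by the ramp schedule, day-of-week effects, and how long guardrail signals such as support contacts take to appear.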
Deliverables
- State the null and alternative hypotheses, define the primary metric, and propose 2-4 guardrail metrics.
- Calculate the required sample size per arm using the stated baseline and MDE, and determine whether the test can be completed within 14 days.
- Choose the experiment design: unit of randomization, allocation/ramp, duration, and any stratification or blocking.
- Pre-register an analysis plan covering the statistical test, peeking policy, multiple comparisons treatment, and how you would check for sample ratio mismatch.
- Explain the ship / don't-ship rule, including what happens if the primary metric is significant but a guardrail worsens or the observed lift is below the planned MDE.
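For the analysis-plan deliverable, the two checks it names can be sketched in a few lines. This is one possible approach and not a prescribed answer: a chi-square goodness-of-fit test for sample ratio mismatch and a pooled two-proportion z-test for the primary metric. The function names, the example counts, and the 0.001 SRM alert threshold are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the split."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def two_proportion_z_test(conv_control, n_control, conv_treatment, n_treatment):
    """Pooled two-sided z-test; returns (z, p_value) for treatment minus control."""
    rate_c = conv_control / n_control
    rate_t = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (rate_t - rate_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, for illustration only.
SRM_ALERT = 0.001  # assumed threshold; a strict cutoff limits false alarms
srm_p = srm_p_value(50_000, 50_100)
z, p = two_proportion_z_test(12_000, 50_000, 12_600, 50_100)
print(f"SRM p = {srm_p:.3f}, flagged = {srm_p < SRM_ALERT}")
print(f"z = {z:.2f}, p = {p:.4f}")
```

Running the SRM check first is a deliberate ordering: if the observed split departs from the planned 50/50, the assignment or logging is suspect and the primary-metric result should not be interpreted until the cause is found.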