You’re a data scientist at CartJet, a high-volume e-commerce marketplace (~8M weekly active users) that is redesigning its checkout page to reduce friction and increase completed purchases. The product team ran a 10-day A/B test and is excited because the treatment shows a higher conversion rate.
In the launch meeting, a PM says: “The p-value is 0.03, so there’s a 97% chance the new checkout is better.” Another stakeholder says: “A p-value of 0.03 means only 3% of the observed lift is due to randomness.” You need to correct the interpretation and make a recommendation that’s statistically sound and business-relevant.
Using the experiment results below, compute the p-value and then explain what the p-value does and does not mean in this hypothesis test. Finally, connect the statistical result to a rollout decision, including at least one caveat.
| Item | Control (A) | Treatment (B) |
|---|---|---|
| Users exposed (n) | 120,000 | 120,000 |
| Purchases (x) | 7,800 | 8,160 |
| Observed conversion rate (x/n) | 6.50% | 6.80% |
Notes: The significance level for the test is α = 0.05. Traffic was split 50/50 by user_id hash. Each user is counted once (first exposure only).
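One way to carry out the computation is a pooled two-proportion z-test on the counts in the table. The sketch below is a minimal illustration, not a prescribed method: the choice of a two-sided z-test, the normal approximation, and the function name `two_proportion_z_test` are all assumptions, since the exercise does not name a specific test. It also reports an unpooled 95% confidence interval for the absolute lift, which is often more useful for a rollout decision than the p-value alone.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_a, n_a, x_b, n_b, alpha=0.05):
    """Pooled two-proportion z-test (two-sided) plus an unpooled CI for the lift.

    Assumes independent users and samples large enough for the normal
    approximation; both are assumptions of this sketch.
    """
    p_a = x_a / n_a
    p_b = x_b / n_b
    diff = p_b - p_a

    # Under H0 (no difference) both arms share one conversion rate,
    # so the standard error for the test uses the pooled estimate.
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled

    # Two-sided p-value: probability, assuming H0 is true, of a |z| at
    # least as large as the one observed.
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # The confidence interval for the difference uses the unpooled standard error.
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return z, p_value, ci


# Counts taken from the results table.
z, p_value, (ci_low, ci_high) = two_proportion_z_test(7_800, 120_000, 8_160, 120_000)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
print(f"95% CI for absolute lift: [{ci_low:.4%}, {ci_high:.4%}]")
```

With the table's counts, this gives a z-statistic of roughly 2.95 and a two-sided p-value of roughly 0.003, which is well below α = 0.05 and about an order of magnitude smaller than the 0.03 quoted in the meeting. The interval for the absolute lift spans roughly 0.1 to 0.5 percentage points. Either way, the p-value is the probability of seeing a difference at least this extreme if the two checkouts truly converted at the same rate. It is neither the probability that the new checkout is better nor the share of the lift that is due to randomness.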