Business Context
Motive tested a redesigned Driver App trip-completion flow intended to increase the share of drivers who complete setup and log their first trip. The headline results looked positive, but the Growth team wants to know whether the lift is actually trustworthy and which pitfalls could make the test misleading.
Problem Statement
You are given headline A/B test results plus several experiment diagnostics. Assess whether the treatment truly improved conversion, quantify the statistical evidence, and identify the main pitfalls that could invalidate a naive conclusion.
Given Data
| Metric | Control | Treatment |
|---|---|---|
| Assigned users | 40,000 | 35,000 |
| Users who saw assigned variant | 39,200 | 31,500 |
| First-trip conversions | 4,312 | 3,780 |
| Observed conversion rate (among assigned users) | 10.78% | 10.80% |
| Observed conversion rate (among exposed users) | 11.00% | 12.00% |
| Day 1-3 conversion rate | 10.6% | 12.1% |
| Day 4-14 conversion rate | 10.8% | 10.2% |
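The reported rates can be reproduced directly from the raw counts, which is a useful first sanity check before any testing. A minimal sketch (note it assumes all conversions occurred among exposed users, which is what makes both denominators consistent with the same conversion counts):

```python
# Sanity-check the reported conversion rates against the raw counts.
# Conversion counts are shared between the assigned and exposed views,
# assuming every converter was exposed to their assigned variant.
assigned = {"control": (4_312, 40_000), "treatment": (3_780, 35_000)}
exposed = {"control": (4_312, 39_200), "treatment": (3_780, 31_500)}

for view, groups in [("assigned", assigned), ("exposed", exposed)]:
    for arm, (conversions, n) in groups.items():
        print(f"{view:8s} {arm:9s}: {conversions / n:.2%}")
```

The assigned-user rates (10.78% vs 10.80%) are nearly identical, while the exposed-user rates (11.00% vs 12.00%) show a gap, which already hints that the apparent lift depends on conditioning on exposure.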
Additional diagnostics:
- Planned traffic split was 50/50.
- The team checked results daily and almost stopped on day 3 because treatment looked significant.
- Three secondary metrics were also reviewed: activation, 7-day retention, and support-contact rate.
- Some fleets onboard drivers in batches, so drivers from the same fleet may influence each other.
- Significance level: 0.05.
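The second diagnostic (checking results daily and nearly stopping early) inflates the false-positive rate well above the nominal 0.05. A minimal A/A simulation of daily peeking illustrates this; all parameters here (daily sample size, base rate, number of simulations) are illustrative choices, not values from the experiment, and daily conversion counts are drawn from a normal approximation to the binomial:

```python
import math
import random

def z_stat(c1, n1, c2, n2):
    """Pooled two-proportion z-statistic."""
    p1, p2 = c1 / n1, c2 / n2
    p = (c1 + c2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

random.seed(42)
DAYS, DAILY_N, P, SIMS = 14, 1_000, 0.10, 500
sd = math.sqrt(DAILY_N * P * (1 - P))  # binomial SD of a daily count

false_positives = 0
for _ in range(SIMS):
    c1 = c2 = n = 0
    for _day in range(DAYS):
        # Both arms share the same true rate (A/A test), so any
        # "significant" result here is a false positive by construction.
        c1 += max(0, round(random.gauss(DAILY_N * P, sd)))
        c2 += max(0, round(random.gauss(DAILY_N * P, sd)))
        n += DAILY_N
        if abs(z_stat(c1, n, c2, n)) > 1.96:  # daily peek at alpha = 0.05
            false_positives += 1
            break

rate = false_positives / SIMS
print(f"A/A false-positive rate with daily peeking: {rate:.1%}")
```

With 14 daily looks, the realized type-I error is typically several times the nominal 5%, which is exactly why the near-stop on day 3 is a red flag.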
Requirements
- Test whether the assigned-user conversion difference is statistically significant using a two-proportion z-test.
- Compute a 95% confidence interval for the treatment-control difference.
- Check whether the traffic allocation suggests a sample ratio mismatch.
- Explain at least four pitfalls that could make this A/B test misleading.
- State whether you would recommend rollout, rerun, or further investigation.
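The z-test and confidence-interval requirements can be sketched with the stdlib alone, using the assigned-user (intent-to-treat) counts from the table; the pooled standard error is used for the test and the unpooled one for the interval, which is the conventional pairing:

```python
import math

# Assigned-user (intent-to-treat) counts from the table
n_c, x_c = 40_000, 4_312   # control
n_t, x_t = 35_000, 3_780   # treatment

p_c, p_t = x_c / n_c, x_t / n_t
diff = p_t - p_c

# Two-proportion z-test with a pooled standard error under H0: p_c == p_t
p_pool = (x_c + x_t) / (n_c + n_t)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z = diff / se_pooled
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

# 95% CI for the difference uses the unpooled standard error
se_unpooled = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
ci = (diff - 1.96 * se_unpooled, diff + 1.96 * se_unpooled)

print(f"diff = {diff:+.4%}, z = {z:.3f}, p = {p_value:.3f}")
print(f"95% CI: [{ci[0]:+.4%}, {ci[1]:+.4%}]")
```

On these counts the difference is tiny, z is far below 1.96, and the interval comfortably straddles zero, so the intent-to-treat analysis shows no significant effect.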
Assumptions
- Use assigned users as the primary intent-to-treat analysis unless otherwise noted.
- Treat first-trip conversion as a Bernoulli outcome.
- For the sample-ratio check, assume the null expectation is a 50/50 split.
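Under the stated 50/50 null, the sample-ratio check is a chi-square goodness-of-fit test on the assignment counts; for one degree of freedom the tail probability reduces to a normal tail, so no external library is needed:

```python
import math

assigned = {"control": 40_000, "treatment": 35_000}
total = sum(assigned.values())
expected = total / 2  # null hypothesis: 50/50 split

# Chi-square goodness-of-fit statistic with 1 degree of freedom
chi2 = sum((obs - expected) ** 2 / expected for obs in assigned.values())

# For 1 df, chi2 = Z^2, so P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x/2))
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

The statistic is enormous relative to the 3.84 critical value, so the 40,000 vs 35,000 split is essentially impossible under a healthy 50/50 randomizer; this sample ratio mismatch alone is grounds to distrust the experiment until the assignment pipeline is debugged.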