Business Context
StreamHub is launching a new personalized re-engagement feature intended to improve 30-day user retention. The product team wants an experiment design that can both detect a meaningful lift and support a rollout decision.
Problem Statement
Design and analyze a randomized A/B test to measure whether the feature improves 30-day retention. Use the baseline retention and target lift below to size the experiment, then use the observed experiment results to test whether the feature worked.
Given Data
| Metric | Value |
|---|---|
| Baseline 30-day retention | 24.0% |
| Minimum detectable absolute lift | 1.8 percentage points |
| Significance level | 0.05 |
| Desired power | 80% |
| Control users observed | 18,400 |
| Control retained at day 30 | 4,324 |
| Treatment users observed | 18,650 |
| Treatment retained at day 30 | 4,610 |
Assume a two-sided test and equal traffic allocation during planning.
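The planning inputs above are enough to size the experiment. The following is a minimal sketch, not a prescribed method: it assumes the common normal-approximation formula for a two-proportion z-test (pooled variance under the null, unpooled under the alternative), and the helper name `sample_size_per_group` is hypothetical. Other sizing formulas, such as those based on an arcsine effect size, give slightly different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, abs_lift, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-proportion z-test with equal allocation."""
    p_treat = p_control + abs_lift
    p_bar = (p_control + p_treat) / 2  # average rate under equal allocation
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Pooled variance term under H0, unpooled variance term under H1.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(
        p_control * (1 - p_control) + p_treat * (1 - p_treat)
    )
    return ceil(((null_term + alt_term) / abs_lift) ** 2)


# Baseline 24.0%, minimum detectable absolute lift 1.8 percentage points.
n_per_group = sample_size_per_group(p_control=0.24, abs_lift=0.018)
print(f"Required users per group: {n_per_group:,}")
```

Under these inputs the requirement comes out at roughly 9,000 users per group, so both observed arms (18,400 and 18,650 users) exceed the planned size.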
Requirements
- Define the null and alternative hypotheses for 30-day retention.
- Compute the required sample size per group to detect the stated minimum lift.
- Calculate the observed retention rates in control and treatment.
- Run a two-proportion z-test using the observed data.
- Compute a 95% confidence interval for the retention lift.
- State whether the result is statistically significant and whether it is practically meaningful.
- Briefly describe key experiment design choices: unit of randomization, primary metric, and at least two guardrails.
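The analysis requirements above can be sketched as follows. This is one reasonable approach under stated assumptions rather than the only valid one: the z-test uses the pooled standard error, the confidence interval uses the unpooled (Wald) standard error, and all variable names are illustrative. The hypotheses are H0: treatment retention equals control retention, against the two-sided H1 that they differ.

```python
from math import sqrt
from statistics import NormalDist

# Observed counts from the Given Data table.
n_c, x_c = 18_400, 4_324  # control: users observed, retained at day 30
n_t, x_t = 18_650, 4_610  # treatment: users observed, retained at day 30

p_c, p_t = x_c / n_c, x_t / n_t
lift = p_t - p_c  # absolute lift in 30-day retention

# Two-proportion z-test with the pooled standard error (valid under H0).
p_pool = (x_c + x_t) / (n_c + n_t)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z_stat = lift / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))  # two-sided

# 95% confidence interval for the lift, using the unpooled standard error.
se_unpooled = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
z_crit = NormalDist().inv_cdf(0.975)
ci_low = lift - z_crit * se_unpooled
ci_high = lift + z_crit * se_unpooled

print(f"Control retention:   {p_c:.4f}")
print(f"Treatment retention: {p_t:.4f}")
print(f"Absolute lift:       {lift:.4f}")
print(f"z = {z_stat:.3f}, two-sided p = {p_value:.4f}")
print(f"95% CI for lift: ({ci_low:.4f}, {ci_high:.4f})")
```

With these counts the interval excludes zero, so the result is statistically significant at the 0.05 level. The point estimate, however, falls below the 1.8 percentage point target while the upper end of the interval extends past it, so whether the lift is practically meaningful is a judgment call that depends on the cost of the feature and the value of a retained user.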
Assumptions
- Users are randomized at the user level and exposed consistently to one variant.
- Retention is defined as whether a user is active at least once during days 1-30 after assignment.
- No major concurrent launches materially affect retention during the test window.
- Independence between users is reasonable, and the normal approximation is valid due to large sample sizes.
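Two of these assumptions can be partly checked from the observed counts. The sketch below is illustrative and goes slightly beyond what the problem asks for: the threshold of 10 successes and failures per arm is a common rule of thumb, and the sample ratio mismatch (SRM) check against a 50/50 split, with its 0.001 alert threshold, is an assumed convention and not something stated in the source.

```python
from math import sqrt
from statistics import NormalDist

# (users observed, retained at day 30) per arm, from the Given Data table.
arms = {"control": (18_400, 4_324), "treatment": (18_650, 4_610)}

# Normal approximation: require enough retained and non-retained users per arm.
MIN_COUNT = 10  # assumed rule-of-thumb threshold
normal_ok = all(min(x, n - x) >= MIN_COUNT for n, x in arms.values())

# Sample ratio mismatch: chi-square goodness-of-fit (1 df) against a 50/50 split.
n_c, n_t = arms["control"][0], arms["treatment"][0]
expected = (n_c + n_t) / 2
chi2 = ((n_c - expected) ** 2 + (n_t - expected) ** 2) / expected
# With 1 df, chi-square is a squared standard normal, so the upper tail
# probability equals 2 * (1 - Phi(sqrt(chi2))).
srm_p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))

SRM_ALERT = 0.001  # assumed alert threshold
print(f"Normal approximation reasonable: {normal_ok}")
print(f"SRM chi-square = {chi2:.3f}, p = {srm_p_value:.3f}")
print(f"SRM flagged: {srm_p_value < SRM_ALERT}")
```

A flagged SRM would suggest a problem with assignment or logging and would call the comparison into question before any retention analysis is trusted.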