Business Context
StreamCart, a subscription video platform, tested a redesigned signup onboarding flow intended to increase paid-start conversion. The product manager called the experiment a failure because the primary metric was not significant after 10 days, despite an apparent lift in the dashboard.
Problem Statement
Analyze whether the A/B test truly failed statistically, and explain what should be done next.
Given Data
| Group | Users Exposed | Paid Starts | Conversion Rate |
|---|---|---|---|
| Control (old onboarding) | 52,400 | 6,026 | 11.50% |
| Treatment (new onboarding) | 51,900 | 6,179 | 11.91% |
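As a quick sanity check on the table above, the absolute and relative lifts follow directly from the raw counts (a minimal sketch; variable names are illustrative):

```python
# Observed lift from the StreamCart signup experiment table.
n_control, conv_control = 52_400, 6_026
n_treatment, conv_treatment = 51_900, 6_179

p_control = conv_control / n_control        # 0.1150 (11.50%)
p_treatment = conv_treatment / n_treatment  # ~0.1191 (11.91%)

abs_lift = p_treatment - p_control          # absolute lift, ~0.41 percentage points
rel_lift = abs_lift / p_control             # relative lift, ~3.5%

print(f"absolute lift: {abs_lift * 100:.2f} pp")
print(f"relative lift: {rel_lift * 100:.2f}%")
```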
Additional context:
| Metric | Value |
|---|---|
| Significance level | 0.05 |
| Test type | Two-sided |
| Planned minimum detectable effect | 0.80 percentage points |
| Historical baseline conversion | 11.5% |
| Planned power | 80% |
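The planning parameters above imply a required sample size per arm, which is worth reconstructing before judging the result. This is a sketch using the standard two-proportion sample-size formula; the team's actual planning tool may differ slightly in its approximation:

```python
# Sample size implied by the planning parameters:
# baseline 11.5%, MDE 0.80 pp, alpha 0.05 (two-sided), power 80%.
from math import ceil, sqrt
from statistics import NormalDist

p1 = 0.115            # historical baseline conversion
mde = 0.008           # planned minimum detectable effect (0.80 pp)
p2 = p1 + mde
alpha, power = 0.05, 0.80

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96
z_beta = NormalDist().inv_cdf(power)            # ~0.84

p_bar = (p1 + p2) / 2
n_per_arm = ceil(
    (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
     + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    / mde ** 2
)
print(n_per_arm)  # ~25,700 per arm; both observed arms (~52k) exceed this
```

Under these assumptions the experiment collected more than enough users per arm to detect the planned 0.80 pp effect, so a lack of significance would not be explained by underpowering against the MDE.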
Requirements
- State the null and alternative hypotheses for the conversion-rate comparison.
- Compute the observed absolute lift and relative lift.
- Run a two-proportion z-test using the pooled standard error.
- Calculate the two-sided p-value and a 95% confidence interval for the treatment-control difference.
- Decide whether the result is statistically significant at α=0.05.
- Explain why the test may be considered a "failed" experiment even if treatment is directionally positive.
- Recommend what the team should do afterward: ship, stop, rerun, or redesign the experiment.
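The analytical steps above can be carried out with only the Python standard library. This is a worked sketch, not the team's actual analysis pipeline; with these inputs the z statistic lands just above the 1.96 cutoff:

```python
# Pooled two-proportion z-test, two-sided p-value, and 95% CI
# for the treatment-minus-control conversion difference.
from math import sqrt
from statistics import NormalDist

n_c, x_c = 52_400, 6_026    # control: exposed users, paid starts
n_t, x_t = 51_900, 6_179    # treatment: exposed users, paid starts

p_c, p_t = x_c / n_c, x_t / n_t
diff = p_t - p_c

# Standard error pooled under H0: p_t == p_c
p_pool = (x_c + x_t) / (n_c + n_t)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# The 95% CI for the difference conventionally uses the unpooled SE
se_unpooled = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"z = {z:.2f}, p = {p_value:.3f}, "
      f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Note that the pooled SE is the correct choice for the test statistic (it assumes the null of equal rates), while the unpooled SE is the usual choice for the confidence interval around the observed difference.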
Assumptions
- Users were randomly assigned at the user level.
- Each exposed user is counted once.
- No sample-ratio mismatch or instrumentation bug is present.
- The normal approximation is appropriate because both groups have large sample sizes and many conversions/non-conversions.