Retention Lift Experiment for New Feature

Business Context

StreamHub, a subscription video app, launched a new personalized reminder feature. Product leadership wants to know whether it improves true user retention rather than only creating a short-lived spike in opens or clicks.

Problem Statement

Design and analyze an experiment that tests whether the feature increases 28-day retention. Use 7-day engagement only as a diagnostic metric, not the decision metric.

Given Data

A 6-week randomized controlled experiment was run on newly eligible users.

Group	Users Assigned	7-Day Engaged Users	7-Day Engagement Rate	28-Day Retained Users	28-Day Retention Rate
Control	24,800	10,912	44.0%	8,184	33.0%
Treatment	24,950	11,726	47.0%	8,857	35.5%

Additional design inputs:

Parameter	Value
Significance level	0.05
Power target	0.80
Minimum detectable effect on 28-day retention	1.5 percentage points
Baseline 28-day retention assumption	33.0%

Requirements

Define the primary metric, null hypothesis, and alternative hypothesis.
Explain why 28-day retention should be the primary success metric instead of 7-day engagement.
Test whether the observed 28-day retention lift is statistically significant using a two-proportion z-test.
Compute a 95% confidence interval for the retention lift.
Estimate the required sample size per group for detecting a 1.5 percentage point lift at 80% power.
State whether StreamHub should roll out the feature now, and list at least three experimental design safeguards to ensure the effect is causal.

Assumptions

User-level randomization was implemented correctly.
Each user appears once and outcomes are independent across users.
No major concurrent launches affected retention during the test window.
Retention is defined as at least one active session during days 22-28 after assignment.

Business Context

Problem Statement

Design and analyze an experiment that tests whether the feature increases 28-day retention. Use 7-day engagement only as a diagnostic metric, not the decision metric.

Given Data

A 6-week randomized controlled experiment was run on newly eligible users.

Group	Users Assigned	7-Day Engaged Users	7-Day Engagement Rate	28-Day Retained Users	28-Day Retention Rate
Control	24,800	10,912	44.0%	8,184	33.0%
Treatment	24,950	11,726	47.0%	8,857	35.5%

Additional design inputs:

Parameter	Value
Significance level	0.05
Power target	0.80
Minimum detectable effect on 28-day retention	1.5 percentage points
Baseline 28-day retention assumption	33.0%

Requirements

Define the primary metric, null hypothesis, and alternative hypothesis.
Explain why 28-day retention should be the primary success metric instead of 7-day engagement.
Test whether the observed 28-day retention lift is statistically significant using a two-proportion z-test.
Compute a 95% confidence interval for the retention lift.
Estimate the required sample size per group for detecting a 1.5 percentage point lift at 80% power.
State whether StreamHub should roll out the feature now, and list at least three experimental design safeguards to ensure the effect is causal.

Assumptions

User-level randomization was implemented correctly.
Each user appears once and outcomes are independent across users.
No major concurrent launches affected retention during the test window.
Retention is defined as at least one active session during days 22-28 after assignment.

Business Context

Problem Statement

Design and analyze an experiment that tests whether the feature increases 28-day retention. Use 7-day engagement only as a diagnostic metric, not the decision metric.

Given Data

A 6-week randomized controlled experiment was run on newly eligible users.

Group	Users Assigned	7-Day Engaged Users	7-Day Engagement Rate	28-Day Retained Users	28-Day Retention Rate
Control	24,800	10,912	44.0%	8,184	33.0%
Treatment	24,950	11,726	47.0%	8,857	35.5%

Additional design inputs:

Parameter	Value
Significance level	0.05
Power target	0.80
Minimum detectable effect on 28-day retention	1.5 percentage points
Baseline 28-day retention assumption	33.0%

Requirements

Define the primary metric, null hypothesis, and alternative hypothesis.
Explain why 28-day retention should be the primary success metric instead of 7-day engagement.
Test whether the observed 28-day retention lift is statistically significant using a two-proportion z-test.
Compute a 95% confidence interval for the retention lift.
Estimate the required sample size per group for detecting a 1.5 percentage point lift at 80% power.
State whether StreamHub should roll out the feature now, and list at least three experimental design safeguards to ensure the effect is causal.

Assumptions

User-level randomization was implemented correctly.
Each user appears once and outcomes are independent across users.
No major concurrent launches affected retention during the test window.
Retention is defined as at least one active session during days 22-28 after assignment.

Business Context

Problem Statement

Design and analyze an experiment that tests whether the feature increases 28-day retention. Use 7-day engagement only as a diagnostic metric, not the decision metric.

Given Data

A 6-week randomized controlled experiment was run on newly eligible users.

Group	Users Assigned	7-Day Engaged Users	7-Day Engagement Rate	28-Day Retained Users	28-Day Retention Rate
Control	24,800	10,912	44.0%	8,184	33.0%
Treatment	24,950	11,726	47.0%	8,857	35.5%

Additional design inputs:

Parameter	Value
Significance level	0.05
Power target	0.80
Minimum detectable effect on 28-day retention	1.5 percentage points
Baseline 28-day retention assumption	33.0%

Requirements

Define the primary metric, null hypothesis, and alternative hypothesis.
Explain why 28-day retention should be the primary success metric instead of 7-day engagement.
Test whether the observed 28-day retention lift is statistically significant using a two-proportion z-test.
Compute a 95% confidence interval for the retention lift.
Estimate the required sample size per group for detecting a 1.5 percentage point lift at 80% power.
State whether StreamHub should roll out the feature now, and list at least three experimental design safeguards to ensure the effect is causal.

Assumptions

User-level randomization was implemented correctly.
Each user appears once and outcomes are independent across users.
No major concurrent launches affected retention during the test window.
Retention is defined as at least one active session during days 22-28 after assignment.

Interview Guides

Business Context

Problem Statement

Given Data

Requirements

Assumptions

Retention Lift Experiment for New Feature

Business Context

Problem Statement

Given Data

Requirements

Assumptions

Your Answer

Retention Lift Experiment for New Feature

Business Context

Problem Statement

Given Data

Requirements

Assumptions

Retention Lift Experiment for New Feature

Business Context

Problem Statement

Given Data

Requirements

Assumptions

Your Answer