Context
At FitPulse, a subscription fitness app, the growth team wants to test a redesigned onboarding flow that adds a personalized goal-setting step before the paywall. Leadership wants confidence that any measured lift is trustworthy enough to justify a full rollout.
Hypothesis Seed
The team believes the extra personalization will increase trial-start conversion because users will better understand the value of the app before seeing pricing. However, the added step could also increase drop-off, delay activation, or create spillovers if invited household members discuss the new flow.
Constraints
- Eligible traffic: 120,000 new onboarding users per day
- 70% iOS, 30% Android
- Maximum experiment duration: 14 days
- Randomization must be decided before launch; engineering can support user-level or household-level assignment, but not both
- False positives are costly because onboarding changes require legal review and app-store resubmission
- False negatives are also meaningful because Q3 growth targets depend on improving paid conversion
- Baseline trial-start conversion from onboarding is 24%
- The smallest business-relevant lift is 4% relative (24% → 24.96% absolute, roughly a one-percentage-point increase)
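A quick feasibility check under these constraints can be sketched with the standard two-proportion normal approximation. Note that α = 0.05 (two-sided) and 80% power are assumed conventions here, not values given in the brief:

```python
from statistics import NormalDist

# Assumed test parameters (not specified in the brief)
ALPHA = 0.05   # two-sided significance level
POWER = 0.80   # 1 - beta

p1 = 0.24          # baseline trial-start conversion
p2 = p1 * 1.04     # 4% relative lift -> 0.2496 absolute
delta = p2 - p1    # absolute MDE, ~0.96 percentage points

z_alpha = NormalDist().inv_cdf(1 - ALPHA / 2)
z_beta = NormalDist().inv_cdf(POWER)

# Pooled-variance form of the two-proportion sample-size formula
p_bar = (p1 + p2) / 2
n_per_arm = (
    (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
     + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
) / delta ** 2

daily_per_arm = 120_000 / 2   # assuming a 50/50 split of eligible traffic
days_needed = n_per_arm / daily_per_arm

print(f"n per arm = {n_per_arm:,.0f}")       # roughly 31-32k users per arm
print(f"days to reach n = {days_needed:.2f}")
```

Under these assumptions, traffic is not the binding constraint: the sample is reached in well under a day, so the duration choice is driven more by capturing at least one full weekly cycle than by raw sample size.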
Deliverables
- Define clear null and alternative hypotheses, and state whether you would use a one-sided or two-sided test.
- Specify the primary metric, 2-4 guardrail metrics, and any secondary metrics. Include the unit of analysis and an explicit MDE.
- Calculate the required sample size per arm and estimate whether the test can finish within 14 days given available traffic.
- Choose the unit of randomization, allocation strategy, duration, and any stratification. Explain why this design makes the experiment trustworthy.
- Pre-register an analysis plan: statistical test, peeking policy, multiple-comparison policy, SRM checks, and how you would handle novelty effects, network interference, or SUTVA concerns.
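The SRM check called for in the analysis plan can be sketched as a chi-square goodness-of-fit test on assignment counts. The 0.001 alert threshold below is a common industry convention, assumed here rather than taken from the brief:

```python
import math

def srm_check(n_control: int, n_treatment: int,
              expected_ratio: float = 0.5,
              alert_p: float = 0.001) -> bool:
    """Return True if a sample-ratio mismatch is detected.

    Chi-square goodness-of-fit with 1 degree of freedom against the
    planned allocation; for 1 df the p-value is erfc(sqrt(stat / 2)).
    """
    total = n_control + n_treatment
    expected_c = total * expected_ratio
    expected_t = total * (1 - expected_ratio)
    stat = ((n_control - expected_c) ** 2 / expected_c
            + (n_treatment - expected_t) ** 2 / expected_t)
    p_value = math.erfc(math.sqrt(stat / 2))
    return p_value < alert_p

# A 50.4/49.6 split on ~840k users (7 days of traffic) is already suspicious:
print(srm_check(423_360, 416_640))  # True -> halt and debug assignment
```

A triggered SRM check should stop the analysis outright: a skewed split means the randomization itself is broken, and no downstream result is trustworthy.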
Your answer should end with a concrete ship / don’t-ship / iterate rule that respects guardrails, not just the primary metric.
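One shape such a rule could take is a two-proportion z-test on the primary metric gated by a guardrail veto. The α, MDE gate, and the boolean guardrail summary below are illustrative assumptions, not part of the brief:

```python
from statistics import NormalDist

def two_prop_z_pvalue(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p2 - p1) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(x_ctrl: int, n_ctrl: int, x_trt: int, n_trt: int,
           guardrails_ok: bool,
           alpha: float = 0.05, mde_rel: float = 0.04) -> str:
    """Ship / don't-ship / iterate rule that respects guardrails."""
    p_ctrl, p_trt = x_ctrl / n_ctrl, x_trt / n_trt
    lift = (p_trt - p_ctrl) / p_ctrl
    p_value = two_prop_z_pvalue(x_ctrl, n_ctrl, x_trt, n_trt)
    if not guardrails_ok:
        return "don't ship"   # a guardrail breach vetoes any primary lift
    if p_value < alpha and lift >= mde_rel:
        return "ship"
    if p_value < alpha and lift > 0:
        return "iterate"      # real but sub-MDE lift: refine the flow
    return "don't ship"

# A significant 5% relative lift with healthy guardrails clears the bar:
print(decide(7_560, 31_500, 7_938, 31_500, guardrails_ok=True))   # ship
# The same lift with a breached guardrail does not:
print(decide(7_560, 31_500, 7_938, 31_500, guardrails_ok=False))  # don't ship
```

The key property to preserve in any variant of this rule is the ordering: guardrails are checked first and can only veto, never rescue, the primary result.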