Decide Whether to Rerun

Context

StreamCart, a grocery delivery app, recently tested a redesigned checkout page intended to reduce friction and increase completed orders. The first experiment finished without a statistically significant result, and leadership is asking whether the team should rerun it or move on.

Hypothesis Seed

The redesign shortens the checkout flow from 4 steps to 3 and surfaces saved payment methods earlier. Product believes this should improve checkout completion, but the prior test produced a small positive point estimate with a wide confidence interval, so it supported no clear decision.

Constraints

Eligible traffic: 180,000 checkout starts per day
Maximum additional runtime if rerun: 14 days
Prior baseline checkout completion rate: 48%
Small engineering cost to rerun, but a false positive is expensive because a worse checkout experience can reduce revenue and trust
The team wants a decision framework for when an inconclusive result justifies a rerun versus when it indicates the test was underpowered, poorly designed, or simply not impactful enough
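The traffic, runtime, and baseline figures above bound what a rerun can detect. The following is a minimal sketch of that arithmetic, not a prescribed solution. It assumes a two-sided two-proportion z-test, alpha = 0.05, 80% power, a 50/50 split, and an illustrative absolute MDE of 0.5 percentage points. None of these values come from the problem statement, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Figures taken from the constraints.
DAILY_STARTS = 180_000
MAX_DAYS = 14
BASELINE = 0.48


def required_n_per_arm(p1, mde_abs, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided two-proportion z-test.

    alpha, power, and mde_abs are assumptions chosen for illustration.
    Treats each checkout start as an independent observation; if the
    randomization unit is the user, repeat starts would inflate variance.
    """
    p2 = p1 + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / mde_abs ** 2)


def days_needed(n_per_arm, daily=DAILY_STARTS, arms=2):
    """Whole days of eligible traffic needed to fill all arms."""
    return math.ceil(n_per_arm * arms / daily)


def detectable_effect(n_per_arm, p1=BASELINE, alpha=0.05, power=0.80):
    """Approximate absolute MDE for a given per-arm sample size.

    Uses the baseline variance for both arms, which is a close
    approximation when the effect is small.
    """
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z_total * math.sqrt(2 * p1 * (1 - p1) / n_per_arm)


n = required_n_per_arm(BASELINE, mde_abs=0.005)
full_window_n = DAILY_STARTS * MAX_DAYS // 2  # per arm over 14 days
print(n, days_needed(n), detectable_effect(full_window_n))
```

Under these assumptions the requirement is roughly 157,000 checkout starts per arm, which the stated traffic covers in under two days, and the full 14-day window could detect an absolute effect of roughly 0.18 percentage points. A run should still span whole weeks to cover day-of-week patterns in grocery ordering.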

Task

Define the null and alternative hypotheses, the primary metric, 2-4 guardrails, and an explicit MDE that would justify shipping.
Calculate the required sample size and duration for a rerun using the stated traffic, showing the math and explaining whether the prior inconclusive result should trigger a rerun.
Propose the experiment design: unit of randomization, allocation, duration, and any stratification or variance-reduction choices.
Pre-register the analysis plan, including the statistical test, peeking policy, multiple-comparisons policy, and how you would diagnose issues such as sample ratio mismatch.
Give a clear ship / don’t ship / rerun decision rule. Explain what you would do if the rerun is statistically significant but the observed lift is smaller than the MDE, or if the primary metric improves while guardrails worsen.
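One way to make the sample ratio mismatch check and the decision rule concrete is sketched below. It is an illustrative assumption rather than the expected answer: the SRM threshold of 0.001, the use of a confidence interval on the absolute lift, and the names `srm_p_value` and `decide` are all hypothetical choices.

```python
import math


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for SRM.

    For 1 degree of freedom the chi-square survival function equals
    erfc(sqrt(x / 2)), so no external library is needed.
    """
    total = n_control + n_treatment
    expected_t = total * expected_treatment_share
    expected_c = total - expected_t
    chi2 = ((n_control - expected_c) ** 2 / expected_c
            + (n_treatment - expected_t) ** 2 / expected_t)
    return math.erfc(math.sqrt(chi2 / 2))


def decide(point, ci_low, ci_high, mde, guardrails_ok, srm_p,
           srm_threshold=0.001):
    """Hypothetical pre-registered rule on the absolute lift in completion.

    point, ci_low, ci_high: estimate and confidence bounds for
    treatment minus control. mde: the smallest lift worth shipping.
    """
    if srm_p < srm_threshold:
        return "invalid: investigate SRM before reading results"
    if not guardrails_ok:
        return "don't ship"  # a primary-metric win does not offset guardrail harm
    if ci_low > 0 and point >= mde:
        return "ship"
    if ci_low > 0:
        return "hold: significant but below MDE, needs a business call"
    if ci_high < mde:
        return "don't ship"  # any effect is smaller than the MDE
    return "rerun"  # interval still spans both zero and the MDE


print(decide(point=0.006, ci_low=0.003, ci_high=0.009, mde=0.005,
             guardrails_ok=True, srm_p=srm_p_value(630_000, 630_000)))
```

The last branch captures the rerun case in the prompt: an interval that contains both zero and the MDE suggests the test was underpowered, whereas an interval that sits entirely below the MDE suggests the change is not impactful enough and more data would not change the decision.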
