Size a Checkout Conversion Test

Context

ShopNow, a large e-commerce app, wants to test a simplified mobile checkout flow that removes one form step. The team must decide whether to ship it before the holiday season.

Hypothesis Seed

Product believes the shorter flow will increase completed purchases by reducing friction at checkout. However, the change could also increase payment failures or customer support contacts if users become confused.

Constraints

Eligible traffic: 120,000 mobile checkout sessions per day
Only 70% of those sessions reach the payment page and are eligible for the treatment experience (about 84,000 sessions per day)
Maximum experiment duration: 14 days
Allocation can be 50/50 after a small ramp
Baseline purchase conversion from eligible checkout sessions is 12.0%
The business considers a lift smaller than 0.6 percentage points absolute too small to justify engineering and support costs
A false positive is costly because a broken checkout harms revenue immediately; a false negative is acceptable if it avoids shipping a risky flow
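
The traffic constraints above imply a ceiling on how many sessions each arm can collect. The following is a minimal sketch of that arithmetic, assuming the 70% reach rate applies to the 120,000 daily sessions and ignoring the traffic lost during the ramp (a simplifying assumption). The function and variable names are hypothetical.

```python
# Figures taken from the Constraints list; names are illustrative only.
DAILY_CHECKOUT_SESSIONS = 120_000   # mobile checkout sessions per day
PAYMENT_PAGE_REACH = 0.70           # share of sessions that reach the payment page
MAX_DAYS = 14                       # maximum experiment duration
TREATMENT_SHARE = 0.50              # 50/50 allocation (ramp period ignored here)


def exposure_capacity(daily_sessions, reach, days, share):
    """Return (exposed sessions/day, sessions/arm/day, sessions/arm over the test)."""
    exposed_per_day = daily_sessions * reach
    per_arm_per_day = exposed_per_day * share
    per_arm_total = per_arm_per_day * days
    return round(exposed_per_day), round(per_arm_per_day), round(per_arm_total)


exposed_per_day, per_arm_per_day, per_arm_total = exposure_capacity(
    DAILY_CHECKOUT_SESSIONS, PAYMENT_PAGE_REACH, MAX_DAYS, TREATMENT_SHARE
)
print(f"Exposed sessions per day:   {exposed_per_day:,}")
print(f"Sessions per arm per day:   {per_arm_per_day:,}")
print(f"Sessions per arm in {MAX_DAYS} days: {per_arm_total:,}")
```

Under these assumptions each arm can collect at most 588,000 sessions in 14 days. Sessions are not users, so if randomization happens at the user level the effective number of independent units will be smaller.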

Task

Design the experiment end to end. In your answer, cover the following:

State the null and alternative hypotheses, define the primary metric, and choose 2-4 guardrail metrics with thresholds.
Compute the required sample size per arm using an explicit MDE, significance level, and power. Then translate that into expected runtime given the traffic constraints.
Choose the unit of randomization and explain whether your unit of analysis matches it. If not, explain how you will analyze correctly.
Pre-register an analysis plan: statistical test, handling of multiple comparisons, peeking policy, and a clear ship / don’t-ship rule that respects guardrails.
Identify key pitfalls such as novelty effects, sample ratio mismatch, and any SUTVA or interference risks relevant to checkout experiments.

Be specific with assumptions and calculations. If you make a simplifying assumption, state it explicitly.
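
As one way to approach the sample size step, the sketch below applies the standard normal-approximation formula for a two-sided test of two proportions. The 12.0% baseline and the 0.6 percentage point MDE come from the constraints. The significance level of 0.05 and power of 0.80 are illustrative assumptions, not values given in the prompt, and the function name is hypothetical. Treating sessions as independent units is a further simplifying assumption.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, mde_abs, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided two-proportion z-test.

    alpha and power defaults are assumptions chosen for illustration.
    """
    p_treat = p_control + mde_abs
    p_bar = (p_control + p_treat) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value under H0
    z_beta = NormalDist().inv_cdf(power)            # quantile for the target power
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_treat * (1 - p_treat)
    )
    return math.ceil((term_null + term_alt) ** 2 / mde_abs ** 2)


# Baseline and MDE from the Constraints; traffic assumes 70% reach and a 50/50 split.
n_per_arm = sample_size_per_arm(p_control=0.12, mde_abs=0.006)
per_arm_per_day = 120_000 * 0.70 * 0.50
days_needed = math.ceil(n_per_arm / per_arm_per_day)

print(f"Required sessions per arm: {n_per_arm:,}")
print(f"Days at full 50/50 allocation: {days_needed}")
```

With these inputs the requirement comes to roughly 47,000 sessions per arm, well inside the 14-day cap. Because the prompt treats a false positive as costly, a stricter level such as `alpha=0.01` can be passed instead; it raises the required sample size. Running for at least one full week may still be preferable so that day-of-week effects are covered.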
