Diagnose SRM in Checkout Test

Context

ShopNow is testing a simplified mobile checkout flow intended to increase completed purchases. Midway through the experiment, the team notices that treatment is receiving materially less traffic than expected and asks whether the experiment is still trustworthy.

Hypothesis Seed

The new checkout removes one confirmation step and auto-fills saved shipping information. Product expects a modest lift in purchase conversion, but an allocation bug or logging issue could create sample ratio mismatch (SRM), making any treatment effect estimate invalid.

Constraints

Eligible traffic: 240,000 mobile checkout starters per day
Planned allocation: 50/50 control vs treatment
Maximum runtime: 14 days
Baseline purchase conversion from checkout start: 24%
Smallest business-relevant lift: 2% relative (24.00% to 24.48%, or 0.48 percentage points absolute)
False positives are costly because a broken checkout harms revenue immediately; false negatives are acceptable if they only delay launch by one sprint
The team wants daily monitoring for SRM, but no repeated significance testing on the primary metric before the pre-registered readout
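
The constraints above fix the baseline, the minimum detectable effect, and the daily traffic, so the required sample size can be sketched directly. The following is a minimal sketch, not a prescribed solution. It assumes a two-sided two-proportion z-test with the normal approximation, and it assumes alpha = 0.05 and power = 0.80, neither of which is given in the prompt. It also assumes that each daily checkout starter is a new randomization unit. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (normal approximation).

    alpha and power are ASSUMED defaults; the prompt does not specify them.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)   # treatment rate at the MDE
    p_bar = (p1 + p2) / 2                 # pooled rate under the null
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def runtime_days(n_per_arm, daily_traffic, arms=2):
    """Whole days of eligible traffic needed to fill every arm."""
    return math.ceil(arms * n_per_arm / daily_traffic)


# Values from the Constraints list: 24% baseline, 2% relative lift,
# 240,000 eligible checkout starters per day, 50/50 split.
n = sample_size_per_arm(baseline=0.24, relative_lift=0.02)
days = runtime_days(n, daily_traffic=240_000)
print(f"per arm: {n:,}  total: {2 * n:,}  days of traffic: {days}")
```

Under these assumed settings the requirement is on the order of 125,000 starters per arm, which the stated traffic supplies in about two days, well inside the 14-day cap. Because false positives are described as costly, passing a stricter alpha (for example 0.01) is a reasonable variation to try.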

Deliverables

State the null and alternative hypotheses for both the product effect and the SRM diagnostic, and explain how you would identify SRM in the experiment dataset.
Define the primary metric, 2-4 guardrails, and at least one secondary metric. Include the unit of randomization and unit of analysis.
Calculate the required sample size for the primary metric using explicit assumptions for baseline, MDE, alpha, and power, then translate that into expected runtime given available traffic.
Pre-register an analysis plan covering the statistical test, peeking policy, multiple-comparison treatment, and what happens if SRM is detected at any point.
Give a clear ship / don’t-ship / investigate decision rule that respects guardrails and explains why SRM can invalidate otherwise significant results.
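
One common way to run the SRM diagnostic named in the deliverables is a chi-square goodness-of-fit test of the observed arm counts against the planned 50/50 split. The sketch below is an illustration under stated assumptions, not the expected answer. The counts are invented for the example, the 0.001 alert threshold is an assumed convention rather than something the prompt specifies, and `srm_check` is a hypothetical helper name.

```python
import math


def srm_check(n_control, n_treatment, control_share=0.5, threshold=0.001):
    """Chi-square goodness-of-fit test for sample ratio mismatch.

    H0: units are assigned in the planned ratio (control_share).
    H1: the observed assignment ratio differs from the plan.
    threshold is an ASSUMED alert level, not taken from the prompt.
    Returns (chi2 statistic, p-value, flagged).
    """
    total = n_control + n_treatment
    expected_control = total * control_share
    expected_treatment = total * (1 - control_share)
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # With two arms there is 1 degree of freedom, and the chi-square
    # survival function reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < threshold


# Hypothetical cumulative counts of randomized units (illustration only).
chi2, p, flagged = srm_check(n_control=120_000, n_treatment=117_000)
print(f"chi2={chi2:.2f}  p={p:.2e}  SRM flagged: {flagged}")
```

Counting assigned units rather than logged events or sessions keeps the check aligned with the unit of randomization. A flagged result argues for pausing interpretation of the primary metric until the cause of the imbalance is found, since the arms may no longer be comparable populations.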

