Context
Affirm is testing a redesigned monthly payment messaging module on the merchant checkout page. The growth team believes clearer installment messaging will increase completed applications, but they want a rigorous experiment plan that explicitly accounts for common analysis pitfalls.
Hypothesis Seed
The treatment replaces the current generic payment copy with a more prominent Affirm-specific message showing estimated monthly payments and a stronger CTA. The team expects this to increase the rate at which eligible checkout visitors start and complete an Affirm application, without increasing credit-risk proxies or causing downstream repayment-quality issues.
Constraints
- Checkout traffic: 180,000 visitors per day across participating merchants
- Only 60% of those visitors are shown an Affirm financing option and are eligible for the experiment
- Maximum experiment runtime: 14 days before the merchant-launch calendar forces a decision
- Randomization must keep each user in a single variant across repeated visits wherever possible
- A false positive is costly because it could push a worse checkout experience to many merchants; a false negative is also costly because the Q3 growth target depends on improving Affirm checkout conversion
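One common way to meet the repeat-visitor constraint is deterministic, hash-based assignment keyed on a stable user identifier. The sketch below is illustrative only and is not drawn from the brief: the `user_id` key, the `EXPERIMENT_SALT` value, and the 50/50 split are all assumptions.

```python
import hashlib

# Hypothetical salt: a distinct value per experiment keeps assignments
# independent of any other experiment that hashes the same user IDs.
EXPERIMENT_SALT = "monthly-payment-messaging-v1"


def assign_variant(user_id: str, treatment_share: float = 0.5) -> str:
    """Map a stable user ID to the same variant on every visit."""
    key = f"{EXPERIMENT_SALT}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Use the first 32 bits of the hash as a uniform draw in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

Because the assignment depends only on the salt and the identifier, a returning user lands in the same arm without any stored state. Visitors who lack a stable identifier (for example, a logged-out user on a new device) can still cross arms, which is why the constraint can only be met "when possible".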
Deliverables
- Define the experiment hypothesis, the primary metric, and 2-4 guardrail metrics. State the baseline and an explicit minimum detectable effect (MDE).
- Calculate the required sample size per arm and determine whether the test can be completed within 14 days given the available eligible traffic.
- Choose the unit of randomization, allocation, duration, and any stratification. Explain how your design handles repeat visitors and merchant heterogeneity.
- Pre-register an analysis plan: statistical test, treatment of secondary metrics, peeking policy, and how you will diagnose or mitigate pitfalls such as novelty effects, network interference, SUTVA violations, and sample ratio mismatch.
- State a clear ship / don’t ship / iterate rule that respects both the primary metric and guardrails.
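Of the pitfalls listed above, sample ratio mismatch is the easiest to check mechanically. A minimal sketch follows, assuming a planned 50/50 allocation and a chi-square goodness-of-fit test with one degree of freedom. The function name and the 0.001 alert threshold are illustrative assumptions, not requirements from the brief.

```python
from math import erfc, sqrt


def srm_p_value(n_control: int, n_treatment: int, treatment_share: float = 0.5) -> float:
    """P-value of a chi-square goodness-of-fit test against the planned split."""
    total = n_control + n_treatment
    expected_treatment = total * treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


# Assumed alert threshold; a strict value keeps false alarms rare.
SRM_ALPHA = 0.001


def has_srm(n_control: int, n_treatment: int) -> bool:
    return srm_p_value(n_control, n_treatment) < SRM_ALPHA
```

A flagged mismatch usually points to a problem in assignment or logging rather than to a treatment effect, so the primary-metric readout should be treated as unreliable until the cause is found.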
Assume the current baseline checkout-to-Affirm application start rate among eligible visitors is 12.0%. Use α = 0.05, 80% power, and target an MDE of 5% relative lift unless you justify a different choice.
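The stated parameters can be turned into a rough sample size with the standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided test, equal allocation, and unpooled variances. It also assumes the 60% eligibility rate applies to the 180,000 daily checkout visitors, giving about 108,000 eligible visitors per day. Function and variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, relative_mde: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per arm for a two-sided, two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # 12.0% with a 5% relative lift -> 12.6%
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


n_per_arm = sample_size_per_arm(baseline=0.12, relative_mde=0.05)

# Assumption: 60% of the 180,000 daily checkout visitors are eligible.
eligible_per_day = 180_000 * 0.60
days_needed = ceil(2 * n_per_arm / eligible_per_day)

print(f"Required per arm: {n_per_arm:,}")
print(f"Days of eligible traffic needed at a 50/50 split: {days_needed}")
```

Under these assumptions the requirement comes to roughly 47,000 visitors per arm, which is less than one day of eligible traffic. Raw volume is therefore unlikely to be the binding constraint within the 14-day window. Note that daily visitor counts are not unique users, so a user-level design will accumulate sample more slowly than this arithmetic suggests.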