Context
FitLoop, a fitness subscription app, wants to redesign its referral invite flow to drive user growth. The growth team believes a simpler invite screen with pre-filled contact suggestions will increase successful referrals, but they are debating which metric should be the primary success metric.
Hypothesis Seed
The new referral flow reduces friction at invite time, so more exposed users should generate successful referred sign-ups. However, optimizing a shallow metric like invite-button clicks could create spammy behavior or low-quality referrals that do not activate.
Constraints
- Eligible traffic: 120,000 existing active users per day can see the referral flow
- Average baseline successful referral rate: 8.0% of exposed users generate at least one referred sign-up within 7 days
- Maximum experiment runtime: 21 days, including a 2-day ramp
- Business cost asymmetry: a false positive is expensive because spammy invites can hurt brand trust and paid acquisition efficiency. A false negative is acceptable as long as the team can iterate next sprint
- The team needs a decision by the end of the month and cannot run more than one confirmatory experiment before launch planning
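To make the traffic and baseline figures above concrete, here is a minimal sketch of a per-arm sample size calculation for a two-sided, two-proportion z-test. The 5% relative MDE, alpha = 0.01, and 80% power are illustrative assumptions, not values given in the brief; the stricter alpha is one way to reflect the stated cost of a false positive. The sketch also assumes the 120,000 daily users are distinct, newly exposed users split across two arms at full allocation, and it does not model the 2-day ramp.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_base, rel_mde, alpha=0.01, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (normal approximation)."""
    p_new = p_base * (1 + rel_mde)      # treatment rate at the assumed MDE
    p_bar = (p_base + p_new) / 2        # pooled rate under the null
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * math.sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))
    return math.ceil((null_term + alt_term) ** 2 / (p_new - p_base) ** 2)


def enrollment_days(n_per_arm, daily_users, arms=2):
    """Days of enrollment needed, assuming every daily user is new to the test."""
    return math.ceil(arms * n_per_arm / daily_users)


# Baseline and traffic come from the constraints; the MDE is an assumption.
n = sample_size_per_arm(p_base=0.08, rel_mde=0.05)
days = enrollment_days(n, daily_users=120_000)
total_days = days + 7  # add the 7-day window used to measure referred sign-ups

print(f"per-arm n: {n}, enrollment days: {days}, total days: {total_days}")
```

Because the outcome is measured over 7 days, the last enrolled cohort needs a full 7-day window before analysis, which is why the window is added to the enrollment period. If daily actives overlap heavily across days, the number of distinct users accrues more slowly and enrollment would take longer than this estimate.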
Deliverables
- Define the primary success metric, explain why it is better than obvious alternatives (e.g., invite clicks or raw invite sends), and specify 2-4 guardrail metrics.
- State the null and alternative hypotheses, choose a minimum detectable effect (MDE), and calculate the required sample size and expected duration using the provided traffic and baseline.
- Choose the unit of randomization and explain whether the unit of analysis differs from it.
- Pre-register an analysis plan: statistical test, peeking policy, multiple-comparisons policy, and how you will handle sample ratio mismatch or interference risks.
- Give a clear ship / do-not-ship / iterate rule that respects both the primary metric and guardrails.
Be explicit about why your chosen primary metric best captures durable growth rather than superficial engagement in the referral funnel.
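The sample ratio mismatch check named in the analysis-plan deliverable can be sketched as a chi-square goodness-of-fit test on the assignment counts. This is a minimal illustration using only the standard library. The function name, the 50/50 expected split, and the 0.001 alert threshold are assumptions for the example, not requirements from the brief.

```python
import math


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the split."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts for illustration only.
SRM_THRESHOLD = 0.001  # assumed alert level; pick and pre-register your own
p = srm_p_value(n_control=50_000, n_treatment=51_000)
if p < SRM_THRESHOLD:
    print(f"Possible SRM (p = {p:.5f}): investigate assignment before analysis")
else:
    print(f"No SRM flagged at the assumed threshold (p = {p:.5f})")
```

A flagged mismatch usually points to a problem in assignment or logging, so the pre-registered plan would typically pause interpretation of the primary metric until the cause is found, instead of adjusting for the imbalance after the fact.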