
You're reviewing an experiment on a social product with several success metrics and guardrails. The team is debating how to interpret mixed results across primary, secondary, and diagnostic metrics, and how to control false positives when many comparisons are made.
How would you think about multiple metrics and multiple comparisons in an experiment?
- Think in terms of AARRR and product metric hierarchies, not a flat list of KPIs
- Representative metrics: Reels 7-day retention, IG Save rate, k-factor, negative feedback, crash rate
- Relevant pitfalls: SRM, novelty effect, and over-reading secondary metrics
- CUPED may improve precision using pre-experiment behavior, but it does not solve multiplicity
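
To make the multiplicity point concrete, here is a minimal sketch of two standard corrections applied to one family of metrics: Holm's step-down procedure (controls the family-wise error rate) and Benjamini-Hochberg (controls the false discovery rate). The function names `holm_reject` and `bh_reject` are hypothetical. The metric names reuse the examples above, and the p-values are invented purely for illustration. Grouping these four metrics into a single family is also an assumption: in practice a team might test the primary metric at the full alpha and treat guardrails separately.

```python
from typing import Dict


def holm_reject(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Holm step-down procedure: controls the family-wise error rate."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    decisions = {name: False for name in p_values}
    for rank, (name, p) in enumerate(ordered):
        # The smallest p-value is compared to alpha/m, the next to alpha/(m-1), ...
        if p <= alpha / (m - rank):
            decisions[name] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return decisions


def bh_reject(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Benjamini-Hochberg step-up procedure: controls the false discovery rate."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    cutoff = 0
    for rank, (_, p) in enumerate(ordered, start=1):
        # Remember the largest rank whose p-value clears its threshold.
        if p <= alpha * rank / m:
            cutoff = rank
    return {name: i < cutoff for i, (name, _) in enumerate(ordered)}


# Invented p-values for illustration only; not real experiment results.
family = {
    "ig_save_rate": 0.004,
    "k_factor": 0.020,
    "negative_feedback": 0.030,
    "crash_rate": 0.200,
}

holm = holm_reject(family)
bh = bh_reject(family)
for metric in family:
    print(f"{metric:18s} p={family[metric]:.3f} holm={holm[metric]} bh={bh[metric]}")
```

With these made-up inputs, three of the four metrics pass an uncorrected 0.05 threshold, Holm keeps only `ig_save_rate`, and Benjamini-Hochberg keeps the first three. Which procedure fits depends on whether the team wants to bound the chance of any false positive (Holm) or the expected share of false positives among flagged metrics (Benjamini-Hochberg). Variance reduction such as CUPED would change the p-values going in, not the need for the correction.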