
You're reviewing an experiment result on a social product surface. The readout shows a statistically significant lift on the primary metric, but the effect looks small and there are questions about durability and tradeoffs on nearby metrics.
How would you decide whether a result is practically significant even if it is statistically significant?
Primary metric example: Instagram Save rate on ReelsExperiment readout may use CUPED-adjusted estimatesCheck guardrails such as Reels watch time, shares, and retentionWatch for novelty effect before recommending launch