You work on a consumer fintech product and your team has built a new growth feature that could affect both directly exposed users and users they interact with. The team is debating whether to launch it to everyone except for a small holdout group, or to run a standard A/B test, because the feature may have persistent and spillover effects.
How would you decide whether to use a holdout group for this growth feature? Walk through what would make a holdout necessary, what metrics and tradeoffs you would evaluate, and how you would design the experiment so the result is credible.