You are planning an A/B test for a mobile game feature and need to decide how success will be measured. The feature may improve short-term engagement, but there is concern that it could also hurt longer-term player experience or monetization in ways that would be missed if you optimized for only one outcome.
How do you decide which metric should be the primary metric and which metrics should be guardrails in this experiment? Walk through how you would make that choice, how it affects experiment design, and how you would interpret the results if the primary metric improves but a guardrail worsens.