You are testing an operational change in a delivery marketplace, such as a new dispatch or batching policy, that is intended to improve efficiency. The change may improve the primary metric, but it could also create unintended side effects for customers, dashers, or merchants.
What guardrail metrics would you use when testing this operational change, and how would you decide whether the change is safe to ship even if the primary metric improves?
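
To make the "safe to ship" part of the question concrete, here is a minimal sketch of one possible decision rule: treat each guardrail as a non-inferiority check and ship only if the primary metric improved and no guardrail could plausibly have degraded by more than a pre-agreed margin. The `Guardrail` fields, the `margin` values, the 5% one-sided `alpha`, and the normal approximation are all assumptions made for illustration, not something the question prescribes.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass
class Guardrail:
    """Summary statistics for one guardrail metric (hypothetical structure)."""
    name: str
    higher_is_worse: bool  # True for e.g. a late-delivery rate
    margin: float          # largest tolerated degradation, in metric units
    mean_c: float          # control mean
    mean_t: float          # treatment mean
    var_c: float           # control variance
    var_t: float           # treatment variance
    n_c: int               # control sample size
    n_t: int               # treatment sample size


def guardrail_passes(g: Guardrail, alpha: float = 0.05) -> bool:
    """Non-inferiority check under a normal approximation.

    Passes only if the one-sided upper confidence bound on the
    degradation stays below the tolerated margin.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    # Orient the difference so that a positive value means "got worse".
    degradation = g.mean_t - g.mean_c
    if not g.higher_is_worse:
        degradation = -degradation
    se = sqrt(g.var_t / g.n_t + g.var_c / g.n_c)
    worst_plausible = degradation + z * se
    return worst_plausible < g.margin


def safe_to_ship(primary_improved: bool, guardrails: list[Guardrail],
                 alpha: float = 0.05) -> tuple[bool, list[str]]:
    """Return (decision, names of guardrails that failed)."""
    failed = [g.name for g in guardrails if not guardrail_passes(g, alpha)]
    return primary_improved and not failed, failed
```

In this framing, a guardrail with a wide confidence interval fails just as a clearly degraded one does, so an underpowered test is not read as evidence of safety. In a marketplace, dispatch changes can also cause interference between treatment and control units, which this simple two-sample approximation does not account for.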