You work on a cloud communications product and want to test a new inbound call-routing policy in the live call queue. The team believes the policy will reduce missed calls and slightly improve answer rate by matching callers to available agents more efficiently. A standard user-level A/B test is risky because agents, queue load, and caller wait times interact across all calls in the same time window. You are considering a switchback test where the routing policy alternates by time block instead of assigning treatment at the individual call level.
How would you design this experiment, including when a switchback test is preferable to a standard A/B test, and how would you define the metrics, power the test, analyze the results, and decide whether to ship?