
You work on a streaming product and are reviewing an A/B test for a new content-discovery experience. The experiment has been running long enough to produce a directional result, but stakeholders are asking whether the readout is trustworthy.
What are the main pitfalls you watch for when interpreting an A/B test, and how do you tell whether they are affecting the result?