You work on a music-creation platform and want to test a redesigned sample-discovery experience that makes new sounds more visually prominent in the browse surface. The team believes the new design will increase the rate at which users preview and save samples, but worries that any early lift may come from curiosity rather than sustained value. You need to design an experiment that can separate a true product improvement from a short-lived novelty effect.
How would you design and analyze this experiment so that you can detect a meaningful lift while accounting for novelty effect risk, and what would your ship decision rule be if early results are strong but decay over time?