Context
PulseChat, a consumer messaging app, is testing a new feature called Quick Reactions that lets users respond to messages with one-tap animated reactions. Product leadership wants to know whether the feature creates durable engagement or just a short-lived novelty spike.
Hypothesis Seed
The team believes Quick Reactions will increase meaningful conversation engagement by making lightweight responses easier. However, they are explicitly worried that users will overuse the feature in its first few days simply because it is new and visually salient, with usage falling back once the novelty wears off.
Constraints
- Total traffic: 240,000 daily active users (DAU)
- 80% of DAU are eligible for the experiment after app-version filtering (192,000 eligible users per day)
- Maximum decision window: 21 days
- Randomization must be user-level because the feature is persistent in the UI
- A false positive is costly: shipping a novelty-only feature adds long-term UI clutter and engineering maintenance
- A false negative is acceptable if the true lift is very small
- The team wants at least 80% power at a 5% significance level
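As a quick feasibility check under these constraints, the sketch below runs a standard two-proportion sample-size calculation. The 12% baseline rate and 0.5 pp absolute MDE are illustrative assumptions only (the brief asks you to choose and justify your own); the 5% significance, 80% power, and traffic figures come from the constraints above.

```python
from math import sqrt
from statistics import NormalDist

# Assumed, illustrative inputs -- not specified by the brief:
BASELINE = 0.12            # assumed baseline rate of a binary primary metric
MDE = 0.005                # assumed absolute MDE (0.5 percentage points)
# From the brief:
ALPHA, POWER = 0.05, 0.80  # 5% significance (two-sided), 80% power

z_alpha = NormalDist().inv_cdf(1 - ALPHA / 2)
z_beta = NormalDist().inv_cdf(POWER)

# Two-proportion sample size per arm (normal approximation):
p_bar = BASELINE + MDE / 2
n_per_arm = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
              + z_beta * sqrt(BASELINE * (1 - BASELINE)
                              + (BASELINE + MDE) * (1 - BASELINE - MDE)))
             / MDE) ** 2

eligible_per_day = 240_000 * 0.80  # 192,000 eligible users/day (from the brief)
# Naive bound: treats each day's eligible users as new enrollees. In a DAU-based
# app the same users return daily, so unique enrollment grows sublinearly --
# but even on day one, 192,000 eligible users comfortably exceed 2 * n_per_arm.
days_needed = 2 * n_per_arm / eligible_per_day

print(f"n per arm ~ {n_per_arm:,.0f}, days to enroll at 100% allocation ~ {days_needed:.2f}")
```

Under these assumptions the enrollment requirement is far below daily eligible traffic, so the binding constraint on duration is the 21-day window needed to observe novelty decay, not statistical power.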
Task
- Define the null and alternative hypotheses, making clear how you will distinguish a novelty effect from a durable treatment effect.
- Choose a primary metric, 2-4 guardrails, and at least one secondary metric. State the baseline and an explicit MDE.
- Calculate the required sample size and show whether the experiment can be completed within 21 days given available traffic.
- Propose the experiment design: unit of randomization, allocation, duration, and any stratification. Explain how your design helps detect novelty effects rather than just average lift.
- Pre-register the analysis plan, including the statistical test, peeking policy, multiple-comparison policy, and how you will make the final ship / don’t-ship decision if week 1 is positive but week 3 is flat or negative.
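One way to make the week-1-vs-week-3 decision rule concrete is a difference-in-lifts test: estimate the treatment effect separately in week 1 and week 3 and test whether the effect has decayed. The counts below are made-up illustrations, not data from the brief, and treating the two weeks as independent samples overstates precision when the same users appear in both.

```python
from math import sqrt
from statistics import NormalDist

def lift_and_se(conv_t, n_t, conv_c, n_c):
    """Absolute lift (treatment - control) for a binary metric and its
    standard error, via the normal approximation."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return p_t - p_c, se

# Illustrative, made-up counts: a large week-1 lift that mostly vanishes by week 3.
lift_w1, se_w1 = lift_and_se(conv_t=9_100, n_t=67_500, conv_c=8_100, n_c=67_500)
lift_w3, se_w3 = lift_and_se(conv_t=8_200, n_t=67_500, conv_c=8_100, n_c=67_500)

# Difference-in-lifts z-test: is the week-1 effect larger than the week-3 effect?
# Independence across weeks is assumed here, which is optimistic for repeat users.
diff = lift_w1 - lift_w3
se_diff = sqrt(se_w1 ** 2 + se_w3 ** 2)
z = diff / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"week-1 lift {lift_w1:.4f}, week-3 lift {lift_w3:.4f}, decay p={p_value:.4f}")
```

A significant decay test with a flat week-3 effect is evidence of a novelty-only feature; a durable feature shows week-3 lift statistically indistinguishable from week 1.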
Be concrete: use real numbers, not placeholders. Your answer should explicitly address novelty effects as a first-class risk, not just mention them in passing.