Bayesian vs Frequentist Checkout Lift

Business Context

CartJet is a high-scale e-commerce marketplace (~18M monthly active users) running an A/B test on a redesigned checkout intended to reduce friction and increase purchase conversion. Because checkout changes are revenue-critical and can create subtle regressions (e.g., payment failures), the experiment is launched at 50/50 traffic for 10 days.

The VP of Product asks for a decision memo that explains results in both frequentist and Bayesian terms, since different stakeholders interpret “significance” differently. You’re asked to quantify the evidence and explicitly highlight what changes between the two paradigms (parameters as fixed vs random, interpretation of intervals, decision criteria).

Given Data

Group	Users (n)	Purchases (x)	Observed conversion
Control (old checkout)	182,450	23,530	12.90%
Treatment (new checkout)	181,980	24,150	13.27%

Additional decision inputs:

Significance level for the frequentist test: $\alpha = 0.05$ (two-sided)
Business threshold: ship only if the conversion lift is > 0.20 percentage points (absolute) with high confidence
Bayesian prior (independent for each group): $p \sim \text{Beta}(\alpha_0=129, \beta_0=871)$, which has a prior mean of 0.129 and a prior strength equivalent to 1,000 pseudo-users
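
As a quick check on the stated prior, the mean and the pseudo-user count follow from the standard Beta identities (the symbols $\mu_0$ and $n_0$ are introduced here for illustration only):

$$\mu_0 = \frac{\alpha_0}{\alpha_0 + \beta_0} = \frac{129}{129 + 871} = 0.129, \qquad n_0 = \alpha_0 + \beta_0 = 1{,}000$$

With roughly 182,000 real users per arm, a prior worth 1,000 pseudo-users should move the posterior only slightly.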

Problem Statement

Using the data above, produce a decision recommendation and a short explanation of how the conclusion differs under frequentist vs Bayesian reasoning.

Requirements

Frequentist analysis
1. State $H_0$ and $H_1$ for a two-proportion z-test.
2. Compute the z-statistic and two-sided p-value.
3. Compute a 95% confidence interval for the absolute lift $p_T - p_C$ (using the unpooled standard error).
4. Decide whether the result is statistically significant and whether it clears the 0.20pp practical threshold.
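
The frequentist steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference solution: it assumes the usual convention of a pooled standard error for the z-statistic and an unpooled standard error for the interval, and the function name `two_proportion_ztest` and its return keys are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x_c, n_c, x_t, n_t, alpha=0.05):
    """Two-sided z-test for H0: p_T = p_C, plus a Wald CI for p_T - p_C."""
    p_c, p_t = x_c / n_c, x_t / n_t
    lift = p_t - p_c

    # Pooled SE: under H0 both arms share one conversion rate.
    p_pool = (x_c + x_t) / (n_c + n_t)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = lift / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled SE: each arm keeps its own variance for the interval.
    se_unpooled = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (lift - z_crit * se_unpooled, lift + z_crit * se_unpooled)

    return {"lift": lift, "z": z, "p_value": p_value, "ci": ci}


result = two_proportion_ztest(x_c=23_530, n_c=182_450, x_t=24_150, n_t=181_980)
print(f"lift = {result['lift']:.5f}, z = {result['z']:.2f}, p = {result['p_value']:.5f}")
print(f"95% CI = ({result['ci'][0]:.5f}, {result['ci'][1]:.5f})")

# Practical check: does the whole interval sit above the 0.20pp threshold?
print("CI lower bound clears 0.002:", result["ci"][0] > 0.002)
```

On the given data this should yield a z-statistic of roughly 3.35 and an interval of roughly 0.16pp to 0.59pp, so the result is statistically significant while the lower bound falls short of the 0.20pp threshold.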
Bayesian analysis
1. Write down the posterior distributions for $p_C$ and $p_T$ under the Beta-Binomial model.
2. Compute the posterior means.
3. Approximate $\Pr(p_T - p_C > 0)$ and $\Pr(p_T - p_C > 0.002)$ (0.20pp) using a normal approximation to the posterior difference (or Monte Carlo in Python).
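
For the first two Bayesian steps, the conjugate Beta-Binomial update can be written out as a worked equation. This assumes the Binomial likelihood and the independent $\text{Beta}(129, 871)$ priors stated in the problem; $x$ and $n$ are the purchases and users from the data table.

$$p \mid x, n \sim \text{Beta}(\alpha_0 + x,\ \beta_0 + n - x), \qquad \mathbb{E}[p \mid x, n] = \frac{\alpha_0 + x}{\alpha_0 + \beta_0 + n}$$

$$p_C \mid \text{data} \sim \text{Beta}(129 + 23{,}530,\ 871 + 158{,}920) = \text{Beta}(23{,}659,\ 159{,}791), \qquad \mathbb{E}[p_C \mid \text{data}] = \frac{23{,}659}{183{,}450} \approx 0.1290$$

$$p_T \mid \text{data} \sim \text{Beta}(129 + 24{,}150,\ 871 + 157{,}830) = \text{Beta}(24{,}279,\ 158{,}701), \qquad \mathbb{E}[p_T \mid \text{data}] = \frac{24{,}279}{182{,}980} \approx 0.1327$$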
Conceptual comparison
- In 4–6 sentences, explain the key differences in interpretation: p-value vs posterior probability, confidence interval vs credible interval, and the role of priors.

Assumptions and Constraints

Users are independently randomized; each user appears once.
Conversion is binary and well-modeled as Binomial within each arm.
Ignore sequential monitoring/peeking for this exercise (but mention it as a caveat in interpretation).
For the Bayesian probability calculations, you may use either (a) Monte Carlo draws from the Beta posteriors or (b) a normal approximation based on posterior means/variances.
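
Both permitted approaches, (a) Monte Carlo and (b) the normal approximation, can be sketched with the Python standard library alone. This is an illustrative sketch under the problem's stated model; the helper names `posterior` and `beta_mean_var`, the seed, and the draw count of 100,000 are assumptions chosen for the example.

```python
import random
from math import sqrt
from statistics import NormalDist

A0, B0 = 129, 871  # Beta prior from the problem statement


def posterior(x, n, a0=A0, b0=B0):
    """Conjugate update: Beta(a0 + successes, b0 + failures)."""
    return a0 + x, b0 + (n - x)


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


a_c, b_c = posterior(x=23_530, n=182_450)  # control
a_t, b_t = posterior(x=24_150, n=181_980)  # treatment

# (b) Normal approximation: the difference of two independent posteriors
# has mean = difference of means and variance = sum of variances.
mean_c, var_c = beta_mean_var(a_c, b_c)
mean_t, var_t = beta_mean_var(a_t, b_t)
diff = NormalDist(mu=mean_t - mean_c, sigma=sqrt(var_c + var_t))
p_pos_normal = 1 - diff.cdf(0.0)
p_thr_normal = 1 - diff.cdf(0.002)

# (a) Monte Carlo: sample both posteriors and count how often the lift
# exceeds each cutoff. The fixed seed only makes the run repeatable.
rng = random.Random(0)
draws = 100_000
lifts = [rng.betavariate(a_t, b_t) - rng.betavariate(a_c, b_c) for _ in range(draws)]
p_pos_mc = sum(lift > 0 for lift in lifts) / draws
p_thr_mc = sum(lift > 0.002 for lift in lifts) / draws

print(f"Pr(lift > 0):     normal = {p_pos_normal:.4f}, MC = {p_pos_mc:.4f}")
print(f"Pr(lift > 0.002): normal = {p_thr_normal:.4f}, MC = {p_thr_mc:.4f}")
```

With posteriors this concentrated, the two methods should agree closely: the probability of a positive lift comes out above 0.999, and the probability of clearing 0.20pp is roughly 0.94. Whether 0.94 counts as "high confidence" is a business judgment that the problem leaves open.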
