RideWave is a large ride-hailing marketplace (~8M monthly active riders) that recently shipped a redesigned “Schedule a Ride” flow. The team ran a 14-day A/B test and also collected post-ride NPS surveys with an open-text prompt: “What’s the main reason for your score?”.
Leaders are debating whether the redesign improved the experience, and whether the evidence for that lies in the qualitative feedback (themes in the open text such as “confusing UI” or “pricing transparency”) or in the quantitative outcomes (conversion rate and NPS score). Your job is to connect the two: show that you understand the difference between qualitative and quantitative data, and demonstrate how to analyze both rigorously within a single experiment.
A text-analytics pipeline (human-labeled + model-assisted) assigns each comment a binary indicator for whether it mentions the theme “confusing UI”. You are told the classifier has precision = 0.90 (90% of flagged comments truly mention the theme) and recall = 0.80 (it catches 80% of the comments that do), and that performance is stable across variants.
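Precision and recall combine into a back-of-the-envelope correction for the flagged counts: flagged × precision estimates the true positives, and dividing by recall scales up for the mentions the classifier missed. A minimal sketch (the function name is illustrative, not part of any pipeline):

```python
def corrected_theme_count(flagged: int, precision: float, recall: float) -> float:
    """Estimate the true number of comments mentioning a theme.

    flagged * precision -> expected true positives among the flagged comments
    dividing by recall  -> scales up for the true mentions the classifier missed
    """
    return flagged * precision / recall

# With precision = 0.90 and recall = 0.80 the net correction factor is
# 0.90 / 0.80 = 1.125, so 2,190 flagged comments imply ~2,464 actual mentions:
print(round(corrected_theme_count(2190, 0.90, 0.80), 2))  # 2463.75
```

Because the factor is multiplicative and (by assumption) the same in both variants, it rescales the theme counts but cannot flip the direction of a between-variant difference.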
| Metric | Control (A) | Treatment (B) |
|---|---|---|
| Eligible sessions (n) | 120,480 | 119,920 |
| Completed scheduled rides (conversions) | 14,458 | 15,110 |
| NPS responses (m) | 18,240 | 18,010 |
| Mean NPS score | 34.6 | 36.1 |
| Std dev of NPS score | 28.0 | 27.5 |
| Comments flagged “confusing UI” (observed) | 2,190 | 1,845 |
Additional info: each session belongs to a unique user (no repeated measures), randomization is at the user level, and you can treat NPS scores as approximately continuous for inference.
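Since randomization is at the user level and each session is a unique user, the session-level conversions can be compared with a standard two-proportion z-test. A stdlib-only sketch using the table's numbers (`scipy` or `statsmodels` would give the same answer; the helper name is mine):

```python
from math import erf, sqrt

def two_prop_ztest(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p, normal approx)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # pooled conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2)) # standard error under H0
    z = (p2 - p1) / se
    p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_two_sided

# Conversions: 14,458/120,480 (A) vs 15,110/119,920 (B)
z, p = two_prop_ztest(14458, 120480, 15110, 119920)
print(f"z = {z:.2f}, p = {p:.2g}")  # z ≈ 4.5: the ~0.6pp lift is unlikely to be noise
```

At these sample sizes the normal approximation is entirely safe; the practical question is whether a ~0.6 percentage-point absolute lift clears the team's minimum rollout threshold, not whether it is statistically detectable.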
You need to decide whether the redesign should roll out globally. Do this by (a) clearly distinguishing qualitative vs quantitative data in this setting, and (b) running appropriate statistical tests on the quantitative summaries while accounting for measurement error in the qualitative theme label.
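Part (b) can be sketched as follows, under stated assumptions: a Welch-style test on the NPS means built from the reported summary statistics (with m ≈ 18k per arm, the t distribution is effectively normal), plus theme rates rescaled by precision/recall. The 0.90/0.80 = 1.125 factor is identical across arms, so it cannot flip the sign of the difference, though this simple scaling understates the extra uncertainty the classifier introduces. All helper names are illustrative:

```python
from math import erf, sqrt

def welch_z(mean1: float, sd1: float, m1: int,
            mean2: float, sd2: float, m2: int) -> tuple[float, float]:
    """Difference in means over the unpooled SE; z ≈ t at these sample sizes."""
    se = sqrt(sd1**2 / m1 + sd2**2 / m2)
    z = (mean2 - mean1) / se
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p

# NPS: A mean 34.6 (sd 28.0, m=18,240) vs B mean 36.1 (sd 27.5, m=18,010)
z_nps, p_nps = welch_z(34.6, 28.0, 18240, 36.1, 27.5, 18010)
print(f"NPS: z = {z_nps:.2f}, p = {p_nps:.2g}")

# "Confusing UI" rates after correcting counts by precision / recall:
rate_a = 2190 * 0.90 / 0.80 / 18240
rate_b = 1845 * 0.90 / 0.80 / 18010
print(f"corrected theme rate: A = {rate_a:.3f}, B = {rate_b:.3f}")
```

Read together, the quantitative outcomes (conversion, NPS) and the corrected qualitative signal (fewer “confusing UI” mentions in B) should point in the same direction before recommending a global rollout; if they diverge, that divergence itself is the finding to investigate.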