Correlation vs Causation in Notifications

Easy
Statistics & Probability
Asked at 2 companies
Topics: Hypothesis Testing, Correlation, Causal Inference
Also asked at: Splice, Intuit

Problem

Business Context

StreamHub, a video streaming app, noticed that users who enable push notifications appear to watch more content. The product manager wants to know whether notifications cause higher engagement or whether the relationship is driven by more engaged users being more likely to opt in.

Problem Statement

Use the observational data below to distinguish correlation from causation. Quantify the naive relationship between notification opt-in and watch time, then assess whether the relationship remains after controlling for prior engagement.

Given Data

Segment               | Notification Opt-In | Users | Avg Prior-Week Watch Hours | Avg Current-Week Watch Hours
High prior engagement | Yes                 | 800   | 12.0                       | 13.2
High prior engagement | No                  | 200   | 12.0                       | 12.8
Low prior engagement  | Yes                 | 200   | 2.0                        | 2.4
Low prior engagement  | No                  | 800   | 2.0                        | 2.2

Assume the within-segment standard deviation of current-week watch hours is 4.0 hours for all four groups.

Requirements

  1. Compute the overall average current-week watch hours for notification opt-in users and non-opt-in users.
  2. Calculate the naive difference in means and explain why it is correlation, not necessarily causation.
  3. Compute the within-segment treatment effect for high- and low-engagement users.
  4. Estimate the adjusted causal effect by weighting the segment-level effects.
  5. Run a hypothesis test for the adjusted effect using the provided standard deviation and determine whether it is statistically significant at α = 0.05.
  6. Explain what additional evidence would be needed to make a stronger causal claim.
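
Requirements 1 through 5 can be sketched in a few lines of Python. This is one possible reading, not a reference solution: it assumes each segment is weighted by its share of all users (the problem does not specify a weighting) and uses a two-sided z-test with the given standard deviation treated as known. The names `GROUPS`, `pooled_mean`, and `analyze` are illustrative.

```python
from math import erf, sqrt

# Rows of the Given Data table: (segment, opted_in, users, current-week mean hours).
GROUPS = [
    ("high", True, 800, 13.2),
    ("high", False, 200, 12.8),
    ("low", True, 200, 2.4),
    ("low", False, 800, 2.2),
]
SD = 4.0      # within-group SD of current-week watch hours (given)
ALPHA = 0.05  # significance level (given)


def pooled_mean(rows):
    """User-weighted mean of current-week hours over a set of table rows."""
    total_users = sum(n for _, _, n, _ in rows)
    return sum(n * mean for _, _, n, mean in rows) / total_users


def analyze(groups, sd, alpha):
    # Requirements 1-2: naive comparison that ignores prior engagement.
    mean_yes = pooled_mean([g for g in groups if g[1]])
    mean_no = pooled_mean([g for g in groups if not g[1]])

    # Requirements 3-4: within-segment effects, weighted by each segment's
    # share of all users (an assumed weighting, not given in the problem).
    total_users = sum(n for _, _, n, _ in groups)
    effects, adjusted, variance = {}, 0.0, 0.0
    for segment in sorted({g[0] for g in groups}):
        n_yes, m_yes = next((n, m) for s, opt, n, m in groups if s == segment and opt)
        n_no, m_no = next((n, m) for s, opt, n, m in groups if s == segment and not opt)
        weight = (n_yes + n_no) / total_users
        effects[segment] = m_yes - m_no
        adjusted += weight * effects[segment]
        # Variance of a difference of two independent group means.
        variance += weight**2 * sd**2 * (1 / n_yes + 1 / n_no)

    # Requirement 5: two-sided z-test of H0: adjusted effect = 0.
    se = sqrt(variance)
    z = adjusted / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return {
        "mean_yes": mean_yes,
        "mean_no": mean_no,
        "naive_diff": mean_yes - mean_no,
        "segment_effects": effects,
        "adjusted_effect": adjusted,
        "se": se,
        "z": z,
        "p_value": p_value,
        "significant": p_value < alpha,
    }


result = analyze(GROUPS, SD, ALPHA)
```

With the table values, the naive gap comes out at roughly 6.72 hours, while the adjusted effect is about 0.3 hours with z near 1.34 and a two-sided p-value near 0.18, so under these assumptions the adjusted effect is not significant at α = 0.05.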

Assumptions

  • Users were not randomly assigned to opt in; notification choice is self-selected.
  • Prior-week watch hours is a confounder affecting both opt-in behavior and future watch time.
  • Segment-level means are representative, and user outcomes are independent within groups.
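
Under the independence assumption, the standard error of a weighted, stratified estimate can be written down directly. The notation below is introduced here for illustration and is not part of the original problem: $s$ indexes the engagement segment, $n_{s,1}$ and $n_{s,0}$ are the opt-in and non-opt-in user counts, $\bar{y}_{s,1}$ and $\bar{y}_{s,0}$ are the corresponding current-week means, $N$ is the total number of users, and $\sigma$ is the common within-group standard deviation. Weighting by segment share $w_s$ is one reasonable choice; weighting by opt-in users only would instead target the effect on those who opted in.

```latex
\hat{\tau}_{\text{adj}} = \sum_{s} w_s \left(\bar{y}_{s,1} - \bar{y}_{s,0}\right),
\qquad w_s = \frac{n_{s,1} + n_{s,0}}{N}

\operatorname{SE}\!\left(\hat{\tau}_{\text{adj}}\right)
  = \sqrt{\sum_{s} w_s^{2}\, \sigma^{2} \left(\frac{1}{n_{s,1}} + \frac{1}{n_{s,0}}\right)},
\qquad z = \frac{\hat{\tau}_{\text{adj}}}{\operatorname{SE}\!\left(\hat{\tau}_{\text{adj}}\right)}
```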

Sample Data

Example 1
Input: {"low_no_users":800,"high_no_users":200,"low_yes_users":200,"high_yes_users":800,"within_group_sd":4,"significance_level":0.05,"low_prior_watch_hours":2,"high_prior_watch_hours":12,"low_no_current_watch_hours":2.2,"high_no_current_watch_hours":12.8,"low_yes_current_watch_hours":2.4,"high_yes_current_watch_hours":13.2}
Output: (none)
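
The flat input keys follow a `<segment>_<opt-in>_<field>` pattern, which maps back onto the four rows of the Given Data table. A minimal sketch of that mapping, assuming the JSON arrives as a string; the helper name `to_groups` and the output field names are hypothetical:

```python
import json

raw = (
    '{"low_no_users":800,"high_no_users":200,"low_yes_users":200,'
    '"high_yes_users":800,"within_group_sd":4,"significance_level":0.05,'
    '"low_prior_watch_hours":2,"high_prior_watch_hours":12,'
    '"low_no_current_watch_hours":2.2,"high_no_current_watch_hours":12.8,'
    '"low_yes_current_watch_hours":2.4,"high_yes_current_watch_hours":13.2}'
)
params = json.loads(raw)


def to_groups(params):
    """Rebuild the four table rows from the flat input keys."""
    groups = []
    for segment in ("high", "low"):
        for opt_in in ("yes", "no"):
            key = f"{segment}_{opt_in}"
            groups.append({
                "segment": segment,
                "opted_in": opt_in == "yes",
                "users": params[f"{key}_users"],
                # Prior-week hours are given per segment, not per opt-in group.
                "prior_hours": params[f"{segment}_prior_watch_hours"],
                "current_hours": params[f"{key}_current_watch_hours"],
            })
    return groups


groups = to_groups(params)
```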
