Build a Media Mix Model

Business Context

You’re a senior data scientist at StreamWave, a subscription video streaming service with 18M monthly active users across the US and Canada. Marketing spend is ~$45M/month across TV, Paid Search, Paid Social, YouTube, Display, and Affiliate. The CFO is pushing to reallocate budget next quarter and wants a defensible estimate of incremental revenue and marginal ROI by channel.

Historically, the team has relied on last-click attribution, but it over-credits Paid Search and under-credits TV/YouTube. You are asked to build a Media Mix Model (MMM) from scratch using observational time-series data.

Problem Statement

Design an MMM approach that can (a) estimate the incremental contribution of each channel while accounting for adstock/carryover, diminishing returns, and seasonality, and (b) produce uncertainty estimates that are credible enough for finance.

To make the discussion concrete, assume you have 104 weeks of weekly data. You will:

Specify the minimum data you need (what tables/fields, grain, and key joins).
Propose a baseline MMM specification (equation + transformations).
Using the provided simplified regression output, test whether TV has statistically significant incremental impact at $\alpha=0.05$.
Compute a 95% confidence interval for TV’s incremental revenue per $1K spend (holding other variables fixed).
Translate the result into a budget recommendation and list the top modeling risks.

Given Data (Simplified)

You fit an OLS MMM on weekly revenue (in $M) with controls and transformed media:

$Revenue_t = \beta_0 + \beta_{TV} \cdot TV^{adstock}_t + \beta_{Search} \cdot Search^{log}_t + \beta_{Social} \cdot Social^{log}_t + \gamma \cdot PriceIndex_t + \delta \cdot Promo_t + \text{Seasonality}_t + \epsilon_t$

Where:

$TV^{adstock}_t$ is weekly TV spend in $K after adstock (carryover).
$Search^{log}_t, Social^{log}_t$ are $\log(1+spend)$ transforms (diminishing returns).
Seasonality is captured via week-of-year Fourier terms.
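The three transforms named above can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual pipeline: the geometric form of the adstock, the decay value of 0.5, the 52-week period, and the sample spend series are all assumptions made for the example, since the problem does not state them.

```python
import math

def geometric_adstock(spend, decay):
    """Carryover: adstock_t = spend_t + decay * adstock_{t-1} (assumed geometric form)."""
    out, carry = [], 0.0
    for x in spend:
        carry = x + decay * carry
        out.append(carry)
    return out

def log_saturation(spend):
    """Diminishing returns via log(1 + spend), as in the Search/Social terms."""
    return [math.log1p(x) for x in spend]

def fourier_terms(week_of_year, order, period=52.0):
    """sin/cos pairs for one week-of-year index (period of 52 weeks is an assumption)."""
    feats = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * week_of_year / period
        feats += [math.sin(angle), math.cos(angle)]
    return feats

# Hypothetical weekly TV spend in $K: one burst, two dark weeks, a smaller burst.
tv_spend_k = [100.0, 0.0, 0.0, 50.0]
tv_adstock = geometric_adstock(tv_spend_k, decay=0.5)
print(tv_adstock)  # [100.0, 50.0, 25.0, 62.5]
```

With a decay of 0.5, half of each week's effect carries into the next week, so the first burst is still contributing 25 when the second one lands. In practice the decay would be estimated or chosen by grid search rather than fixed.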

Regression output for TV term only (others omitted here):

Term	Estimate	Std. Error	Notes
$\beta_{TV}$	0.00162	0.00071	Revenue in \$M per \$K of adstocked TV spend

Assume:

$n = 104$ weeks
Total parameters in the model $p = 18$ (including intercept + controls + seasonality)
Residuals are approximately normal; you will use a t-test with $df = n - p$
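Under these assumptions the test and interval reduce to a few lines of arithmetic. The sketch below uses only the Python standard library, so the Student's t CDF and quantile are computed numerically (Simpson's rule plus bisection); the helper names `t_pdf`, `t_cdf`, and `t_ppf` are illustrative, and a statistics library would normally supply these functions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_ppf(q, df):
    """Upper quantile (q > 0.5) found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

beta_tv, se_tv = 0.00162, 0.00071   # from the regression output
n, p = 104, 18
df = n - p                          # 86 residual degrees of freedom

t_stat = beta_tv / se_tv
p_value = 2 * (1 - t_cdf(abs(t_stat), df))   # two-sided test of beta_TV = 0
t_crit = t_ppf(0.975, df)
ci_low = beta_tv - t_crit * se_tv
ci_high = beta_tv + t_crit * se_tv

# Revenue is in $M and spend in $K, so multiply by 1e6 for dollars per $1K of spend.
usd_per_1k = [v * 1e6 for v in (ci_low, beta_tv, ci_high)]
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, t_crit = {t_crit:.3f}")
print("USD per $1K adstocked TV spend (low, point, high):",
      [round(v) for v in usd_per_1k])
```

The t-statistic is about 2.28 on 86 degrees of freedom, which gives a two-sided p-value between 0.02 and 0.03, so $H_0$ is rejected at $\alpha=0.05$. The interval excludes zero but is wide relative to the point estimate of \$1,620 per \$1K, which matters for how firmly a budget shift can be recommended.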

Requirements

Data requirements: list the datasets/fields you need (media, outcomes, controls, and metadata), including how you would handle:
- channel definitions (impressions vs spend)
- geo granularity (national vs DMA) and why it matters
- offline conversions and delayed revenue recognition
Model design: specify adstock and saturation choices and justify them.
Hypothesis test: $H_0: \beta_{TV}=0$ vs $H_1: \beta_{TV} \neq 0$. Compute the t-statistic and p-value.
95% CI: compute a 95% confidence interval for $\beta_{TV}$ and interpret it as incremental revenue per $1K of TV spend.
Business interpretation: what would you tell the CFO, and what are the top caveats (confounding, multicollinearity, autocorrelation, measurement error, and policy changes like pricing/promos)?

Assumptions and Constraints

Weekly aggregation; spend is measured accurately, but TV GRPs are not available (spend only).
No randomized geo experiments exist for this period.
Potential autocorrelation in $\epsilon_t$; you may comment on Newey–West/HAC as a robustness check.
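To make the HAC idea concrete, here is a minimal sketch of a Newey–West standard error for the slope of a one-regressor OLS fit, using Bartlett kernel weights. It is a simplified stand-in for the 18-parameter model: the function name `ols_slope_with_hac`, the simulated AR(1) errors, and the lag choice of 4 are all assumptions for illustration, and no small-sample correction is applied. In practice one would more likely request HAC errors from a regression library (for example, statsmodels' `fit(cov_type="HAC", cov_kwds={"maxlags": ...})`).

```python
import math
import random

def ols_slope_with_hac(x, y, max_lag):
    """Slope of y = a + b*x and its Newey-West (Bartlett kernel) standard error."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    xd = [xi - x_bar for xi in x]                  # demeaned regressor
    sxx = sum(v * v for v in xd)
    slope = sum(a * (yi - y_bar) for a, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    score = [a * e for a, e in zip(xd, resid)]     # per-week moment contributions
    long_run = sum(s * s for s in score)           # lag-0 term (White/HC0)
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)       # Bartlett weight, shrinks with lag
        gamma = sum(score[t] * score[t - lag] for t in range(lag, n))
        long_run += 2.0 * weight * gamma
    return slope, math.sqrt(max(long_run, 0.0)) / sxx

# Simulated data (hypothetical): 104 weeks, true slope 0.5, AR(1) errors.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(104)]
noise, e = [], 0.0
for _ in range(104):
    e = 0.6 * e + rng.gauss(0.0, 0.1)
    noise.append(e)
y = [2.0 + 0.5 * xi + ei for xi, ei in zip(x, noise)]

slope, se_hac = ols_slope_with_hac(x, y, max_lag=4)
_, se_white = ols_slope_with_hac(x, y, max_lag=0)
print(f"slope = {slope:.4f}, HAC SE = {se_hac:.5f}, lag-0 SE = {se_white:.5f}")
```

Setting `max_lag=0` recovers the heteroskedasticity-only (White) standard error, so comparing the two shows how much of the uncertainty comes from serial correlation. If the HAC standard error for $\beta_{TV}$ is materially larger than the OLS one, the significance result above should be treated with more caution.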
