RidePulse is a large ride-hailing marketplace (~8M weekly active riders across the US). The pricing team is evaluating a new “smart surge” algorithm and wants a quick model of price elasticity: how much ride requests drop when the effective price increases. A data scientist fits an OLS linear regression on a random sample of sessions to estimate the relationship between demand and price.
The model is used to make a high-stakes decision: whether to roll out smart surge nationwide (expected to move weekly revenue by 1–3%). Leadership asks you to sanity-check whether the linear regression assumptions are plausible and whether the inference (p-values / confidence intervals) can be trusted.
You fit the following OLS model at the city-hour level (city $i$, hour $t$):

$$
\log(\text{rides}_{it}) = \beta_0 + \beta_1 \log(\text{price}_{it}) + \beta_2\,\text{rain}_{it} + \beta_3\,\text{event}_{it} + \epsilon_{it},
$$

where rain and event are binary indicators.
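As a concrete sketch, a model of this form can be fit with `statsmodels` on simulated data. The variable names and the data-generating process below are illustrative assumptions, not RidePulse's actual schema or numbers:

```python
# Hypothetical sketch: fit the log-log demand model on simulated city-hour data.
# Variable names and the data-generating process are illustrative assumptions,
# not RidePulse's actual schema or numbers.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2400  # matches the n in the table below, purely for realism

df = pd.DataFrame({
    "log_price": rng.normal(0.0, 0.3, n),
    "rain": rng.integers(0, 2, n),
    "event": rng.integers(0, 2, n),
})
# Assumed true elasticity of -1.2, chosen only for illustration
df["log_rides"] = (
    5.0
    - 1.2 * df["log_price"]
    + 0.1 * df["rain"]
    + 0.3 * df["event"]
    + rng.normal(0.0, 0.5, n)
)

model = smf.ols("log_rides ~ log_price + rain + event", data=df).fit()
print(model.summary().tables[1])  # coefficient table incl. the elasticity estimate
```

In a log-log specification, the coefficient on `log_price` is read directly as the price elasticity of demand.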
You’re given the regression output and a few diagnostic summaries:
| Item | Value |
|---|---|
| Observations (n) | 2,400 |
| Parameters incl. intercept (p) | 4 |
| OLS estimate $\hat\beta_1$ (log-price coefficient) | -1.18 |
| Conventional (non-robust) SE for $\hat\beta_1$ | 0.21 |
| Heteroskedasticity-robust (HC1) SE for $\hat\beta_1$ | 0.34 |
| Correlation between $\lvert\hat\epsilon\rvert$ and fitted values | (value not shown) |
| Durbin–Watson statistic (residuals ordered by time within city) | 1.05 |
| Mean of residuals | ~0.000 |
| Mean residual by city (min, median, max) | (-0.22, 0.01, 0.19) |
Assume the coefficient estimate is unchanged; only the standard error differs depending on assumptions.
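Plugging the table's numbers in directly shows how the choice of standard error moves the t-statistic and 95% confidence interval for the elasticity:

```python
# Using only the numbers given in the table: t-statistic and 95% CI for the
# elasticity under the conventional vs. the HC1-robust standard error.
beta_hat = -1.18
for label, se in [("conventional", 0.21), ("HC1 robust", 0.34)]:
    t = beta_hat / se
    lo, hi = beta_hat - 1.96 * se, beta_hat + 1.96 * se
    print(f"{label:>12}: t = {t:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The elasticity remains statistically significant either way, but the robust interval is about 60% wider (0.34/0.21 ≈ 1.62), which matters when the rollout decision hinges on how far the true elasticity could plausibly be from -1.18.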
Answer the following, focusing on assumptions of linear regression and how violations affect business conclusions.