Business Context
DashDrop is a last‑mile delivery marketplace (food + convenience) with ~6M monthly active users across the US. A growth PM claims that “faster deliveries cause higher order value,” and wants to prioritize a costly initiative to reduce delivery time by 3–5 minutes.
You’re asked to do an initial quantitative check using a random sample of n = 2,400 completed orders from the last 30 days (one order per user to reduce repeat-user dependence). For each order you have:
- x = delivery_time_min: minutes from checkout to drop-off
- y = order_value_usd: pre-tip basket value in USD
The analyst who pulled the sample computed the following summary statistics:
- Mean delivery time: x̄ = 34.8 minutes
- SD delivery time: sₓ = 12.1 minutes
- Mean order value: ȳ = $27.40
- SD order value: sᵧ = $18.60
- Sample covariance: sₓᵧ = −38.2 (minute·USD)
The PM wants a single number (“the correlation”) and a recommendation on whether this supports prioritizing the initiative.
Given Data
| Quantity | Value |
|---|---|
| Sample size (n) | 2,400 |
| x̄ (mean delivery time, min) | 34.8 |
| sₓ (SD delivery time, min) | 12.1 |
| ȳ (mean order value, USD) | 27.40 |
| sᵧ (SD order value, USD) | 18.6 |
| sₓᵧ (sample covariance, min·USD) | −38.2 |
| Significance level (α) | 0.05 |
Problem Statement
Compute the Pearson correlation between delivery time and order value, quantify its uncertainty, and test whether the true correlation ρ differs from zero.
Requirements
- Calculate the sample Pearson correlation r from the provided covariance and standard deviations.
- Construct a 95% confidence interval for the true correlation ρ using the Fisher z-transform.
- Perform a hypothesis test for H₀: ρ = 0 vs H₁: ρ ≠ 0 at α = 0.05 (report a p-value).
- Translate the correlation into the equivalent simple linear regression slope (USD per minute), using the identity b = r · sᵧ / sₓ.
- Provide a business interpretation: does this analysis justify prioritizing the “reduce delivery time” initiative?
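The first four requirements can be sketched in a few lines of Python using only the summary statistics given above. This is a minimal sketch rather than a reference solution: the variable names are illustrative, and the p-value uses a standard-normal approximation to the t distribution (an assumption that leans on the large sample size; an exact t-based p-value would need a statistics library).

```python
import math
from statistics import NormalDist

# Summary statistics copied from the Given Data table.
n = 2400
s_x = 12.1     # SD of delivery time (min)
s_y = 18.6     # SD of order value (USD)
s_xy = -38.2   # sample covariance (min·USD)
alpha = 0.05

# 1) Pearson correlation: r = s_xy / (s_x * s_y)
r = s_xy / (s_x * s_y)

# 2) Fisher z-transform CI: z = atanh(r), SE(z) = 1 / sqrt(n - 3),
#    then map the interval endpoints back to the r scale with tanh.
z = math.atanh(r)
se_z = 1.0 / math.sqrt(n - 3)
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
ci_low = math.tanh(z - z_crit * se_z)
ci_high = math.tanh(z + z_crit * se_z)

# 3) Test of H0: rho = 0 with t = r * sqrt(n - 2) / sqrt(1 - r^2).
t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
# ASSUMPTION: with n - 2 = 2398 degrees of freedom, the t distribution is
# approximated by the standard normal. erfc keeps precision in the far tail.
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

# 4) Simple-regression slope of y on x: b = r * s_y / s_x (= s_xy / s_x^2)
slope = r * s_y / s_x
r_squared = r * r  # share of order-value variance linearly associated with x

print(f"r = {r:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
print(f"t = {t_stat:.2f}, two-sided p (normal approx.) = {p_value:.2e}")
print(f"slope = {slope:.4f} USD per minute, r^2 = {r_squared:.4f}")
```

With these inputs the sketch gives r ≈ −0.17 with a 95% interval of roughly (−0.21, −0.13) and a slope of about −$0.26 per minute, so the association is clearly nonzero but weak (r² ≈ 0.03). Read naively, a 3–5 minute reduction would correspond to about $0.78–$1.30 per order, but that reading only holds if the relationship is causal, which this observational sample cannot establish.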
Assumptions and Constraints
- Treat the 2,400 orders as i.i.d. (one order per user helps, but city/restaurant clustering may remain).
- Pearson correlation captures linear association; non-linear effects (e.g., very late deliveries) may exist.
- Correlation is not causation; confounding is plausible (distance, basket size, restaurant prep time, promotions).
- You may assume n is large enough for asymptotic approximations (Fisher transform, t-test) to be reasonable.