Context
The Databricks demand generation team wants to improve conversion from a nurture email campaign that drives prospects to request a product demo. They are considering a new email creative in Marketo with a stronger CTA and more prominent Databricks Lakehouse messaging.
Hypothesis Seed
The team believes the new email will increase demo-request conversion among eligible leads by making the value proposition clearer and reducing friction on the path from email click to demo request. The two error types carry different costs: a false positive is expensive because the new creative would be rolled out to a large global nurture program, while a false negative delays a potentially meaningful pipeline gain by one quarter.
Constraints
- Eligible traffic: 120,000 marketable leads per week across North America and EMEA
- Current email send cadence: one nurture email per lead per week
- Baseline demo-request conversion within 7 days of send: 3.2%
- Maximum experiment duration: 3 weeks, because the QBR launch calendar requires a decision by then
- Allocation can be 50/50 after a small instrumentation ramp
- Minimum detectable effect (MDE): the team wants to detect at least a 10% relative lift in demo-request conversion (3.2% to 3.52%)
- Guardrails: unsubscribe rate and spam complaint rate cannot materially worsen
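As a rough feasibility check on the constraints above, the sketch below estimates the sample size per arm with the standard normal-approximation formula for comparing two proportions. The baseline rate, relative lift, and weekly traffic come from the constraints. The two-sided alpha of 0.05 and power of 0.80 are assumptions made only for illustration, and the function name `sample_size_per_arm` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate leads needed per arm for a two-sided two-proportion z-test.

    Uses the unpooled-variance normal approximation. alpha and power
    defaults are illustrative assumptions, not values from the brief.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Inputs taken from the constraints: 3.2% baseline, 10% relative MDE.
n_per_arm = sample_size_per_arm(0.032, 0.10)

# 120,000 eligible leads per week, split 50/50 after the ramp.
weekly_per_arm = 120_000 // 2
weeks_needed = math.ceil(n_per_arm / weekly_per_arm)

print(n_per_arm, weeks_needed)
```

Under these assumptions the requirement comes out on the order of 50,000 leads per arm, which a single week of 50/50 traffic could cover before the 7-day conversion window is added. Because each lead receives one email per week, later weeks mostly re-send to the same leads rather than adding new ones, so the runtime translation should count unique randomized leads, not sends.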
Deliverables
- Define the null and alternative hypotheses, the primary metric, 2-4 guardrails, and at least one secondary metric. Be explicit about the unit of randomization and unit of analysis.
- Calculate the required sample size per arm using a clearly stated alpha, power, baseline rate, and MDE. Translate that into expected runtime given the available Databricks campaign traffic.
- Propose the experiment design in Databricks terms: allocation, duration, any stratification, and whether you would use CUPED or another variance-reduction approach.
- Pre-register the analysis plan: statistical test, peeking policy, multiple-comparison treatment, SRM checks, and how you will handle any mismatch between randomization and analysis units.
- State a clear ship / do-not-ship / iterate rule that respects guardrails, and list the main pitfalls that could invalidate the result.
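Two of the pre-registered checks named in the deliverables, the SRM check and the primary statistical test, can be sketched with the standard library alone. This is a minimal illustration, assuming a 50/50 intended split, lead-level randomization and analysis, and a pooled two-proportion z-test as the primary test. The function names and the example counts are hypothetical and are not results from any experiment.

```python
import math


def srm_p_value(n_control, n_treatment):
    """Chi-square goodness-of-fit p-value (1 df) against an intended 50/50 split."""
    expected = (n_control + n_treatment) / 2
    chi2 = ((n_control - expected) ** 2 + (n_treatment - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def two_proportion_z_test(conv_control, n_control, conv_treatment, n_treatment):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_treatment - p_control) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts for illustration only.
srm_p = srm_p_value(60_100, 59_900)
z, p = two_proportion_z_test(1_920, 60_000, 2_112, 60_000)

# A common convention is to flag SRM at a strict threshold such as p < 0.001
# and to read the primary test only when the SRM check passes.
print(f"SRM p-value: {srm_p:.3f}, z: {z:.2f}, p: {p:.4f}")
```

If leads are randomized once but contribute several weekly sends to the analysis, the z-test above would understate the variance. Analyzing one row per lead, or using cluster-robust standard errors, would be the usual ways to keep the analysis unit aligned with the randomization unit.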