Regional Lift in Aircall Test

Medium
A/B Testing & Experimentation
Asked at 1 company
A/B Testing, Experimentation, Causal Inference
Problem

Context

Aircall is testing a new onboarding prompt inside the Aircall Workspace that encourages newly invited teammates to connect their number, install the desktop app, and place their first call. After an initial readout, the PM says the treatment appears positive in France but flat in North America, and asks whether to ship globally, ship regionally, or keep testing.

Hypothesis Seed

The team believes a localized onboarding prompt with clearer setup steps will increase 7-day activation for newly invited seats by reducing setup friction. However, regional differences in language, call workflows, and sales-assist motion may cause heterogeneous treatment effects.

Constraints

  • Eligible traffic: 18,000 newly invited seats per week across paid Aircall workspaces
  • Region mix: 40% France, 45% North America, 15% Rest of Europe
  • Baseline 7-day activation rate: 32% in France, 28% in North America
  • Maximum experiment runtime: 4 weeks before the onboarding team must decide on rollout
  • False positive cost: medium-high, because shipping a weak prompt globally adds UX clutter and engineering maintenance
  • False negative cost: medium, because delaying a real activation improvement slows seat adoption and downstream calling volume
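
The traffic and baseline figures above determine how many seats each region can contribute within 4 weeks, and that can be compared against a standard two-proportion sample-size estimate. The sketch below is illustrative only. The 50/50 allocation, two-sided alpha of 0.05, 80% power, and the 2-percentage-point absolute MDE are assumptions, not values given in the problem. No baseline is stated for Rest of Europe, so it is omitted from the power check.

```python
from math import ceil
from statistics import NormalDist

# Figures taken from the Constraints list.
WEEKLY_SEATS = 18_000
WEEKS = 4
REGION_MIX_PCT = {"France": 40, "North America": 45, "Rest of Europe": 15}
BASELINE = {"France": 0.32, "North America": 0.28}


def seats_per_arm(region, arms=2):
    """Seats available per arm in a region over the full runtime (equal split assumed)."""
    total = WEEKLY_SEATS * WEEKS * REGION_MIX_PCT[region] // 100
    return total // arms


def n_per_arm(p_control, abs_mde, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-sided two-proportion test."""
    p_treat = p_control + abs_mde
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil(z ** 2 * variance / abs_mde ** 2)


ABS_MDE = 0.02  # assumed absolute MDE (2 percentage points), not from the problem

for region, p in BASELINE.items():
    print(region, "available per arm:", seats_per_arm(region),
          "required per arm:", n_per_arm(p, ABS_MDE))
```

Because the metric is 7-day activation, seats invited in the final week need a further 7 days before their outcome is observable, so the usable sample inside a hard 4-week deadline may be smaller than the totals computed here.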

Deliverables

  1. Define the primary metric, 2-4 guardrails, and an explicit minimum detectable effect (MDE) for the overall test, and explain whether region-level effects are confirmatory or exploratory.
  2. Calculate the sample size needed and assess whether the test is powered for the global effect, each region separately, or only the largest regions within the 4-week limit.
  3. Choose the unit of randomization, allocation, duration, and any stratification/blocking. Explain how you would handle regional heterogeneity in the design.
  4. Write a pre-registered analysis plan covering the main test, treatment-by-region interaction, multiple-comparison policy, and peeking policy.
  5. Explain how you would interpret a result where the experiment shows a lift in France but not in North America, including what additional checks you would run before recommending global ship, regional ship, iterate, or no-ship.
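
One way to approach the treatment-by-region interaction in deliverables 4 and 5 is to test whether the lift in France differs from the lift in North America, rather than comparing two separate significance verdicts. The sketch below is a minimal example using a z-test on the difference of absolute lifts. The counts are hypothetical placeholders chosen for illustration, not results from the experiment, and the function names are invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def lift_and_var(conv_c, n_c, conv_t, n_t):
    """Absolute lift (treatment minus control) and its estimated variance."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    var = p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t
    return p_t - p_c, var


def interaction_z_test(region_a, region_b):
    """Two-sided z-test for H0: lift in region A equals lift in region B."""
    lift_a, var_a = lift_and_var(*region_a)
    lift_b, var_b = lift_and_var(*region_b)
    diff = lift_a - lift_b
    z = diff / sqrt(var_a + var_b)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# HYPOTHETICAL counts: (control activations, control n, treatment activations, treatment n)
france = (4608, 14400, 4968, 14400)
north_america = (4536, 16200, 4568, 16200)

diff, z, p_value = interaction_z_test(france, north_america)
print(f"lift difference={diff:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A significant lift in one region alongside a non-significant lift in the other does not by itself show that the effects differ. Only the interaction test addresses that question, and if several region pairs are compared, the pre-registered multiple-comparison policy should be applied to these p-values.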
