Project Background
DoorDash is piloting an AI-assisted support tool for merchant support agents to reduce average handle time and improve case resolution quality. The pilot is already built and ready to run with 40 agents across two regional support hubs, but leadership wants a clear definition of success before approving a broader rollout to 1,200 agents.
You are the program manager responsible for pilot execution and scale-readiness planning. The pilot must run for 6 weeks and produce a recommendation for whether to scale, extend, or stop the program before the next quarterly planning cycle.
Key Stakeholders
The VP of Support wants measurable productivity gains this quarter. The Head of Quality is concerned that faster handling could reduce resolution accuracy. The Engineering Manager wants to avoid scaling a tool that still creates workflow instability. Finance will only approve broader rollout if the pilot shows a credible path to at least $1.5M in annualized savings.
Constraints
- Pilot duration is fixed at 6 weeks
- Budget remaining is $180,000
- Team is 8 people: 3 engineers, 1 data analyst, 1 PM, 1 operations lead, 2 QA specialists
- No additional headcount can be added before scale decision
- Tool can only be enabled for English-language merchant support tickets during the pilot
- Legal requires human review on all refund-related recommendations
Complications
- Two weeks into the pilot, agents report that the tool is helpful for simple tickets but unreliable for policy-edge cases.
- The VP of Support is pushing to declare success early if handle time improves by 10%, even if quality metrics are still inconclusive.
- Engineering has identified a latency issue that affects 8% of pilot sessions during peak hours.
Your Task
- Define the pilot success criteria, including primary metrics, guardrails, and scale gates.
- Create an execution plan for running the pilot and making a scale/no-scale decision.
- Explain how you would handle conflicting stakeholder definitions of success.
- Propose a risk mitigation and rollback approach if pilot results are mixed.
- Recommend whether broad rollout should be full, phased, or delayed based on your framework.