Project Context
Databricks is preparing a managed data governance launch for a large enterprise customer by enabling a new Unity Catalog capability across their production workspaces. The project involves 11 people across platform engineering, product, field engineering, security, and customer success, and it is now 3 weeks behind a committed launch date. The delay matters because the account is tied to a seven-figure expansion renewal this quarter, and the feature is also a reference deployment for similar regulated customers.
Key Stakeholders
The Engineering Director wants a credible recovery plan without creating unsustainable on-call load. The Product Manager wants to preserve the original scope to support roadmap commitments. The Security team will not approve launch without audit logging and access control validation. The account team is pushing for the original customer date because procurement closes in 6 weeks.
Constraints
- Launch deadline: 6 weeks from today
- Current slip: 3 weeks behind the original internal plan
- Team capacity: 6 engineers, 1 EM, 1 PM, 1 security engineer, 1 solutions architect, 1 QA lead
- Budget: $120,000 remaining for contractor QA and customer enablement
- Dependencies: customer IAM integration, audit log export validation, and workspace migration tooling
- Reliability requirement: no Sev-1 incidents during the first 30 days after launch
Complications
- The customer’s IAM team is only available twice per week, slowing integration testing.
- Two senior engineers are spending 30% of their time on production support for another Databricks control-plane initiative.
- Early testing shows workspace migration jobs are taking 40% longer than estimated.
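The constraints and complications above imply some simple capacity arithmetic, sketched below as a rough planning aid. The figures come from the lists above; the function names are illustrative, the 10-hour migration baseline is a hypothetical placeholder, and the sketch assumes the 30% support load applies to each of the two senior engineers.

```python
# Rough capacity arithmetic from the Constraints and Complications lists.
# Function names are illustrative, not part of any existing planning tool.

ENGINEERS = 6                 # from "Team capacity"
WEEKS_TO_LAUNCH = 6           # from "Launch deadline"
SHARED_ENGINEERS = 2          # senior engineers also on production support
SUPPORT_FRACTION = 0.30       # assumed to apply to each shared engineer
MIGRATION_OVERRUN = 0.40      # migration jobs run 40% longer than estimated
IAM_SESSIONS_PER_WEEK = 2     # customer IAM team availability


def effective_engineers(engineers: int, shared: int, support_fraction: float) -> float:
    """Full-time-equivalent engineers left after the support load is removed."""
    return engineers - shared * support_fraction


def engineer_weeks(fte: float, weeks: int) -> float:
    """Total engineering capacity available before the deadline."""
    return fte * weeks


def adjusted_estimate(baseline: float, overrun: float) -> float:
    """Scale an original estimate by the observed overrun."""
    return baseline * (1 + overrun)


if __name__ == "__main__":
    fte = effective_engineers(ENGINEERS, SHARED_ENGINEERS, SUPPORT_FRACTION)
    capacity = engineer_weeks(fte, WEEKS_TO_LAUNCH)
    iam_sessions = IAM_SESSIONS_PER_WEEK * WEEKS_TO_LAUNCH

    print(f"Effective engineers: {fte:.1f}")
    print(f"Engineer-weeks until launch: {capacity:.1f}")
    print(f"IAM test sessions until launch: {iam_sessions}")

    # Hypothetical baseline: a migration job originally estimated at 10 hours.
    print(f"Adjusted migration estimate: {adjusted_estimate(10, MIGRATION_OVERRUN):.1f} hours")
```

Under these assumptions the team has about 5.4 effective engineers, roughly 32.4 engineer-weeks, and 12 customer IAM sessions before the deadline. A recovery plan would still need to subtract time for reviews, incidents, and launch-week freeze from those totals.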
Deliverables
- Diagnose the most likely bottlenecks and explain how you would validate them within the first week.
- Produce a 6-week recovery plan with milestones, owners, and explicit trade-offs on scope, sequencing, or quality gates.
- Recommend what you would communicate to the customer, leadership, and internal stakeholders once the delay is confirmed.
- Define launch readiness criteria, rollback conditions, and post-launch monitoring for the first 30 days.
- Identify the top execution risks and how you would mitigate them if the schedule slips again.