Project Background
Databricks is preparing a strategic customer-facing migration from legacy workspace-level access controls to Unity Catalog for a large enterprise account running regulated workloads across 18 workspaces on AWS. The customer has inconsistent infrastructure-as-code, limited documentation, and several undocumented data access paths through jobs, clusters, and external locations. You are the DevOps Engineer responsible for driving execution on a 10-week migration plan despite unclear ownership, incomplete inventory, and no prior migration playbook for this customer shape.
Key Stakeholders
The account team wants the migration completed before the customer’s annual renewal in 10 weeks. The customer platform team wants minimal disruption to production jobs in Databricks Workflows and SQL Warehouses. Databricks Security and Support want strict controls, auditability, and a rollback path before approving cutover. The Engineering manager wants to avoid pulling more than 2 additional engineers from other roadmap work.
Constraints
- Timeline: 10 weeks to production cutover
- Team: 1 DevOps Engineer, 2 platform engineers, 1 security engineer, 1 TAM, shared support from 1 SRE at 25%
- Budget: $120K for temporary contractor support and testing environments
- Scope: 18 workspaces, 420 jobs, 65 SQL dashboards, 14 external locations, 9 critical production pipelines
- Downtime tolerance: less than 30 minutes for critical workloads
Complications
- The customer cannot provide a reliable inventory of all service principals, cluster policies, and downstream consumers in the first 2 weeks.
- Two critical pipelines use undocumented table permissions that may break when moved to Unity Catalog.
- A VP-level customer sponsor is asking for an accelerated cutover of the highest-visibility workspace by Week 5.
Your Task
- Build a phased execution plan for discovery, dependency mapping, migration, validation, and cutover.
- Define how you would make trade-offs between speed, migration completeness, and operational risk.
- Propose stakeholder communication, escalation points, and decision checkpoints.
- Create success metrics, a rollback plan, and launch-readiness criteria.
- Identify the top risks created by ambiguity and how you would mitigate them.