Project Context
AMD is considering a new engineering initiative to improve the developer onboarding experience for ROCm by reducing setup friction across supported AI training environments. The proposal has executive attention because GPU software adoption is now a major lever for Instinct accelerator growth, and leadership wants clear success criteria before approving full execution.
You are the Engineering Manager responsible for shaping the initiative before work starts. The core team would include 8 engineers across ROCm installer, documentation, validation, and developer tools, plus shared support from release engineering and DevRel. You have 10 weeks to define the initiative, align stakeholders, and present a launch-ready execution plan for Q3 funding review.
Key Stakeholders
- VP, AI Software wants measurable adoption impact within 2 quarters.
- ROCm Release Engineering wants minimal disruption to the existing release train.
- Developer Relations wants visible improvements for external developers quickly.
- QA/Validation wants realistic quality gates and supported configuration limits.
- Sales for Instinct wants proof this will help unblock enterprise pilots.
These stakeholders disagree on whether success should prioritize faster delivery, broader hardware/OS coverage, or lower support burden.
Constraints
- Budget available for discovery and planning: $180,000
- No new headcount approved before Q4
- Must align to the next ROCm quarterly release in 16 weeks if approved
- Only 3 Linux distributions and 2 Instinct GPU families can be included in initial scope
- Baseline data is incomplete: current setup success is measured only for ~55% of installs
Complications
- A competing internal proposal requests the same installer engineers for a container deployment project.
- Enterprise customers are asking for broader OS support than the team can validate in the first release.
- Existing telemetry is noisy, so early success metrics may be disputed.
Deliverables
- Define pre-launch success criteria for the initiative, including leading and lagging indicators.
- Propose a scoped execution plan that fits the release window and staffing limits.
- Recommend trade-offs on platform coverage, instrumentation, and launch quality bars.
- Identify the main risks to measuring success accurately and how you would mitigate them.
- Outline how you would secure stakeholder alignment before engineering work begins.