Project Background
Instagram's Android app has seen a rise in memory-related crashes and excessive RAM growth after users spend time in Reels, Stories, and the in-app camera flow. Android Vitals shows low-memory terminations up 18% week-over-week on mid-tier devices, and leadership wants a fix in the next app release because this is affecting session quality in key growth markets.
You are the mobile engineering lead working with 7 engineers (4 Android, 1 iOS counterpart for parity decisions, 1 QA, 1 data engineer), one product manager, and one release manager. The team has 8 weeks before the next major Instagram Android release candidate is cut.
Key Stakeholders
- Instagram Android Engineering Director wants crash reduction this quarter without destabilizing the release.
- Product Manager for Reels does not want feature work paused during a creator monetization launch.
- Release Manager wants a clear go/no-go decision by week 6.
- Meta Performance Infra team can help with profiling and dashboards, but only has bandwidth for one major investigation this half.
Constraints
- Timeline: 8 weeks to ship in the next release train
- Budget: No new headcount; up to $60K for device lab expansion and profiling support
- Device coverage: Must validate on 12 Android device models, including 5 low-memory devices
- Engineering capacity: 2 Android engineers are already committed 50% to the Reels monetization launch
- Dependency: Perfetto/Android Studio profiling support from Meta Performance Infra available starting week 2
Complications
- The root cause is unclear: possible image cache leaks, video player retention, or JNI/native memory growth in camera effects.
- A proposed quick fix reduces memory usage but may increase cold start time by 7%.
- Internal dashboards disagree: one shows heap growth stabilizing, while another shows increased background kills.
Your Task
- Build an 8-week execution plan to diagnose, prioritize, fix, and safely launch memory improvements.
- Define how you would triage likely causes and make trade-offs if all issues cannot be fixed before release.
- Propose launch criteria, monitoring, and rollback thresholds for the release.
- Identify the top risks, owners, and mitigation steps.
- Explain how you would align stakeholders with competing priorities around stability vs. feature delivery.