You are an engineering manager for a B2B design and collaboration platform. In the last 14 days, customer support tickets mentioning failed file saves rose from 0.8% to 2.6% of weekly active accounts, and three enterprise customers escalated the issue after a recent desktop release. Over the same period, overall weekly active accounts were flat, and 7-day retention dipped from 91% to 88% for accounts using the affected workflow. Leadership wants to know whether this is an isolated set of customer-specific incidents or a broader product problem that requires an urgent rollback.
How would you use product and support metrics to determine whether the issue is a one-off or a systemic problem, and what criteria would you use to decide between handling it as a localized customer escalation and treating it as a platform-wide reliability issue?
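
To put the size of these shifts on a common footing, the short sketch below computes the percentage-point and relative changes for the two metrics quoted in the scenario. The helper name `change` is hypothetical and introduced only for illustration. The scenario gives percentages but no account counts, so this covers the arithmetic only and says nothing about statistical significance.

```python
def change(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point change, ratio of after to before).

    Hypothetical helper for illustration; inputs are percentages.
    """
    return after - before, after / before


# Figures taken directly from the scenario text.
ticket_pp, ticket_ratio = change(0.8, 2.6)        # failed-save ticket rate
retention_pp, retention_ratio = change(91.0, 88.0)  # 7-day retention, affected workflow

print(f"Ticket rate: {ticket_pp:+.1f} pp ({ticket_ratio:.2f}x the earlier rate)")
print(f"7-day retention: {retention_pp:+.1f} pp")
```

Read this way, the ticket rate is a little over three times its earlier level, while retention for the affected workflow is down three percentage points. Whether that pattern is concentrated in a few accounts or spread across the release would still need segmented data that the scenario does not provide.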