Figma has just completed a 4-month engineering project to migrate its multiplayer collaboration service to a new real-time infrastructure. The migration cost 18 engineer-months and was justified by goals to improve reliability, reduce latency, lower cloud costs, and support future growth. Two weeks after rollout, leadership asks whether the project was actually successful.
Before the migration, the collaboration service supported 2.4M weekly active editors, p95 sync latency was 420 ms, crash-free collaboration sessions were 97.8%, monthly infra cost was $1.9M, and incidents averaged 6 per month. In the first full month after launch, weekly active editors were 2.45M, p95 sync latency improved to 260 ms, crash-free sessions rose to 98.9%, monthly infra cost fell to $1.55M, but the rate of document-save conflicts increased from 0.30% to 0.55% of editing sessions. Product leadership also notes that 7-day retention for new collaborative teams stayed flat at 41%.
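To compare these figures on a common footing, the before/after pairs can be converted to relative changes. The snippet below is a minimal sketch that uses only the numbers quoted above. The metric names are illustrative labels, not column names from any table, and the crash rate is derived as 100% minus the crash-free rate so that a lower number consistently means better.

```python
# Before/after figures quoted in the scenario text.
# Keys are illustrative labels chosen for this sketch (assumption).
metrics = {
    "weekly_active_editors_m": (2.40, 2.45),
    "p95_sync_latency_ms": (420, 260),
    "crash_rate_pct": (2.2, 1.1),  # 100 - 97.8 and 100 - 98.9
    "monthly_infra_cost_usd_m": (1.90, 1.55),
    "save_conflict_rate_pct": (0.30, 0.55),
    "day7_team_retention_pct": (41, 41),
}


def relative_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


changes = {
    name: round(relative_change(before, after), 1)
    for name, (before, after) in metrics.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Under this framing, latency, crash rate, and cost all move in the intended direction, while the save-conflict rate rises by roughly 83% in relative terms (0.25 percentage points in absolute terms) and retention does not move.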
collab_sessions: session_id, doc_id, user_id, start_ts, end_ts, sync_latency_ms_p50/p95, save_conflict_flag, crash_flag
service_incidents: incident_id, severity, start_ts, end_ts, root_cause, affected_region
infra_cost_daily: date, service_name, compute_cost, storage_cost, network_cost
team_retention: team_id, created_date, day_7_retained, day_30_retained, seats
feature_adoption: team_id, collaborative_edits, comments, multiplayer_sessions
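As one possible way to reproduce the session-level rates from collab_sessions, the sketch below splits sessions at a cutover timestamp and computes the save-conflict rate and crash-free rate on each side. The cutover date and the sample rows are invented purely for illustration; only the column names start_ts, save_conflict_flag, and crash_flag come from the schema above.

```python
from datetime import datetime

# Hypothetical rollout date; the real cutover is not given in the source.
CUTOVER = datetime(2024, 1, 15)

# Made-up rows shaped like collab_sessions (illustrative data only).
sessions = [
    {"session_id": 1, "start_ts": datetime(2024, 1, 10), "save_conflict_flag": False, "crash_flag": False},
    {"session_id": 2, "start_ts": datetime(2024, 1, 11), "save_conflict_flag": True, "crash_flag": False},
    {"session_id": 3, "start_ts": datetime(2024, 1, 12), "save_conflict_flag": False, "crash_flag": True},
    {"session_id": 4, "start_ts": datetime(2024, 1, 13), "save_conflict_flag": False, "crash_flag": False},
    {"session_id": 5, "start_ts": datetime(2024, 1, 16), "save_conflict_flag": True, "crash_flag": False},
    {"session_id": 6, "start_ts": datetime(2024, 1, 17), "save_conflict_flag": False, "crash_flag": False},
    {"session_id": 7, "start_ts": datetime(2024, 1, 18), "save_conflict_flag": True, "crash_flag": False},
    {"session_id": 8, "start_ts": datetime(2024, 1, 19), "save_conflict_flag": False, "crash_flag": False},
]


def session_rates(rows):
    """Return the save-conflict and crash-free rates for a list of sessions."""
    n = len(rows)
    if n == 0:
        return None  # no sessions in this window
    conflicts = sum(r["save_conflict_flag"] for r in rows)
    crashes = sum(r["crash_flag"] for r in rows)
    return {
        "sessions": n,
        "save_conflict_rate": conflicts / n,
        "crash_free_rate": 1 - crashes / n,
    }


before = session_rates([r for r in sessions if r["start_ts"] < CUTOVER])
after = session_rates([r for r in sessions if r["start_ts"] >= CUTOVER])

print("before:", before)
print("after: ", after)
```

A simple before/after split like this does not control for seasonality or changes in user mix, so it is a starting point for the comparison rather than evidence of causation.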