Product Context
RideWave is a consumer mobile app that lets users share live trips with friends, estimate arrival times, and receive safety alerts when a tracked device deviates from its expected route. The ML system must turn noisy mobile GPS pings into accurate real-time location state and short-horizon movement predictions.
Scale
| Signal | Value |
|---|---|
| DAU | 35M |
| Concurrent tracked devices at peak | 4.5M |
| Peak ingest QPS (location pings) | 220K events/sec |
| Average ping frequency | every 3-8 seconds while active |
| Geofence / route candidates per request | 100-5,000 depending on density |
| p99 latency budget for live update | 150 ms end-to-end |
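A quick back-of-envelope check on the table can help size the pipeline. The sketch below uses only the figures stated above. The helper names are illustrative, and the candidate-scoring figure assumes that every ping triggers a full candidate evaluation, which the prompt does not state.

```python
# Figures copied from the Scale table.
PEAK_CONCURRENT_DEVICES = 4_500_000
PEAK_INGEST_EPS = 220_000            # location pings per second at peak
PING_INTERVAL_S = (3, 8)             # stated ping interval while active
CANDIDATES_PER_REQUEST = (100, 5_000)


def implied_mean_interval_s(devices: int, events_per_sec: int) -> float:
    """Mean seconds between pings if all tracked devices shared the ingest evenly."""
    return devices / events_per_sec


def devices_sustained(events_per_sec: int, interval_s: float) -> float:
    """Number of devices that can ping at `interval_s` within the ingest rate."""
    return events_per_sec * interval_s


# Averaged over every concurrently tracked device.
mean_interval = implied_mean_interval_s(PEAK_CONCURRENT_DEVICES, PEAK_INGEST_EPS)

# Devices that could be pinging at the stated 3-8 s cadence at peak ingest.
active_low = devices_sustained(PEAK_INGEST_EPS, PING_INTERVAL_S[0])
active_high = devices_sustained(PEAK_INGEST_EPS, PING_INTERVAL_S[1])

# Assumption: one candidate-scoring pass per ping (an upper bound on work).
scores_low = PEAK_INGEST_EPS * CANDIDATES_PER_REQUEST[0]
scores_high = PEAK_INGEST_EPS * CANDIDATES_PER_REQUEST[1]

print(f"implied mean interval per tracked device: {mean_interval:.1f} s")
print(f"devices pinging at 3-8 s cadence: {active_low:,.0f} to {active_high:,.0f}")
print(f"candidate scores per second: {scores_low:,} to {scores_high:,}")
```

Averaged over all 4.5M tracked devices, 220K events/sec works out to roughly one ping every 20 seconds per device, which is slower than the stated 3-8 second cadence. One possible reading is that only about 15% to 39% of tracked devices are actively pinging at any moment. Another is that some pings are throttled or batched on the device. Either assumption is worth confirming before sizing the serving tier.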
Task
Design an end-to-end ML system for real-time location tracking in the mobile app. Your design should address:
- How you would define the prediction tasks and success metrics for live tracking, map matching, and short-term ETA / route deviation detection
- The full architecture from mobile event ingestion to online inference, including any multi-stage pipeline such as candidate road-segment retrieval, ranking, and re-ranking / smoothing
- What features and labels you would use, and how you would build batch + streaming pipelines without introducing training-serving skew
- How models are trained, deployed, refreshed, and evaluated offline and online
- How you would monitor the system and handle major failure modes at scale
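As one illustration of how the route deviation task could be framed, the sketch below is a rule-based baseline rather than a learned model. It measures the distance from a ping to the expected route and raises a flag only after several consecutive off-route pings, which is one simple way to trade alert delay against false positives. The names `distance_to_route_m` and `DeviationDetector`, the 75 m threshold, and the streak length of 3 are all hypothetical choices, not values from the prompt.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def _to_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to local metres around a reference point
    using an equirectangular approximation (adequate over short distances)."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def distance_to_route_m(point, route):
    """Shortest distance in metres from `point` (lat, lon) to a polyline
    `route`, given as a list of at least two (lat, lon) vertices."""
    ref_lat, ref_lon = point
    verts = [_to_xy(lat, lon, ref_lat, ref_lon) for lat, lon in route]
    best = math.inf
    for (ax, ay), (bx, by) in zip(verts, verts[1:]):
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0.0:
            t = 0.0
        else:
            # The point sits at the origin of the local frame, so project (0, 0).
            t = max(0.0, min(1.0, (-ax * dx - ay * dy) / seg_len_sq))
        cx, cy = ax + t * dx, ay + t * dy
        best = min(best, math.hypot(cx, cy))
    return best


class DeviationDetector:
    """Flags a deviation after `min_consecutive` off-route pings in a row."""

    def __init__(self, threshold_m=75.0, min_consecutive=3):
        self.threshold_m = threshold_m
        self.min_consecutive = min_consecutive
        self._streak = 0

    def update(self, point, route):
        if distance_to_route_m(point, route) > self.threshold_m:
            self._streak += 1
        else:
            self._streak = 0  # a single on-route ping resets the streak
        return self._streak >= self.min_consecutive


route = [(0.0, 0.0), (0.0, 0.01)]        # short segment along the equator
detector = DeviationDetector()
off_route = (0.001, 0.005)               # roughly 111 m north of the route
flags = [detector.update(off_route, route) for _ in range(3)]
```

With pings arriving every 3-8 seconds, a streak of 3 delays the alert by roughly 9 to 24 seconds, so the streak length and threshold are the first knobs to evaluate against labeled deviations. A learned model could replace the fixed threshold while keeping the same interface.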
Constraints
- Mobile GPS is noisy, sparse indoors, and can be delayed or arrive out of order
- Battery cost matters: the app cannot increase GPS sampling aggressively for all users
- The system must support regional data residency requirements and minimize retention of precise raw coordinates
- Freshness matters: route deviation alerts should trigger within seconds, but false positives are costly
- Cost target: online inference and feature serving should remain efficient enough to support sustained 220K events/sec at peak
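The delayed and out-of-order arrival constraint is often handled with a small per-device reorder buffer driven by an event-time watermark. The following is a minimal sketch under that assumption. `Ping`, `ReorderBuffer`, and the 2-second `allowed_lateness_s` default are hypothetical names and values, and a production system would more likely rely on the watermarking built into its stream processor.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Ping:
    event_ts: float                      # device-side timestamp, seconds
    lat: float = field(compare=False)
    lon: float = field(compare=False)


class ReorderBuffer:
    """Holds pings for one device and releases them in event-time order
    once they are older than the watermark (max seen ts minus lateness)."""

    def __init__(self, allowed_lateness_s: float = 2.0):
        self.allowed_lateness_s = allowed_lateness_s
        self._heap = []
        self._max_seen_ts = float("-inf")
        self._last_emitted_ts = float("-inf")
        self.dropped = 0

    def push(self, ping: Ping) -> list:
        """Add a ping and return the pings that are now safe to emit."""
        if ping.event_ts <= self._last_emitted_ts:
            # Too late: emitting it would break event-time ordering downstream.
            self.dropped += 1
            return []
        heapq.heappush(self._heap, ping)
        self._max_seen_ts = max(self._max_seen_ts, ping.event_ts)
        watermark = self._max_seen_ts - self.allowed_lateness_s
        ready = []
        while self._heap and self._heap[0].event_ts <= watermark:
            ready.append(heapq.heappop(self._heap))
        if ready:
            self._last_emitted_ts = ready[-1].event_ts
        return ready


buf = ReorderBuffer(allowed_lateness_s=2.0)
emitted = []
for ts in [10.0, 12.0, 11.0, 15.0, 9.0]:   # 11.0 arrives late, 9.0 arrives too late
    emitted.extend(p.event_ts for p in buf.push(Ping(ts, 0.0, 0.0)))
```

The allowed lateness adds directly to alert delay, so it competes with the requirement that deviation alerts trigger within seconds. Tracking the `dropped` count per region is one way to check whether the chosen value discards too many late pings.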