You’re on the autonomy perception team at AuroraRide, a robotaxi operator running 3,500 vehicles across Phoenix and Austin. Each vehicle has a forward-facing 8MP RGB camera (30 FPS) and must obey traffic signals with extremely high reliability. A recent safety review found that the stack occasionally confuses red vs yellow in backlit scenes and misses small, distant lights at complex intersections. A single failure can cause a safety incident and immediate regulatory scrutiny.
Your task is to design an ML system that (1) detects traffic lights in camera images and (2) classifies their state (e.g., red/yellow/green/off/unknown). The system will run on-vehicle and must support real-time inference.
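Since the camera runs at 30 FPS, the per-frame latency budget is roughly 33 ms. A minimal sketch of how an on-vehicle latency gate could be checked (the function name and the p99 criterion are illustrative assumptions, not part of the spec):

```python
FPS = 30
BUDGET_MS = 1000.0 / FPS  # ~33.3 ms per frame at 30 FPS

def meets_realtime(latencies_ms, p=0.99):
    # Gate on tail latency rather than the mean: a p99 within
    # budget keeps the pipeline from dropping frames under load.
    ordered = sorted(latencies_ms)
    idx = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[idx] <= BUDGET_MS
```

In practice the latency samples would come from profiling the full detection + classification pipeline on the target on-vehicle hardware, not from a host machine.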
You have access to a labeled dataset collected from the fleet.
| Component | Details |
|---|---|
| Images | 1.8M frames (day/night/rain), 1920×1080 RGB, 30 FPS sequences but labeled per-frame |
| Labels | Bounding boxes for each visible traffic light + state label per box |
| Classes | red, yellow, green, off (unlit), unknown (occluded/ambiguous) |
| Object size | 40% of boxes are < 20×20 px (distant lights); long-tail of tiny objects |
| Class balance | green 52%, red 33%, yellow 7%, off 5%, unknown 3% |
| Domain shift | 20% frames at night; 8% rain; 5% lens flare; new intersections added weekly |
| Missing/Noisy labels | ~2% boxes have incorrect state due to annotation ambiguity; some frames missing boxes for far lights |
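Given the skew in the class-balance row above (yellow 7%, off 5%, unknown 3%), one common mitigation is weighting the state-classification loss by inverse class frequency. A sketch using the priors from the table (the weighting scheme itself is an assumption, not something the spec prescribes):

```python
# Class priors taken from the dataset table (fraction of all boxes).
PRIORS = {"green": 0.52, "red": 0.33, "yellow": 0.07, "off": 0.05, "unknown": 0.03}

def inverse_freq_weights(priors):
    # Weight each class by 1/prior, then normalize so the
    # weights average to 1 (keeps the overall loss scale stable).
    raw = {c: 1.0 / p for c, p in priors.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

weights = inverse_freq_weights(PRIORS)
```

The resulting dict could be passed (as a tensor, in class order) to e.g. a cross-entropy loss; focal loss or class-balanced sampling are alternative levers for the same imbalance.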
You also have a held-out “regression suite”: 12,000 frames from the last 2 weeks at newly mapped intersections (unseen during training).
Targets: red vs green (binary), and ≥ 92% macro-F1 across all 5 states. You may assume you can use PyTorch and common detection toolchains, but you must be explicit about the exact metrics, thresholds, and release gates you would use before shipping to the fleet.
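The ≥ 92% macro-F1 gate across the 5 states can be expressed as an executable release check. A minimal sketch computing macro-F1 from scratch (function names are illustrative; in practice scikit-learn's `f1_score(average="macro")` would do the same):

```python
CLASSES = ["red", "yellow", "green", "off", "unknown"]
GATE = 0.92  # release gate from the spec: >= 92% macro-F1 over all 5 states

def macro_f1(y_true, y_pred, classes):
    # Macro-F1: unweighted mean of per-class F1, so rare classes
    # (yellow, off, unknown) count as much as green and red.
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def passes_gate(y_true, y_pred):
    return macro_f1(y_true, y_pred, CLASSES) >= GATE
```

This check would run over the 12,000-frame regression suite (matched detections only; how unmatched boxes are scored is a design decision the answer should make explicit).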