You’re working on the ML platform team at MetroPass, an electronic tolling operator used across 12 US metro areas. The system processes ~45 million vehicle passages/day from fixed gantry cameras. Vehicle type classification (motorcycle, passenger car, pickup, van, bus, box truck, semi) drives pricing and compliance: misclassifying a semi as a car causes direct revenue loss, while misclassifying a car as a truck creates customer disputes and regulatory scrutiny.
The cameras capture vehicles at varying speeds and angles. Images include night scenes, rain, motion blur, partial occlusion by other vehicles, and occasional lens dirt. You need to implement a model that classifies each passage into exactly one of the 7 vehicle classes.
You have a labeled dataset built from manual review and cross-checks with weigh-in-motion sensors:

| Aspect | Details |
|---|---|
| Size | 3.2M images (JPEG), collected over 9 months from 1,800 gantries |
| Resolution | Variable; typically 1280×720; vehicle occupies 15–70% of frame |
| Labels | 7 classes: motorcycle, car, pickup, van, bus, box_truck, semi |
| Class balance | Long-tailed: car 62%, pickup 14%, van 9%, box_truck 6%, semi 5%, bus 3%, motorcycle 1% |
| Splits | Must be gantry-disjoint (no camera overlap between train/val/test) to avoid leakage |
| Metadata | timestamp, gantry_id, lane_id, weather (noisy), speed_estimate |
| Missing/dirty data | ~2% of images corrupted; ~6% of labels suspected noisy (audits show confusion between van/box_truck and between pickup/van) |
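
The gantry-disjoint split and the long-tailed class balance in the table can be illustrated with a minimal sketch. It is one possible approach, not a prescribed one. The split assigns each gantry to a partition by hashing its `gantry_id`, so every image from a given camera lands in the same partition. The class weights use plain inverse-frequency weighting, which is an assumed choice for handling the imbalance. The function names, the 10%/10% validation and test fractions, and the `G0000`-style gantry IDs are hypothetical.

```python
import hashlib

def assign_split(gantry_id, val_frac=0.1, test_frac=0.1):
    """Map a gantry to 'train', 'val', or 'test' using a stable hash.

    Every image from the same gantry gets the same answer, so no camera
    appears in more than one partition. Fractions are assumed values.
    """
    digest = hashlib.sha256(gantry_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a uniform value in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"

def inverse_frequency_weights(class_freq):
    """Per-class weights proportional to 1/frequency, scaled to mean 1."""
    raw = {name: 1.0 / freq for name, freq in class_freq.items()}
    mean_weight = sum(raw.values()) / len(raw)
    return {name: w / mean_weight for name, w in raw.items()}

# Class frequencies taken from the "Class balance" row of the table.
CLASS_FREQ = {
    "car": 0.62, "pickup": 0.14, "van": 0.09, "box_truck": 0.06,
    "semi": 0.05, "bus": 0.03, "motorcycle": 0.01,
}

# Hypothetical gantry IDs; the real ID format is not specified.
gantry_ids = [f"G{i:04d}" for i in range(1800)]
split_of = {g: assign_split(g) for g in gantry_ids}
split_counts = {
    name: sum(1 for s in split_of.values() if s == name)
    for name in ("train", "val", "test")
}
class_weights = inverse_frequency_weights(CLASS_FREQ)
```

To use it, look up `split_of[record["gantry_id"]]` for each image record and pass `class_weights` to a weighted loss or sampler. Hashing keeps the assignment reproducible as new images arrive, but it does not balance partitions by metro area or class, so the resulting per-split class counts are worth checking, especially for motorcycle at 1%.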