Vehicle Type Classification for Tolling Cameras

Business Context

You’re working on the ML platform team at MetroPass, an electronic tolling operator used across 12 US metro areas. The system processes ~45 million vehicle passages/day from fixed gantry cameras. Vehicle type classification (motorcycle, passenger car, pickup, van, bus, box truck, semi) drives pricing and compliance: misclassifying a semi as a car causes direct revenue loss, while misclassifying a car as a truck creates customer disputes and regulatory scrutiny.

The cameras capture vehicles at varying speeds and angles. Images include night scenes, rain, motion blur, partial occlusion by other vehicles, and occasional lens dirt. You need to implement a model that classifies each passage into one of 7 vehicle classes.

Dataset

You have a labeled dataset built from manual review and cross-checks with weigh-in-motion sensors.

Aspect	Details
Size	3.2M images (JPEG), collected over 9 months from 1,800 gantries
Resolution	Variable; typically 1280×720; vehicle occupies 15–70% of frame
Labels	7 classes: motorcycle, car, pickup, van, bus, box_truck, semi
Class balance	Long-tailed: car 62%, pickup 14%, van 9%, box_truck 6%, semi 5%, bus 3%, motorcycle 1%
Splits	Must be gantry-disjoint (no camera overlap between train/val/test) to avoid leakage
Metadata	timestamp, gantry_id, lane_id, weather (noisy), speed_estimate
Missing/dirty data	~2% corrupted images; ~6% labels suspected noisy (audits show confusion between van/box_truck and pickup/van)
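
The long-tailed class balance in the table is often handled with per-class loss weights. The following is a minimal sketch of one common option (inverse-frequency weighting with a softening exponent), not a prescribed solution. The function name and the `power` parameter are illustrative assumptions; only the class shares come from the table.

```python
# Class shares copied from the "Class balance" row of the table above.
CLASS_SHARE = {
    "car": 0.62, "pickup": 0.14, "van": 0.09, "box_truck": 0.06,
    "semi": 0.05, "bus": 0.03, "motorcycle": 0.01,
}


def inverse_frequency_weights(shares, power=0.5):
    """Return per-class weights proportional to share ** (-power), scaled to mean 1.

    power=1.0 is plain inverse frequency; power=0.5 (an assumed default)
    softens the weights so the rarest class does not dominate the loss.
    """
    raw = {name: share ** (-power) for name, share in shares.items()}
    mean = sum(raw.values()) / len(raw)
    return {name: value / mean for name, value in raw.items()}


weights = inverse_frequency_weights(CLASS_SHARE)
# Print classes from least to most up-weighted.
for name, value in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{name:<11s} {value:.2f}")
```

Weights like these could be passed to a weighted cross-entropy loss. Whether reweighting, resampling, or a different loss works best here would need to be checked against the validation split.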

Success Criteria

Macro F1 ≥ 0.86 on the held-out test set (gantry-disjoint).
Recall(semi) ≥ 0.93 and Recall(box_truck) ≥ 0.88 (revenue-critical classes).
p95 inference latency ≤ 50 ms per image on an NVIDIA T4 (batch size 1), including preprocessing.
Model must be stable across conditions: the gap in macro F1 between day and night images must be ≤ 0.04.
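
The accuracy criteria above can be turned into an automated check. The sketch below uses plain Python and assumes parallel lists of true labels, predicted labels, and a hypothetical `is_night` flag per image (for example derived from the timestamp metadata; that derivation is not shown and is an assumption). The latency criterion is not covered here.

```python
CLASSES = ["motorcycle", "car", "pickup", "van", "bus", "box_truck", "semi"]


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Precision, recall and F1 per class from parallel label lists."""
    out = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        out[c] = {"precision": precision, "recall": recall, "f1": f1}
    return out


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; a class absent from the data scores 0."""
    metrics = per_class_metrics(y_true, y_pred)
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


def check_success_criteria(y_true, y_pred, is_night):
    """Return a pass/fail flag for each accuracy criterion in this section."""
    overall = per_class_metrics(y_true, y_pred)
    rows = list(zip(y_true, y_pred, is_night))
    day = [(t, p) for t, p, night in rows if not night]
    night = [(t, p) for t, p, night in rows if night]
    day_f1 = macro_f1([t for t, _ in day], [p for _, p in day])
    night_f1 = macro_f1([t for t, _ in night], [p for _, p in night])
    return {
        "macro_f1": macro_f1(y_true, y_pred) >= 0.86,
        "recall_semi": overall["semi"]["recall"] >= 0.93,
        "recall_box_truck": overall["box_truck"]["recall"] >= 0.88,
        "day_night_gap": abs(day_f1 - night_f1) <= 0.04,
    }
```

Because macro F1 gives a class with no examples a score of 0, the day and night subsets should each contain every class before the gap is interpreted.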

Constraints

Deployment: real-time scoring in an edge service at each metro region; intermittent connectivity.
Compute: single T4 per region for inference; training can use up to 4×A100 for 24 hours.
Explainability: not full interpretability, but you must provide error analysis and a plan to mitigate systematic failures (e.g., night blur, occlusions).
Data leakage risk: same vehicle may appear multiple times; gantry-disjoint split is required, and you should discuss whether additional grouping is needed.
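
One way to satisfy the gantry-disjoint requirement is to assign each gantry to a split by hashing its ID, so every image from a camera lands in the same split and the assignment is reproducible. This is a sketch under assumptions: the split fractions, the `salt` string, and the record layout are illustrative, and it does not address grouping by individual vehicle, which would need an identifier that the listed metadata does not provide.

```python
import hashlib


def split_for_gantry(gantry_id, val_frac=0.1, test_frac=0.1, salt="split-v1"):
    """Deterministically map a gantry ID to 'train', 'val' or 'test'.

    The hash depends only on the gantry ID and the salt, so all images
    from one gantry always receive the same split.
    """
    digest = hashlib.sha256(f"{salt}:{gantry_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    if bucket < test_frac:
        return "test"
    if bucket < test_frac + val_frac:
        return "val"
    return "train"


def assign_splits(records):
    """Group records (dicts with a 'gantry_id' key) into the three splits."""
    splits = {"train": [], "val": [], "test": []}
    for record in records:
        splits[split_for_gantry(record["gantry_id"])].append(record)
    return splits


# Hypothetical records: two images per gantry for 1,800 synthetic gantry IDs.
records = [
    {"image": f"g{g:04d}_{i}.jpg", "gantry_id": f"G{g:04d}"}
    for g in range(1800)
    for i in range(2)
]
splits = assign_splits(records)
```

Changing `salt` produces a different but equally disjoint partition, which can be useful for repeating the evaluation over several gantry splits.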

Deliverables (what you must produce in the interview)

Propose an end-to-end approach (model architecture, preprocessing, augmentation, training loop).
Describe how you will handle class imbalance and label noise.
Define a robust evaluation protocol (splits, metrics, thresholding if applicable).
Provide a production-ready implementation sketch (PyTorch preferred) including training + evaluation.
Outline deployment considerations: model size, latency optimizations (e.g., mixed precision, TorchScript/ONNX), and monitoring (drift, per-gantry performance).
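
For the latency part of the deployment discussion, a small harness can estimate p95 per-image latency at batch size 1. The sketch below is framework-agnostic and assumes a hypothetical `predict_fn` that runs preprocessing plus inference for one image; the 50 ms default budget is the figure stated in the success criteria. When timing GPU work, `predict_fn` would also need to wait for the device to finish (for example by synchronizing) before returning, otherwise the measured times are too low.

```python
import time


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def measure_latency_ms(predict_fn, inputs, warmup=10):
    """Time predict_fn on each input, one at a time, after a short warm-up."""
    for item in inputs[:warmup]:
        predict_fn(item)  # warm-up calls are not timed
    timings = []
    for item in inputs:
        start = time.perf_counter()
        predict_fn(item)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def meets_latency_budget(timings_ms, budget_ms=50.0):
    """True if the p95 of the measured timings is within the budget."""
    return percentile_nearest_rank(timings_ms, 95) <= budget_ms
```

A typical use is `meets_latency_budget(measure_latency_ms(predict_fn, sample_images))`, run on the target T4 with a sample that includes night and blurred frames.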

