Business Context
You’re interviewing for an ML engineering role on the on-vehicle perception team at a major OEM building an L2+/L3 ADAS stack. The current camera-based pedestrian detection subsystem is accurate in offline evaluation, but it misses real-time deadlines on the production ECU, causing frame drops and delayed braking decisions. Safety and regulatory scrutiny are high: a regression in pedestrian recall can create unacceptable risk.
Your task is to propose and implement a model-optimization plan that meets the real-time targets on-vehicle while maintaining detection quality.
Dataset
You are given a curated dataset derived from fleet logs:
- Source: Forward-facing 1920×1080 RGB camera (30 FPS), diverse weather/lighting, urban/suburban.
- Labeling: Pedestrian bounding boxes + “near-miss” hard negatives (e.g., poles, cyclists, strollers).
- Training objective for this interview: simplify to binary classification on cropped proposals (pedestrian vs non-pedestrian), representing the second stage of a detector.
| Feature Group | Count / Shape | Examples |
|---|---|---|
| Image crop | 3×128×64 | ROI crops from a proposal generator (fixed aspect ratio) |
| Metadata (optional) | 6 numeric | speed_mps, yaw_rate, exposure_ms, time_of_day_sin/cos |
| Target | 1 | pedestrian (1) vs background (0) |
- Size: 12.5M crops (approx. 4 TB raw), collected from ~80K drives
- Class balance: ~2.5% positive (pedestrian), 97.5% negative
- Missing data: ~8% missing metadata due to sensor dropouts; images always present
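One common way to handle the ~8% metadata dropout noted above is mean imputation plus an explicit missingness mask, so the classifier can condition on "sensor dropped out" instead of silently receiving filled-in values. A minimal NumPy sketch (`impute_with_mask` is an illustrative helper, not part of any provided API; the widened 6→12 column layout is an assumption of this sketch):

```python
import numpy as np

def impute_with_mask(meta, train_means):
    """meta: (N, D) float array with NaN where the sensor dropped out.
    train_means: (D,) per-column means computed on the training split.
    Returns (N, 2*D): mean-imputed features followed by per-column
    missingness flags, so the model can learn from dropout itself."""
    missing = np.isnan(meta)
    filled = np.where(missing, train_means, meta)
    return np.concatenate([filled, missing.astype(np.float32)], axis=1)
```

Computing the means on the training split only avoids leaking test-set statistics into the imputation.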
Success Criteria
You must meet both quality and runtime targets:
- Quality (offline test set)
  - Recall (pedestrian) ≥ 0.92 at FPR ≤ 1% (safety prefers recall)
  - AUC-PR ≥ 0.70 (imbalanced data)
- On-vehicle runtime (ECU constraints)
  - End-to-end model inference p95 ≤ 12 ms per crop on target hardware (ARM CPU + mobile GPU / NPU)
  - Model size ≤ 25 MB (flash + OTA constraints)
  - Peak RAM during inference ≤ 200 MB
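The recall-at-FPR constraint implies picking an operating threshold on held-out scores rather than using a default 0.5 cutoff. One way to sketch this in NumPy: choose the most permissive threshold that still keeps FPR ≤ 1%, then read off pedestrian recall (`pick_threshold` is a hypothetical helper; the `nextafter` step handles score ties conservatively):

```python
import numpy as np

def pick_threshold(scores, labels, max_fpr=0.01):
    """Lowest threshold whose false-positive rate on negatives
    stays <= max_fpr; returns (threshold, recall, fpr)."""
    neg = np.sort(scores[labels == 0])[::-1]   # negatives, descending
    k = int(np.floor(max_fpr * len(neg)))      # negatives allowed to fire
    # Threshold just above the (k+1)-th highest negative score, so at
    # most k negatives score >= t even when values are tied.
    t = np.nextafter(neg[k], np.inf)
    recall = float(np.mean(scores[labels == 1] >= t))
    fpr = float(np.mean(scores[labels == 0] >= t))
    return t, recall, fpr
```

The same routine can be run per condition slice (night, rain) to check that a single global threshold does not hide a recall regression in one regime.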
Constraints & Real-World Details
- The ECU runs multiple real-time tasks; you cannot assume full GPU availability.
- The model must be robust to distribution shift (night/rain/construction zones).
- You must support A/B shadow deployment: run the optimized model alongside the baseline and compare disagreement rates.
- You cannot change the upstream proposal generator in this interview; focus on optimizing the classifier.
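For the shadow-deployment requirement, the disagreement rate is most useful split by direction: a crop where the baseline fires but the optimized model does not is the safety-relevant case for pedestrian recall. A small sketch of that comparison (`disagreement_report` and its field names are illustrative, not a given API):

```python
import numpy as np

def disagreement_report(baseline_pred, candidate_pred):
    """Shadow-mode comparison of binary decisions on the same crops.
    'candidate_misses' (baseline=1, candidate=0) is the direction
    that threatens pedestrian recall and should gate promotion."""
    b = np.asarray(baseline_pred, dtype=bool)
    c = np.asarray(candidate_pred, dtype=bool)
    return {
        "disagreement_rate": float(np.mean(b != c)),
        "candidate_misses": float(np.mean(b & ~c)),
        "candidate_extra_fires": float(np.mean(~b & c)),
    }
```

Disagreeing crops are also the cheapest ones to queue for human review, since they localize exactly where the optimized model diverges.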
Deliverables
- Optimization strategy: Describe a concrete plan to achieve the latency/memory targets (e.g., architecture changes, quantization, pruning, distillation, batching strategy, operator fusion).
- Training approach: How you handle class imbalance and hard negatives, and how you avoid overfitting to easy backgrounds.
- Evaluation plan: Metrics, threshold selection for the recall/FPR constraint, and how you would validate on-vehicle performance (profiling + correctness checks).
- Implementation: Provide code that trains a baseline model, then applies at least one optimization technique (e.g., quantization-aware training or post-training quantization) and evaluates the quality impact.
- Production rollout: How you would monitor regressions (latency + safety metrics) and roll back safely.
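As one concrete instance of the quantization deliverable: int8 post-training quantization alone addresses the 25 MB flash budget, since int8 weights are 4x smaller than float32. The sketch below shows only the symmetric per-tensor arithmetic in NumPy; in practice you would use the deployment framework's own PTQ or QAT tooling and calibrate activation ranges on representative drives:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: one scale maps the
    largest |weight| to 127; storage drops 4x versus float32."""
    scale = float(np.max(np.abs(w))) / 127.0
    scale = scale if scale > 0 else 1e-12  # guard all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct float weights; the gap to the originals bounds the
    per-weight quantization error (at most scale / 2 from rounding)."""
    return q.astype(np.float32) * scale
```

To evaluate the quality impact, run the dequantized weights through the same offline test set and re-check recall at the fixed FPR ≤ 1% operating point before any on-vehicle rollout.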