Business Context
You’re interviewing for an ML engineering role on the on-vehicle perception team at a major OEM building an L2+/L3 ADAS stack. The current camera-based pedestrian detection subsystem is accurate in offline evaluation, but it misses real-time deadlines on the production ECU, causing frame drops and delayed braking decisions. Safety and regulatory scrutiny are high: a regression in pedestrian recall can create unacceptable risk.
Your task is to propose and implement a model-optimization plan that meets the real-time targets on-vehicle while maintaining detection quality.
Dataset
You are given a curated dataset derived from fleet logs:
- Source: Forward-facing 1920×1080 RGB camera (30 FPS), diverse weather/lighting, urban/suburban.
- Labeling: Pedestrian bounding boxes + “near-miss” hard negatives (e.g., poles, cyclists, strollers).
- Training objective for this interview: simplify to binary classification on cropped proposals (pedestrian vs non-pedestrian), representing the second stage of a detector.
| Feature Group | Count / Shape | Examples |
|---|---|---|
| Image crop | 3×128×64 | ROI crops from a proposal generator (fixed aspect ratio) |
| Metadata (optional) | 6 numeric | speed_mps, yaw_rate, exposure_ms, time_of_day_sin/cos |
| Target | 1 | pedestrian (1) vs background (0) |
- Size: 12.5M crops (approx. 4 TB raw), collected from ~80K drives
- Class balance: ~2.5% positive (pedestrian), 97.5% negative
- Missing data: ~8% missing metadata due to sensor dropouts; images always present
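One common way to handle the ~8% metadata dropout noted above is mean imputation plus an explicit missingness mask, so the classifier can condition on "sensor dropped out" instead of silently receiving filled-in values. A minimal NumPy sketch (`impute_with_mask` is an illustrative helper, not part of any provided API; the widened 6→12 column layout is an assumption of this sketch):

```python
import numpy as np

def impute_with_mask(meta, train_means):
    """meta: (N, D) float array with NaN where the sensor dropped out.
    train_means: (D,) per-column means computed on the training split.
    Returns (N, 2*D): mean-imputed features followed by per-column
    missingness flags, so the model can learn from dropout itself."""
    missing = np.isnan(meta)
    filled = np.where(missing, train_means, meta)
    return np.concatenate([filled, missing.astype(np.float32)], axis=1)
```

Computing the means on the training split only avoids leaking test-set statistics into the imputation.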
Success Criteria
You must meet both quality and runtime targets:
- Quality (offline test set)
  - Recall (pedestrian) ≥ 0.92 at FPR ≤ 1% (safety prefers recall)
  - AUC-PR ≥ 0.70 (imbalanced data)
- On-vehicle runtime (ECU constraints)
  - End-to-end model inference p95 ≤ 12 ms per crop on target hardware (ARM CPU + mobile GPU / NPU)
  - Model size ≤ 25 MB (flash + OTA constraints)
  - Peak RAM during inference ≤ 200 MB
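The recall-at-FPR constraint implies picking an operating threshold on held-out scores rather than using a default 0.5 cutoff. One way to sketch this in NumPy: choose the most permissive threshold that still keeps FPR ≤ 1%, then read off pedestrian recall (`pick_threshold` is a hypothetical helper; the `nextafter` step handles score ties conservatively):

```python
import numpy as np

def pick_threshold(scores, labels, max_fpr=0.01):
    """Lowest threshold whose false-positive rate on negatives
    stays <= max_fpr; returns (threshold, recall, fpr)."""
    neg = np.sort(scores[labels == 0])[::-1]   # negatives, descending
    k = int(np.floor(max_fpr * len(neg)))      # negatives allowed to fire
    # Threshold just above the (k+1)-th highest negative score, so at
    # most k negatives score >= t even when values are tied.
    t = np.nextafter(neg[k], np.inf)
    recall = float(np.mean(scores[labels == 1] >= t))
    fpr = float(np.mean(scores[labels == 0] >= t))
    return t, recall, fpr
```

The same routine can be run per condition slice (night, rain) to check that a single global threshold does not hide a recall regression in one regime.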
Constraints & Real-World Details
- The ECU runs multiple real-time tasks; you cannot assume full GPU availability.
- The model must be robust to distribution shift (night/rain/construction zones).
- You must support A/B shadow deployment: run the optimized model alongside the baseline and compare disagreement rates.
- You cannot change the upstream proposal generator in this interview; focus on optimizing the classifier.
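For the shadow-deployment requirement, the disagreement rate is most useful split by direction: a crop where the baseline fires but the optimized model does not is the safety-relevant case for pedestrian recall. A small sketch of that comparison (`disagreement_report` and its field names are illustrative, not a given API):

```python
import numpy as np

def disagreement_report(baseline_pred, candidate_pred):
    """Shadow-mode comparison of binary decisions on the same crops.
    'candidate_misses' (baseline=1, candidate=0) is the direction
    that threatens pedestrian recall and should gate promotion."""
    b = np.asarray(baseline_pred, dtype=bool)
    c = np.asarray(candidate_pred, dtype=bool)
    return {
        "disagreement_rate": float(np.mean(b != c)),
        "candidate_misses": float(np.mean(b & ~c)),
        "candidate_extra_fires": float(np.mean(~b & c)),
    }
```

Disagreeing crops are also the cheapest ones to queue for human review, since they localize exactly where the optimized model diverges.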
Deliverables
- Optimization strategy: Describe a concrete plan to achieve the latency/memory targets (e.g., architecture changes, quantization, pruning, distillation, batching strategy, operator fusion).
- Training approach: How you handle class imbalance and hard negatives, and how you avoid overfitting to easy backgrounds.
- Evaluation plan: Metrics, threshold selection for the recall/FPR constraint, and how you would validate on-vehicle performance (profiling + correctness checks).
- Implementation: Provide code that trains a baseline model, then applies at least one optimization technique (e.g., quantization-aware training or post-training quantization) and evaluates the quality impact.
- Production rollout: How you would monitor regressions (latency + safety metrics) and roll back safely.
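As one concrete instance of the quantization deliverable: int8 post-training quantization alone addresses the 25 MB flash budget, since int8 weights are 4x smaller than float32. The sketch below shows only the symmetric per-tensor arithmetic in NumPy; in practice you would use the deployment framework's own PTQ or QAT tooling and calibrate activation ranges on representative drives:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: one scale maps the
    largest |weight| to 127; storage drops 4x versus float32."""
    scale = float(np.max(np.abs(w))) / 127.0
    scale = scale if scale > 0 else 1e-12  # guard all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct float weights; the gap to the originals bounds the
    per-weight quantization error (at most scale / 2 from rounding)."""
    return q.astype(np.float32) * scale
```

To evaluate the quality impact, run the dequantized weights through the same offline test set and re-check recall at the fixed FPR ≤ 1% operating point before any on-vehicle rollout.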