Optimize Defect Detection from Images

Business Context

VoltLens manufactures consumer electronics and uses a vision model to detect visible assembly defects from product-line images. The current model misses too many defects and is too slow for reliable deployment on the inspection line.

Dataset

You are given a labeled image classification dataset for a binary computer vision task: defective vs non-defective product images.

Feature Group	Count	Examples
RGB images	1 input tensor	224x224x3 product photos from 6 factory cameras
Metadata	4	camera_id, production_line, shift, product_type
Labels	1 target	defect_present (0/1)

Size: 120K images collected over 9 months
Target: Binary classification — defect present (1) vs no visible defect (0)
Class balance: 8% defective, 92% non-defective
Missing data: ~3% missing metadata fields; image quality varies due to blur, glare, and lighting shifts
Data issues: Near-duplicate frames from the same product and temporal drift after camera recalibration
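
The two data issues above suggest splitting by group and by time rather than by random image. The following is a minimal sketch of one such split, not a prescribed method. It assumes each record carries a `group_id` (for example a unit serial number or a near-duplicate cluster id) and a `timestamp`; neither field is listed in the dataset description, so both names are hypothetical.

```python
from collections import defaultdict

def group_time_split(records, train_frac=0.70, val_frac=0.15):
    """Split so that no group spans two partitions and time flows train -> val -> test.

    Each record is a dict with the hypothetical keys 'group_id' and 'timestamp'.
    Returns a dict mapping 'train' / 'val' / 'test' to lists of record indices.
    """
    # Collect record indices per group and remember each group's earliest timestamp.
    members = defaultdict(list)
    first_seen = {}
    for idx, rec in enumerate(records):
        gid, ts = rec["group_id"], rec["timestamp"]
        members[gid].append(idx)
        if gid not in first_seen or ts < first_seen[gid]:
            first_seen[gid] = ts

    # Order whole groups chronologically, then cut by cumulative image count.
    ordered = sorted(members, key=lambda g: first_seen[g])
    total = len(records)
    splits = {"train": [], "val": [], "test": []}
    seen = 0
    for gid in ordered:
        frac = seen / total
        if frac < train_frac:
            name = "train"
        elif frac < train_frac + val_frac:
            name = "val"
        else:
            name = "test"
        splits[name].extend(members[gid])
        seen += len(members[gid])
    return splits

# Toy usage: 20 frames, two consecutive frames per product unit.
records = [{"group_id": i // 2, "timestamp": i} for i in range(20)]
splits = group_time_split(records)
```

Keeping every frame of a group in one partition stops near-duplicates from leaking across the split, and placing the most recent groups in the test set makes the held-out score reflect later camera conditions. The 70/15/15 proportions are illustrative only.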

Success Criteria

A production-ready solution should achieve high recall on defective items while keeping false alarms manageable. The minimum acceptable targets are:

Recall on defects >= 0.92 at the selected operating threshold
Precision >= 0.70 at the selected operating threshold
PR-AUC >= 0.85 on a held-out test set
P95 inference latency < 50 ms/image on a single T4 GPU or equivalent
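
The recall, precision, and PR-AUC targets above can be checked from model scores on a validation set. The sketch below is one possible way to do that in plain Python; the function names are illustrative, and it assumes labels are 0/1 with at least one positive and that an image is flagged when its score is at or above the threshold.

```python
def pr_points(y_true, scores):
    """Return (threshold, precision, recall) per distinct score, highest threshold first."""
    total_pos = sum(y_true)
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points, tp, fp = [], 0, 0
    for i, (score, label) in enumerate(pairs):
        tp += label
        fp += 1 - label
        # Emit a point only at the last element of a run of tied scores.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            points.append((score, tp / (tp + fp), tp / total_pos))
    return points

def average_precision(points):
    """Step-wise PR-AUC: sum over thresholds of (R_k - R_{k-1}) * P_k."""
    ap, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

def pick_operating_point(points, min_recall=0.92, min_precision=0.70):
    """Among thresholds meeting both targets, return the one with the highest recall."""
    ok = [p for p in points if p[2] >= min_recall and p[1] >= min_precision]
    return max(ok, key=lambda p: (p[2], p[1])) if ok else None

# Toy usage with made-up scores (not results from the dataset).
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
points = pr_points(y_true, scores)
ap = average_precision(points)
chosen = pick_operating_point(points)  # (threshold, precision, recall) or None
```

Preferring the highest-recall threshold among those that qualify is a design choice that follows from false negatives being the costlier error. The threshold should be chosen on validation data and then reported once on the held-out test set.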

Constraints

The model will be used in near-real-time on the factory line
False negatives are more costly than false positives
The quality team needs image-level explanations for flagged defects
Retraining budget is limited to a weekly batch job
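
Because the model runs in near-real-time, tail latency matters more than the mean. Below is a minimal sketch of measuring P95 latency per image against the 50 ms target. `predict_fn` is a hypothetical callable standing in for the deployed model, and the nearest-rank percentile is one common definition among several.

```python
import math
import time

def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def p95_latency_ms(predict_fn, inputs, warmup=10):
    """Time predict_fn on each input and return the 95th-percentile latency in ms."""
    # Warm-up calls are excluded so one-time initialisation does not skew the tail.
    for x in inputs[:warmup]:
        predict_fn(x)
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict_fn(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return percentile(timings, 95)

# Toy usage with a stand-in function instead of a real model.
latency = p95_latency_ms(lambda x: x * 2, list(range(200)))
meets_target = latency < 50.0
```

On a GPU, `predict_fn` should block until the output is actually available, since many frameworks launch GPU work asynchronously and would otherwise report only the launch time.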

Deliverables

Build and optimize a computer vision model for defect detection
Define a leakage-safe train/validation/test split strategy
Explain how you handle class imbalance, augmentation, and threshold tuning
Report evaluation metrics and justify the chosen operating point
Propose deployment and monitoring steps for production use
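
For the class-imbalance deliverable, the stated 8% / 92% balance over 120K images already fixes the rough class counts. The short calculation below works them out along with the negative-to-positive ratio, which is one common starting value for a positive-class loss weight. It assumes the 8% rate holds in the training split, which should be re-checked after splitting.

```python
def class_counts_and_weight(n_total, positive_rate):
    """Return (n_pos, n_neg, pos_weight) where pos_weight = n_neg / n_pos."""
    n_pos = round(n_total * positive_rate)
    n_neg = n_total - n_pos
    # With this weight, positives and negatives contribute equally to the loss in aggregate.
    pos_weight = n_neg / n_pos
    return n_pos, n_neg, pos_weight

n_pos, n_neg, pos_weight = class_counts_and_weight(120_000, 0.08)
print(n_pos, n_neg, pos_weight)  # 9600 110400 11.5
```

Reweighting changes the scale of the predicted scores, so the operating threshold should be tuned after the weighting scheme is fixed, not before.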
