Transfer Learning for Defect Detection

Business Context

You’re interviewing with VoltForge, a manufacturer of EV battery modules supplying two major automakers. A missed defect can trigger costly recalls and safety incidents, while too many false alarms slow the production line and increase scrap. VoltForge operates 12 assembly lines, producing ~1.8M modules/month. Each module is photographed at multiple stations; the goal is to flag surface defects (micro-cracks, contamination, misaligned welds) from images.

The challenge: VoltForge has millions of labeled images for an older product line (Gen-2), but only a small labeled dataset for the new line (Gen-3) due to a recent tooling change and a new camera sensor. You must propose and implement a transfer learning approach that achieves strong performance quickly, and explain why it’s better than training from scratch.

Dataset

You are given two datasets:

Dataset	Size	Labeling	Classes	Notes
Gen-2 (source)	4.2M images	High quality, mature labels	5 defect types + “no defect”	Older lighting/camera, stable process
Gen-3 (target)	38K images	Limited labels, some noise	Same label set	New sensor + different reflections

Additional details:

Image resolution: 256×256 RGB crops centered on weld seams.
Target variable: Multi-class classification (6 classes).
Class balance (Gen-3): Imbalanced — ~92% “no defect”; rarest defect ~0.3%.
Missing/corrupt data: ~1–2% of image files are corrupted; ~8% are partially obscured by motion blur.
Domain shift: Gen-3 images have different color temperature and specular highlights.
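
To make the imbalance concrete, the quoted percentages can be turned into approximate image counts. This is only a rough sketch: it assumes the 38K figure is the full labeled Gen-3 set and that the stated shares apply to it directly. The variable names are illustrative and not part of the provided data.

```python
# Back-of-the-envelope counts from the figures quoted above (all approximate).
GEN3_IMAGES = 38_000
NO_DEFECT_SHARE = 0.92            # ~92% "no defect"
RAREST_DEFECT_SHARE = 0.003       # rarest defect ~0.3%
CORRUPT_SHARE_RANGE = (0.01, 0.02)  # ~1-2% corrupted files

no_defect = round(GEN3_IMAGES * NO_DEFECT_SHARE)
all_defects = GEN3_IMAGES - no_defect            # the five defect classes combined
rarest = round(GEN3_IMAGES * RAREST_DEFECT_SHARE)
corrupt_low, corrupt_high = (round(GEN3_IMAGES * s) for s in CORRUPT_SHARE_RANGE)

print(f"no defect:     ~{no_defect}")
print(f"all defects:   ~{all_defects}")
print(f"rarest defect: ~{rarest}")
print(f"corrupt files: ~{corrupt_low} to ~{corrupt_high}")
```

Under these assumptions the rarest class has on the order of a hundred labeled examples, so any validation or test split will contain only a small number of them, and per-class metrics on that class will be noisy.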

Success Criteria

Your solution is considered successful if on a held-out Gen-3 test set it achieves:

Macro F1 ≥ 0.70 across the 6 classes (to avoid ignoring rare defects)
Recall ≥ 0.85 on the union of defect classes (safety-driven)
False positive rate ≤ 3% on “no defect” images, i.e. at most 3% of defect-free images flagged as defective (line throughput constraint)
Inference latency ≤ 30 ms/image on an NVIDIA T4 (batch size 32)
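
The first three criteria can all be read off a single confusion matrix. The sketch below is one possible interpretation, not an official scoring script: it assumes class index 0 is “no defect”, and it treats a defective image as detected if it is predicted as any defect class (even the wrong one). The function names and the 3-class toy matrix are hypothetical.

```python
NO_DEFECT = 0  # assumed index of the "no defect" class

def macro_f1(cm):
    """Unweighted mean of per-class F1. cm[i][j] = true class i, predicted class j."""
    n = len(cm)
    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[i][k] for i in range(n)) - tp
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n

def defect_recall(cm, no_defect=NO_DEFECT):
    """Share of truly defective images predicted as any defect class."""
    defective = 0
    caught = 0
    for true_cls, row in enumerate(cm):
        if true_cls == no_defect:
            continue
        defective += sum(row)
        caught += sum(row) - row[no_defect]
    return caught / defective if defective else 0.0

def false_positive_rate(cm, no_defect=NO_DEFECT):
    """Share of truly defect-free images predicted as some defect."""
    row = cm[no_defect]
    total = sum(row)
    return (total - row[no_defect]) / total if total else 0.0

# Toy 3-class matrix (made-up numbers, for illustration only).
toy_cm = [
    [970, 20, 10],  # true "no defect"
    [5, 40, 5],     # true defect A
    [2, 3, 15],     # true defect B
]
print(macro_f1(toy_cm), defect_recall(toy_cm), false_positive_rate(toy_cm))
```

Collapsing the five defect classes into one for the recall check is a design choice; if a wrong defect label should count as a miss, the `caught` term would use only the diagonal entries instead.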

Constraints

You have one 16GB GPU for training and a 48-hour experimentation window.
Model must be deployable in an on-prem environment; avoid overly heavy architectures.
You must be able to explain to manufacturing engineers what transfer learning is doing (high-level interpretability), and how you’ll monitor drift.
You cannot use Gen-3 test labels for any tuning.

Deliverables

Explain transfer learning in this context (what is transferred, why it helps) and compare:
- training from scratch on Gen-3
- using a pretrained ImageNet backbone
- pretraining on Gen-2 then fine-tuning on Gen-3
Propose a training plan (freezing strategy, learning rates, augmentation, class imbalance handling).
Provide a PyTorch implementation that trains a transfer learning model and reports the required metrics.
Describe an evaluation protocol (splits/CV, thresholding, calibration if needed) that avoids leakage.
Outline a production plan: deployment format, monitoring for domain shift, and retraining triggers.
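
As an illustration of the thresholding item above, the sketch below picks an operating threshold on a validation split only, so that the test labels stay untouched. It assumes the model yields a per-image defect score (for example 1 minus the predicted “no defect” probability) and that an image is flagged when its score is at or above the threshold. The function name, the toy scores, and the toy labels are hypothetical.

```python
def pick_threshold(scores, is_defect, max_fpr=0.03):
    """Return (threshold, fpr, recall) for the lowest threshold within the FPR budget.

    scores[i]    : defect score for validation image i (higher = more suspicious)
    is_defect[i] : True if image i carries any defect label
    Returns None if no candidate threshold satisfies the budget.
    """
    n_neg = sum(1 for d in is_defect if not d)
    n_pos = len(is_defect) - n_neg
    # Candidates in ascending order: FPR can only fall as the threshold rises,
    # so the first feasible candidate also gives the highest recall.
    for t in sorted(set(scores)):
        fp = sum(1 for s, d in zip(scores, is_defect) if s >= t and not d)
        tp = sum(1 for s, d in zip(scores, is_defect) if s >= t and d)
        fpr = fp / n_neg if n_neg else 0.0
        recall = tp / n_pos if n_pos else 0.0
        if fpr <= max_fpr:
            return t, fpr, recall
    return None

# Toy validation data (made up). A loose budget is used because the toy set
# has only five clean images; the task's budget would be max_fpr=0.03.
val_scores = [0.05, 0.10, 0.20, 0.30, 0.65, 0.40, 0.60, 0.70, 0.90, 0.95]
val_is_defect = [False, False, False, False, False, True, True, True, True, True]
print(pick_threshold(val_scores, val_is_defect, max_fpr=0.25))
```

The chosen threshold is then frozen and applied once to the held-out Gen-3 test set. The quadratic loop is fine for a sketch; a sorted single pass would be preferable on a full validation split.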
