Predict Equipment Failure for Maintenance

Business Context

VoltForge Manufacturing operates 1,200 CNC machines across 8 plants and loses significant revenue from unplanned downtime. The operations team wants a predictive maintenance model that flags machines likely to fail within the next 7 days so maintenance can be scheduled proactively.

Dataset

You are given machine-level daily records collected over 24 months.

Feature Group	Count	Examples
Sensor statistics	18	vibration_mean, motor_temp_max, pressure_std, spindle_current_p95
Usage and load	9	runtime_hours, cycles_completed, load_factor, shift_count
Maintenance history	7	days_since_last_service, parts_replaced_30d, prior_failures_90d
Machine metadata	6	machine_type, plant_id, manufacturer, install_age_days
Environmental	5	ambient_temp, humidity, dust_index, coolant_quality

Size: 420K machine-days, 45 features
Target: Binary label indicating whether the machine experiences a failure event in the next 7 days
Class balance: 4.6% positive, 95.4% negative
Missing data: ~12% missing in some sensor feeds due to telemetry dropouts; 6% missing in maintenance logs for older plants
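
For a sense of scale, the stated size and class balance can be turned into absolute counts. The sketch below is only arithmetic on the figures above; it assumes "420K" means exactly 420,000 rows, and the function name is illustrative.

```python
# Figures from the dataset description; treating 420K as exactly 420,000 is an assumption.
N_ROWS = 420_000
POSITIVE_RATE = 0.046


def class_counts(n_rows: int, positive_rate: float) -> tuple[int, int]:
    """Return (positives, negatives) implied by a row count and a positive rate."""
    positives = round(n_rows * positive_rate)
    return positives, n_rows - positives


positives, negatives = class_counts(N_ROWS, POSITIVE_RATE)
# Negatives per positive, a common starting point for class weighting.
neg_to_pos_ratio = negatives / positives

print(positives, negatives, round(neg_to_pos_ratio, 1))
```

With roughly 20 negatives per positive, a detector that flags machines at random would have an expected precision equal to the positive rate (about 0.046), which is the baseline to compare any precision figure against.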

Success Criteria

A good solution should achieve strong early-warning performance, meaning PR-AUC above 0.45 and recall above 75% at precision above 35%, and should provide interpretable drivers so plant engineers can trust the alerts.
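
One way to check these targets on a held-out set is sketched below in plain Python. It assumes PR-AUC is measured as average precision (the step-wise sum over the precision-recall curve), which is one common convention and not something the brief specifies; the function names and the toy labels and scores are illustrative only.

```python
def pr_points(y_true, scores):
    """Return (precision, recall, threshold) at each distinct score, highest first."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    total_pos = sum(y_true)
    points, tp, fp, i = [], 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every row tied at this score so ties share one operating point.
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        points.append((tp / (tp + fp), tp / total_pos, threshold))
    return points


def average_precision(points):
    """Sum of precision weighted by the recall gained at each threshold."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall, _ in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def best_recall_above_precision(points, min_precision):
    """Highest recall (and its threshold) among points with precision above the floor."""
    eligible = [(r, t) for p, r, t in points if p > min_precision]
    return max(eligible) if eligible else (0.0, None)


def meets_criteria(y_true, scores):
    points = pr_points(y_true, scores)
    recall, _ = best_recall_above_precision(points, 0.35)
    return average_precision(points) > 0.45 and recall > 0.75


# Toy data for illustration only, not drawn from the VoltForge dataset.
y = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
pts = pr_points(y, s)
ap = average_precision(pts)
recall, threshold = best_recall_above_precision(pts, 0.35)
```

On real data the same check should be run on a time-ordered holdout, since a random split would let later days of the same machine leak into training.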

Constraints

Predictions run as a daily batch job before the morning maintenance planning meeting
Scoring all machines must complete in under 10 minutes
The solution must be reasonably interpretable for reliability engineers
False positives are acceptable up to a point, but excessive alerts will overwhelm maintenance crews
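
To make the alert-volume and runtime constraints concrete, the following back-of-the-envelope sketch estimates daily alert counts for a model operating exactly at the recall and precision floors from the success criteria. It assumes the 4.6% machine-day positive rate applies to a typical day across all 1,200 machines, which the brief does not state, and the function name is hypothetical.

```python
def daily_alert_load(n_machines, positive_rate, recall, precision):
    """Expected daily alert counts at a given recall/precision operating point."""
    expected_positives = n_machines * positive_rate
    true_alerts = expected_positives * recall
    # precision = true_alerts / total_alerts, so total_alerts = true_alerts / precision
    total_alerts = true_alerts / precision
    return {
        "expected_positives": expected_positives,
        "true_alerts": true_alerts,
        "total_alerts": total_alerts,
        "false_alerts": total_alerts - true_alerts,
    }


load = daily_alert_load(n_machines=1200, positive_rate=0.046, recall=0.75, precision=0.35)

# Average time budget per machine if the 10-minute limit is spread evenly.
seconds_per_machine = 10 * 60 / 1200

print({k: round(v, 1) for k, v in load.items()}, seconds_per_machine)
```

Under these assumptions the floors correspond to roughly 118 alerts per day, about 77 of them false, which is the kind of figure to review with maintenance crews when choosing the final threshold.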

Deliverables

Build a binary classification model to predict failures within 7 days.
Explain model choice, preprocessing, and how you handle missing data and class imbalance.
Create useful engineered features from sensor, usage, and maintenance history.
Evaluate the model using metrics appropriate for imbalanced classification.
Recommend a decision threshold and describe how the model would be deployed and monitored in production.
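
As one illustration of the feature-engineering and missing-data deliverables, the sketch below computes a trailing-window mean for a single machine's daily sensor readings together with a missing-value flag. The function name, the 7-day default window, and the choice to skip missing readings are assumptions for illustration, not requirements from the brief.

```python
from collections import deque
from typing import Optional


def trailing_mean_features(readings: list[Optional[float]], window: int = 7):
    """For one machine's readings in date order, return (trailing_mean, is_missing) per day.

    The mean uses only the current and earlier days, so no future information
    leaks into the feature. Missing readings (None) are skipped; the mean is
    None when every reading in the window is missing.
    """
    recent: deque = deque(maxlen=window)
    features = []
    for value in readings:
        recent.append(value)
        observed = [v for v in recent if v is not None]
        mean = sum(observed) / len(observed) if observed else None
        features.append((mean, value is None))
    return features


# Toy vibration_mean series with one telemetry dropout (illustrative values).
example = trailing_mean_features([1.0, None, 3.0, 5.0], window=2)
print(example)
```

Keeping the missing flag as its own feature lets the model learn whether telemetry dropouts themselves carry signal, instead of hiding them behind an imputed value.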
