NorthForge Manufacturing operates 1,200 CNC machines across 14 plants and wants to predict equipment failure 24 hours in advance so maintenance can intervene before unplanned downtime. Failures are rare but expensive, making this a classic imbalanced binary classification problem.
You are given a historical machine telemetry dataset collected at hourly resolution over 18 months. Its 48 features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Sensor readings | 18 | temperature_mean, vibration_rms, pressure_std, spindle_current |
| Usage / load | 9 | operating_hours, load_pct, cycle_count_24h, idle_ratio |
| Maintenance history | 6 | days_since_last_service, prior_failures_90d, component_replaced |
| Machine metadata | 5 | machine_type, plant_id, manufacturer, install_age_days |
| Derived temporal features | 10 | rolling_mean_6h, rolling_std_24h, trend_slope_12h |
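
The brief does not say how the derived temporal features were computed. As a rough sketch, assuming they are trailing-window statistics over the hourly sensor readings (so that no future information leaks into a row), features like rolling_mean_6h and trend_slope_12h could be built as below. The function names, the toy vibration_rms series, and the handling of partial windows at the start of a series are illustrative assumptions, not part of the dataset specification.

```python
from statistics import mean
from typing import List


def rolling_mean(values: List[float], window: int) -> List[float]:
    """Mean of the trailing `window` readings at each hour (partial at the start)."""
    return [mean(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def trend_slope(values: List[float], window: int) -> List[float]:
    """Least-squares slope per hour over the trailing `window` readings."""
    slopes = []
    for i in range(len(values)):
        ys = values[max(0, i - window + 1): i + 1]
        n = len(ys)
        if n < 2:
            slopes.append(0.0)  # assumption: no trend is defined for a single point
            continue
        x_mean = (n - 1) / 2
        y_mean = mean(ys)
        num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
        den = sum((x - x_mean) ** 2 for x in range(n))
        slopes.append(num / den)
    return slopes


# Hypothetical hourly vibration_rms readings for one machine, oldest first.
vibration_rms = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
rolling_mean_6h = rolling_mean(vibration_rms, window=6)
trend_slope_12h = trend_slope(vibration_rms, window=12)
print(rolling_mean_6h[-1])  # mean of the last six readings
print(trend_slope_12h[-1])  # slope in sensor units per hour
```

Each value uses only the current and earlier hours, which matters here because the label looks 24 hours ahead. A real pipeline would compute these per machine and might emit missing values until a window is full.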

Target: failure_24h, a binary label indicating whether the machine fails in the next 24 hours.

A strong solution should improve substantially over the majority-class baseline and achieve high recall on true failures while keeping false alarms low enough for plant maintenance teams to act on alerts. Aim for recall >= 0.75 with precision >= 0.20, plus strong ranking quality on rare events.
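
To make the targets concrete, here is a minimal sketch of how a scored model might be checked against them. It assumes the model outputs a failure score per machine-hour and that an alert fires when the score meets a threshold. The helper names and the toy labels and scores are hypothetical, and average precision is used only as one possible measure of ranking quality, since the brief does not name a specific metric. The majority-class baseline never alerts, so its recall on failures is 0 however high its accuracy.

```python
from typing import List, Optional, Tuple


def precision_recall_at(y_true: List[int], scores: List[float],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall when alerting on scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(y_true: List[int], scores: List[float],
                   min_recall: float = 0.75,
                   min_precision: float = 0.20) -> Optional[Tuple[float, float, float]]:
    """Highest threshold meeting both targets, as (threshold, precision, recall)."""
    for t in sorted(set(scores), reverse=True):
        p, r = precision_recall_at(y_true, scores, t)
        if r >= min_recall and p >= min_precision:
            return t, p, r
    return None  # no threshold satisfies both targets


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Mean of precision at the rank of each true failure (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical labels (1 = failure within 24 hours) and model scores.
y_true = [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0]
scores = [0.05, 0.10, 0.90, 0.20, 0.60, 0.70, 0.15, 0.40, 0.30, 0.08, 0.02, 0.12]

print(pick_threshold(y_true, scores))                 # (0.4, 0.75, 0.75)
print(f"{average_precision(y_true, scores):.4f}")     # 0.7875
print(precision_recall_at(y_true, scores, float("inf")))  # never alerting: (0.0, 0.0)
```

Choosing the highest qualifying threshold keeps the alert volume as low as the targets allow. In practice the threshold should be selected on a validation split and confirmed on later, held-out data.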