RouteOps manages 18,000 delivery vehicles across North America. The fleet operations team wants a model that predicts next-month maintenance cost per vehicle, together with a justification of whether a Random Forest is a better choice than Linear Regression for this problem.
The training data contains monthly vehicle-level records aggregated from telematics, maintenance systems, and driver logs.
| Feature Group | Count | Examples |
|---|---|---|
| Vehicle attributes | 8 | vehicle_age_years, make, model, fuel_type, odometer_km |
| Usage patterns | 10 | avg_daily_km, idle_minutes_per_day, harsh_brake_events, route_variability |
| Maintenance history | 7 | repairs_last_90d, days_since_last_service, warranty_flag |
| Environment | 5 | avg_payload_kg, urban_route_pct, avg_temp_c, road_quality_score |
| Driver behavior | 4 | speeding_events, driver_tenure_months, safety_score, night_driving_pct |

A good solution should achieve an MAE at least 15% lower than that of a simple linear baseline and explain when nonlinear models are justified. A production-ready answer should also identify the most important cost drivers for fleet managers.
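
The 15% criterion can be made concrete with a small check. The sketch below assumes the target means a relative reduction in MAE, computed as (baseline MAE - candidate MAE) / baseline MAE on the same held-out records. The function names and the cost values are hypothetical and chosen only for illustration; they are not RouteOps data.

```python
from statistics import fmean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_mae_reduction(baseline_mae, candidate_mae):
    """Fraction by which the candidate lowers MAE relative to the baseline."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - candidate_mae) / baseline_mae


def meets_target(baseline_mae, candidate_mae, target=0.15):
    """True if the candidate reduces MAE by at least `target` (assumed 15%)."""
    return relative_mae_reduction(baseline_mae, candidate_mae) >= target


# Hypothetical next-month maintenance costs for four vehicles (made-up values).
actual = [100.0, 200.0, 300.0, 400.0]
linear_pred = [140.0, 160.0, 340.0, 360.0]  # stand-in for the linear baseline
forest_pred = [130.0, 170.0, 330.0, 370.0]  # stand-in for the Random Forest

baseline = mae(actual, linear_pred)
candidate = mae(actual, forest_pred)
print(f"baseline MAE={baseline:.1f}, candidate MAE={candidate:.1f}")
print(f"reduction={relative_mae_reduction(baseline, candidate):.0%}, "
      f"meets 15% target: {meets_target(baseline, candidate)}")
```

In practice both MAE values would come from the same validation split, ideally one that holds out later months so the comparison reflects next-month prediction rather than in-sample fit.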