Avenue Code’s delivery operations team wants a lightweight model to predict whether a client delivery ticket will miss its promised SLA. The goal is not only to train a classifier but also to demonstrate strong feature engineering on messy operational data that includes timestamps, categorical fields, and partially missing inputs.
You are given a historical dataset of delivery tickets exported from an Avenue Code internal operations workflow. Each row represents one ticket at the moment it was assigned to a delivery squad.
| Feature Group | Count | Examples |
|---|---|---|
| Numeric operational metrics | 12 | estimated_hours, prior_revisions, team_load, client_tenure_months |
| Categorical attributes | 9 | region, service_line, priority, squad_id, client_segment |
| Temporal fields | 6 | created_at, assigned_at, due_at, day_of_week, hour_of_day |
| Text-derived flags | 4 | has_urgent_keyword, request_length, has_attachment, contains_change_request |
| Target | 1 | sla_missed |
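
As one illustration of the kind of feature engineering the temporal fields invite, the sketch below derives a few interval features from `created_at`, `assigned_at`, `due_at`, and `estimated_hours`. It rests on several assumptions that the dataset description does not confirm: timestamps are ISO 8601 strings that are either all timezone-aware or all naive, missing values arrive as `None` or empty strings, and the derived names (`assignment_lag_hours`, `hours_to_due`, `slack_ratio`, `due_at_missing`) are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp; return None for missing or malformed input."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except (ValueError, TypeError):
        return None


def hours_between(start: Optional[datetime], end: Optional[datetime]) -> Optional[float]:
    """Elapsed hours from start to end, or None if either endpoint is missing."""
    if start is None or end is None:
        return None
    return (end - start).total_seconds() / 3600.0


def derive_features(ticket: dict) -> dict:
    """Build interval features for one ticket (feature names are hypothetical)."""
    created = parse_ts(ticket.get("created_at"))
    assigned = parse_ts(ticket.get("assigned_at"))
    due = parse_ts(ticket.get("due_at"))
    estimated = ticket.get("estimated_hours")

    hours_to_due = hours_between(assigned, due)

    # Ratio of available time to estimated effort; left as None when the
    # estimate is missing or zero, so imputation can be handled downstream.
    slack_ratio = None
    if hours_to_due is not None and estimated:
        slack_ratio = hours_to_due / estimated

    return {
        "assignment_lag_hours": hours_between(created, assigned),
        "hours_to_due": hours_to_due,
        "slack_ratio": slack_ratio,
        # Explicit missingness flag, since the inputs are partially missing.
        "due_at_missing": int(due is None),
    }


# Made-up ticket, for illustration only.
example = {
    "created_at": "2024-03-04T09:00:00",
    "assigned_at": "2024-03-04T11:30:00",
    "due_at": "2024-03-06T11:30:00",
    "estimated_hours": 16,
}
features = derive_features(example)
```

Keeping missing values as `None` alongside an explicit flag, rather than imputing inside the feature function, leaves the imputation strategy visible to whoever reviews the pipeline.
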
A good solution should improve materially over a raw-feature baseline by using thoughtful feature engineering. Aim for ROC-AUC >= 0.82 and F1 >= 0.60 on the held-out test set, while keeping the pipeline interpretable enough for operations managers to review the main drivers.
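
To make the two targets concrete, here is a minimal, dependency-free sketch of how they could be checked. ROC-AUC is computed as the probability that a randomly chosen positive ticket is scored above a randomly chosen negative one, and F1 is computed from thresholded predictions. The 0.5 decision threshold and the helper name `meets_targets` are assumptions, not part of the brief, and the labels and scores are made up. In practice, `roc_auc_score` and `f1_score` from `sklearn.metrics` would normally be used instead; the pairwise loop here is quadratic and only suitable for small samples.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC via pairwise comparison of positives and negatives (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (sla_missed == 1); returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def meets_targets(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,   # assumed cut-off; the brief does not fix one
    auc_target: float = 0.82,
    f1_target: float = 0.60,
) -> bool:
    """Check both targets from the brief on a set of held-out predictions."""
    y_pred = [int(s >= threshold) for s in scores]
    return roc_auc(y_true, scores) >= auc_target and f1(y_true, y_pred) >= f1_target


# Made-up labels and model scores, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

auc_value = roc_auc(y_true, scores)
f1_value = f1(y_true, [int(s >= 0.5) for s in scores])
passed = meets_targets(y_true, scores)
```

Because F1 depends on the decision threshold while ROC-AUC does not, it is worth choosing the threshold on validation data rather than on the held-out test set.
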