Engineer Features for Delivery Delay Prediction

Business Context

Avenue Code’s delivery operations team wants a lightweight model to predict whether a client delivery ticket will miss its promised SLA. The goal is not only to train a classifier, but to demonstrate strong feature engineering on messy operational data that includes timestamps, categorical fields, and partially missing inputs.

Dataset

You are given a historical dataset of delivery tickets exported from an Avenue Code internal operations workflow. Each row represents one ticket at the moment it was assigned to a delivery squad.

Feature Group	Count	Examples
Numeric operational metrics	12	estimated_hours, prior_revisions, team_load, client_tenure_months
Categorical attributes	9	region, service_line, priority, squad_id, client_segment
Temporal fields	6	created_at, assigned_at, due_at, day_of_week, hour_of_day
Text-derived flags	4	has_urgent_keyword, request_length, has_attachment, contains_change_request
Target	1	sla_missed

Size: 48K tickets over 18 months, 31 raw features
Target: Binary classification — whether the ticket missed SLA
Class balance: 21% positive, 79% negative
Missing data: 18% missing in estimated_hours, 11% in client_tenure_months, 7% in text-derived fields
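
With 18% of estimated_hours missing, the fact that a value is absent may itself carry signal. One common option is to impute with a statistic learned on the training split only and keep an explicit missing flag. The sketch below is a minimal illustration using only the standard library; the helper names and the `_missing` suffix are hypothetical, and the choice of median imputation is an assumption, not a requirement of the task.

```python
from statistics import median

def fit_median(rows, column):
    """Learn the median of a column from training rows, ignoring missing values."""
    observed = [r[column] for r in rows if r.get(column) is not None]
    return median(observed)

def impute_with_flag(row, column, fill_value):
    """Return a copy of row with the column filled and a <column>_missing flag added."""
    out = dict(row)
    is_missing = row.get(column) is None
    out[f"{column}_missing"] = int(is_missing)
    if is_missing:
        out[column] = fill_value
    return out

# Toy training rows (illustrative values, not taken from the dataset).
train_rows = [
    {"estimated_hours": 4.0},
    {"estimated_hours": None},
    {"estimated_hours": 10.0},
    {"estimated_hours": 6.0},
]

# Fit on training data only, then reuse the same fill value at scoring time.
fill = fit_median(train_rows, "estimated_hours")
imputed = [impute_with_flag(r, "estimated_hours", fill) for r in train_rows]
```

Storing the fitted fill value alongside the model keeps training and batch scoring consistent.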

Success Criteria

A good solution should improve materially over a raw-feature baseline by using thoughtful feature engineering. Aim for ROC-AUC >= 0.82 and F1 >= 0.60 on the held-out test set, while keeping the pipeline interpretable enough for operations managers to review the main drivers.
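
To make the two targets concrete, the sketch below computes ROC-AUC as the probability that a positive ticket outscores a negative one, and picks the decision threshold that maximizes F1. It is a small pure-Python illustration with made-up labels and scores; the function names are hypothetical, and in practice the threshold would be chosen on a validation split and only then applied to the held-out test set.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_at(y_true, scores, threshold):
    """F1 for the positive class when predicting 1 for scores >= threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_f1_threshold(y_true, scores):
    """Try every distinct score as a threshold and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(y_true, scores, t))

# Illustrative validation labels (1 = sla_missed) and model scores.
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
s_val = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]

auc = roc_auc(y_val, s_val)                # 15 of 16 pairs ranked correctly
threshold = best_f1_threshold(y_val, s_val)
f1 = f1_at(y_val, s_val, threshold)
```

The pairwise AUC here is quadratic in the number of rows, which is fine for a worked example but not for 48K tickets; a library implementation would be the usual choice at that size.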

Constraints

Batch scoring only; inference should complete in under 5 minutes for 10K daily tickets
Avoid leakage from future timestamps or post-assignment information
Prefer features that can be recomputed reliably in production
Keep the solution simple enough for a small ML platform team to maintain
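
The leakage constraint can be illustrated with temporal features that use only created_at, assigned_at, and due_at, all of which are known at assignment time, together with a chronological train/test split. This is a hedged sketch: it assumes the timestamps are ISO 8601 strings, and the derived feature names and helper functions are hypothetical.

```python
from datetime import datetime

def temporal_features(ticket):
    """Derive features available at assignment time (no post-assignment fields)."""
    created = datetime.fromisoformat(ticket["created_at"])
    assigned = datetime.fromisoformat(ticket["assigned_at"])
    due = datetime.fromisoformat(ticket["due_at"])
    return {
        # How long the ticket waited before a squad picked it up.
        "assignment_lag_hours": (assigned - created).total_seconds() / 3600,
        # How much time the squad has left to meet the SLA.
        "hours_until_due": (due - assigned).total_seconds() / 3600,
        # Monday is 0, so 5 and 6 are Saturday and Sunday.
        "assigned_on_weekend": int(assigned.weekday() >= 5),
    }

def chronological_split(tickets, test_fraction=0.2):
    """Hold out the most recent tickets so the model never trains on the future."""
    ordered = sorted(tickets, key=lambda t: datetime.fromisoformat(t["assigned_at"]))
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]

# Illustrative ticket, not taken from the dataset.
example = {
    "created_at": "2024-03-01T09:00:00",
    "assigned_at": "2024-03-01T15:00:00",
    "due_at": "2024-03-04T15:00:00",
}
features = temporal_features(example)

history = [
    {"id": "c", "assigned_at": "2024-03-03T10:00:00"},
    {"id": "a", "assigned_at": "2024-03-01T10:00:00"},
    {"id": "d", "assigned_at": "2024-03-04T10:00:00"},
    {"id": "b", "assigned_at": "2024-03-02T10:00:00"},
]
train, test = chronological_split(history, test_fraction=0.25)
```

A random split would mix later tickets into training, which tends to overstate performance for any feature that drifts over the 18-month window.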

Deliverables

Build a reproducible feature engineering pipeline for numeric, categorical, and temporal data.
Train at least one baseline model and one improved model using engineered features.
Explain which engineered features are most useful and why.
Evaluate the model with appropriate classification metrics and threshold selection.
Identify leakage risks and productionization considerations for the feature pipeline.
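
For the categorical part of the pipeline, one possible engineered feature is a smoothed target encoding of a higher-cardinality field such as squad_id. The sketch below is an illustration under assumptions, not a prescribed method: the function names and the smoothing value are hypothetical, and the mapping must be fit on training rows only so that test labels never leak into the feature.

```python
from collections import defaultdict

def fit_target_encoding(rows, column, target, smoothing=20.0):
    """Learn a smoothed mean of the target per category from training rows."""
    prior = sum(r[target] for r in rows) / len(rows)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        sums[r[column]] += r[target]
        counts[r[column]] += 1
    # Shrink each category mean toward the global rate; rare categories shrink more.
    mapping = {
        key: (sums[key] + smoothing * prior) / (counts[key] + smoothing)
        for key in counts
    }
    return mapping, prior

def apply_target_encoding(row, column, mapping, prior):
    """Encode one row; categories unseen in training fall back to the global rate."""
    return mapping.get(row[column], prior)

# Toy training rows (illustrative values, not taken from the dataset).
train_rows = [
    {"squad_id": "A", "sla_missed": 1},
    {"squad_id": "A", "sla_missed": 1},
    {"squad_id": "B", "sla_missed": 0},
    {"squad_id": "B", "sla_missed": 0},
]
mapping, prior = fit_target_encoding(train_rows, "squad_id", "sla_missed", smoothing=2.0)

seen = apply_target_encoding({"squad_id": "A"}, "squad_id", mapping, prior)
unseen = apply_target_encoding({"squad_id": "Z"}, "squad_id", mapping, prior)
```

The encoded value stays easy to explain to operations managers: it is roughly the historical miss rate for that squad, pulled toward the overall rate when the squad has few tickets.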
