Business Context
HelpHive, a SaaS customer support platform handling roughly 1.8 million tickets per year, wants to automatically classify incoming support tickets into issue categories such as billing, login, bug report, feature request, cancellation, and account security. The current rule-based system performs poorly on rare but high-priority classes, especially account security and cancellation.
Dataset
You are given a historical ticket dataset built from the first message in each support conversation.
| Feature Group | Count | Examples |
|---|---|---|
| Text features | 1 raw field | subject + first_message |
| Numerical metadata | 11 | customer_tenure_days, prior_ticket_count_90d, sentiment_score, message_length |
| Categorical metadata | 7 | plan_tier, language, channel, region, device_type |
| Temporal features | 4 | hour_of_day, day_of_week, days_since_last_ticket, month |
- Size: 420K labeled tickets, 23 engineered non-text features plus raw text
- Target: Multiclass ticket category (8 classes)
- Class balance: Highly skewed; the top 2 classes account for 74% of tickets, and the smallest class is 1.6%
- Missing data: 12% missing sentiment scores, 7% missing device_type, 3% missing tenure for migrated customers
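The skew and missingness described above are commonly handled with inverse-frequency class weights and median imputation paired with a "was missing" flag. The sketch below is one possible starting point, not a prescribed method. The helper names `balanced_class_weights` and `impute_with_flag` are hypothetical, and the toy values are made up for illustration rather than taken from the HelpHive data.

```python
from collections import Counter
from statistics import median

def balanced_class_weights(labels):
    """Return weight_c = N / (K * n_c) for each class c seen in labels."""
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {cls: n_total / (n_classes * n) for cls, n in counts.items()}

def impute_with_flag(values):
    """Fill None with the median of observed values; also return 0/1 missing flags."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    flags = [int(v is None) for v in values]
    return filled, flags

# Toy inputs (illustrative only, not the real class distribution).
toy_labels = ["billing"] * 6 + ["login"] * 3 + ["account_security"] * 1
weights = balanced_class_weights(toy_labels)

# A toy numeric column with one missing entry, e.g. customer_tenure_days.
filled, flags = impute_with_flag([10, None, 30])
```

Rare classes receive the largest weights, so their errors count for more in a weighted loss. In a real pipeline the median would be computed on the training fold only and reused at inference time, so that validation data does not influence the fill value.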
Success Criteria
A good solution should improve minority-class detection without causing a large drop in overall precision. Target at least 0.72 macro F1, 0.88 weighted F1, and recall above 0.70 for the two rare operationally critical classes: account_security and cancellation.
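To make the three targets concrete, the following sketch computes per-class precision, recall, and F1 from label lists, then derives macro F1 (unweighted mean over classes) and weighted F1 (mean weighted by class support). The function names and the toy labels are assumptions for illustration. The thresholds 0.72, 0.88, and 0.70 come from the criteria above.

```python
def per_class_prf(y_true, y_pred):
    """Return {class: (precision, recall, f1, support)} for every class seen."""
    classes = sorted(set(y_true) | set(y_pred))
    stats = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats[c] = (precision, recall, f1, tp + fn)
    return stats

def macro_f1(stats):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(s[2] for s in stats.values()) / len(stats)

def weighted_f1(stats):
    """Mean of per-class F1 weighted by the number of true examples per class."""
    total = sum(s[3] for s in stats.values())
    return sum(s[2] * s[3] for s in stats.values()) / total

def meets_targets(stats, rare=("account_security", "cancellation")):
    """Check the stated targets; a rare class absent from the data fails the check."""
    rare_ok = all(c in stats and stats[c][1] > 0.70 for c in rare)
    return macro_f1(stats) >= 0.72 and weighted_f1(stats) >= 0.88 and rare_ok

# Toy labels for illustration only.
y_true = ["billing", "billing", "login", "cancellation"]
y_pred = ["billing", "login", "login", "cancellation"]
stats = per_class_prf(y_true, y_pred)
```

Because macro F1 gives each class equal weight, it is the metric most sensitive to failures on the small classes, while weighted F1 mostly tracks performance on the dominant ones.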
Constraints
- Inference must complete in <100 ms per ticket in an online API
- Support operations need class-level explanations and feature importance
- Retraining should be feasible on a weekly cadence with moderate cloud cost
- The solution must avoid leakage from future ticket outcomes or agent actions
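The latency constraint can be checked empirically before deployment. The sketch below times a prediction callable on a batch of tickets and reports the 95th-percentile latency in milliseconds. Measuring p95 rather than the mean or maximum is an assumption, since the constraint only says "per ticket". `predict_fn` stands in for whatever end-to-end function (preprocessing plus model) the API would call, and `dummy_predict` is a placeholder.

```python
import math
import time

def p95_latency_ms(predict_fn, tickets):
    """Time predict_fn on each ticket and return the 95th-percentile latency in ms."""
    timings = []
    for ticket in tickets:
        start = time.perf_counter()
        predict_fn(ticket)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Nearest-rank percentile: the smallest timing with >= 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(timings)))
    return timings[rank - 1]

def dummy_predict(ticket):
    """Placeholder model: always returns the same class."""
    return "billing"

latency = p95_latency_ms(dummy_predict, ["example ticket text"] * 200)
within_budget = latency < 100.0
```

A check like this would ideally run on hardware comparable to the production API and include text vectorization, since preprocessing often dominates inference time for text models.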
Deliverables
- Build a multiclass classification pipeline for skewed support ticket data.
- Explain how you handle class imbalance in both training and evaluation.
- Design preprocessing for text, categorical, and numerical features with missing values.
- Choose a validation strategy and justify it.
- Recommend a deployment-ready thresholding or calibration approach for rare classes.
- Report final metrics, confusion patterns, and the main tradeoffs of your design.
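For the thresholding deliverable, one simple option is to treat each rare class one-vs-rest and pick, on held-out validation data, the highest score threshold that still keeps recall above the 0.70 target. Choosing the highest such threshold limits the precision cost. This is a sketch under the assumption that the model outputs a per-class score (ideally calibrated) for every ticket. The function names and toy scores are hypothetical.

```python
import math

def recall_at(scores, is_positive, threshold):
    """Recall for one class when predicting positive if score >= threshold."""
    positives = [s for s, y in zip(scores, is_positive) if y]
    hits = sum(1 for s in positives if s >= threshold)
    return hits / len(positives) if positives else 0.0

def threshold_for_recall(scores, is_positive, target_recall=0.70):
    """Return the highest threshold whose recall is strictly above target_recall.

    Assumes 0 <= target_recall < 1. Returns None if there are no positives.
    """
    positives = sorted((s for s, y in zip(scores, is_positive) if y), reverse=True)
    if not positives:
        return None
    # Smallest number of positives to capture so that recall exceeds the target.
    needed = min(len(positives), math.floor(target_recall * len(positives)) + 1)
    return positives[needed - 1]

# Toy validation scores for one rare class (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_positive = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

threshold = threshold_for_recall(scores, is_positive)
achieved = recall_at(scores, is_positive, threshold)
```

In this toy case four of the five positives must be captured to exceed 0.70 recall, so the threshold lands on the fourth-highest positive score. A threshold chosen this way should be re-checked on a later time slice, since score distributions can drift between weekly retrains.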