Business Context
ZendeskX routes customer support tickets for a SaaS platform into product, billing, account, and technical queues. The team currently uses an LSTM baseline, but ticket volume has grown and long messages containing multiple issues are frequently misrouted. Your task is to evaluate whether a Transformer-based approach is a better production choice than RNN- or CNN-based text models.
Data
- Volume: 420,000 historical English support tickets
- Text length: 8-1,200 tokens (median: 96)
- Labels: 4 routing classes; mildly imbalanced (technical 41%, billing 24%, account 19%, product 16%)
- Format: Subject + body, with HTML fragments, signatures, URLs, stack traces, and occasional copied chat logs
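Preprocessing for this kind of noisy text often starts with a few regex passes. A minimal sketch is below; `clean_ticket` is an illustrative name, and the signature and stack-trace patterns are assumptions about the ticket format, not a description of the real pipeline.

```python
import re

def clean_ticket(subject: str, body: str) -> str:
    """Normalize a noisy support ticket into plain text (illustrative sketch)."""
    text = f"{subject}\n{body}"
    # Strip HTML fragments left over from rich-text email clients.
    text = re.sub(r"<[^>]+>", " ", text)
    # Mask URLs so they don't explode the vocabulary.
    text = re.sub(r"https?://\S+", " <URL> ", text)
    # Drop common signature lines (assumed patterns; tune to real data).
    text = re.sub(r"(?m)^\s*(--|Best regards|Thanks),?.*$", " ", text)
    # Collapse Python-style stack traces into a single placeholder token.
    text = re.sub(
        r"(?s)Traceback \(most recent call last\).*?(?=\n\S|\Z)",
        " <STACKTRACE> ",
        text,
    )
    # Collapse runs of whitespace introduced by the substitutions above.
    return re.sub(r"\s+", " ", text).strip()
```

Keeping placeholders like `<URL>` and `<STACKTRACE>` instead of deleting the spans preserves a routing signal (stack traces strongly suggest the technical queue) without feeding raw noise to the tokenizer.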
Success Criteria
A strong solution should achieve macro-F1 >= 0.86, improve recall on long tickets (>256 tokens) versus the LSTM baseline, and keep p95 inference latency under 120 ms per ticket in batch scoring.
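Both the macro-F1 target and the long-ticket comparison can be checked with a few lines of plain Python. The helpers below (`per_class_f1` and `f1_by_length_bucket` are illustrative names, not an existing API) use the 256-token cutoff from the criteria:

```python
from collections import defaultdict

def per_class_f1(y_true, y_pred, labels):
    """Return per-class F1 scores and their unweighted mean (macro-F1)."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return scores, sum(scores.values()) / len(labels)

def f1_by_length_bucket(y_true, y_pred, n_tokens, labels, cutoff=256):
    """Report macro-F1 separately for short (<=cutoff) and long (>cutoff) tickets."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, n in zip(y_true, y_pred, n_tokens):
        key = "long" if n > cutoff else "short"
        buckets[key][0].append(t)
        buckets[key][1].append(p)
    return {k: per_class_f1(ts, ps, labels)[1] for k, (ts, ps) in buckets.items()}
```

Macro-F1 (rather than accuracy or micro-F1) is the right headline metric here because the classes are imbalanced: a model that over-predicts the 41% technical class would score well on accuracy while failing the minority queues.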
Constraints
- Deployment target is a single T4 GPU for training and CPU inference in production
- Model artifact should stay under 500 MB
- Explainability is required at a practical level for misrouted tickets
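One practical, model-agnostic way to satisfy the explainability constraint is occlusion: re-score a misrouted ticket with each token removed and rank tokens by the resulting score drop. The sketch below assumes a hypothetical `score_fn` that maps a token list to per-class probabilities; `toy_score` is a stand-in for the real model's prediction function.

```python
def occlusion_importance(tokens, score_fn, target_class):
    """Rank tokens by how much removing each one lowers the score of
    target_class (simple model-agnostic occlusion explanation)."""
    base = score_fn(tokens)[target_class]
    impacts = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]  # drop token i
        impacts.append((tokens[i], base - score_fn(reduced)[target_class]))
    # Largest score drop first: those tokens drove the routing decision.
    return sorted(impacts, key=lambda kv: kv[1], reverse=True)

# Hypothetical keyword scorer standing in for the real model's predict function.
def toy_score(tokens):
    n = len(tokens) or 1
    return {
        "billing": sum(t in {"invoice", "charge"} for t in tokens) / n,
        "technical": sum(t in {"error", "crash"} for t in tokens) / n,
    }
```

For a CPU-deployed model this costs one extra forward pass per token, which is acceptable for offline triage of misrouted tickets even if it would be too slow for the main scoring path.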
Requirements
- Build a multi-class text classification pipeline for ticket routing.
- Compare a Transformer approach against an RNN or CNN baseline and explain the architectural differences in practical NLP terms.
- Implement realistic preprocessing for noisy support text.
- Fine-tune a modern pretrained model in Python and report evaluation metrics by class and by text-length bucket.
- Explain why Transformers handle long-range dependencies differently from RNNs and CNNs, and discuss trade-offs in compute, parallelism, and context modeling.
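The long-range-dependency difference is visible directly in the attention weight matrix: self-attention connects every pair of positions in a single step, whereas an RNN must carry a signal through O(n) sequential state updates and a CNN through stacked local windows. A minimal single-head scaled dot-product attention in NumPy (random, untrained weights, purely illustrative):

```python
import numpy as np

def self_attention(x, rng=None):
    """Single-head scaled dot-product attention over x of shape (n, d).

    Every output row mixes information from every input row in one step;
    the (n, n) weight matrix is computed for all position pairs in parallel.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = x.shape
    # Random projection matrices (a real model would learn these).
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(d)                   # (n, n): all pairs at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v, weights
```

Because every entry of `weights` is strictly positive, token 0 influences token n-1 in a single layer; the trade-off is the O(n^2) score matrix, which is why the 1,200-token tail of this dataset makes truncation or long-context variants a real design decision.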