ZendeskAssist, a SaaS customer support platform, wants to adapt a pre-trained language model to classify incoming support tickets into routing categories. The team needs to decide when supervised fine-tuning is preferable to prompt-based inference or a simpler TF-IDF baseline.
You have 180,000 historical English support tickets, each labeled by an agent with one of 6 classes: Billing, Bug Report, Account Access, Feature Request, Cancellation, and Other. Ticket length ranges from 8 to 900 tokens, with a median of 95 tokens. The classes are moderately imbalanced: Bug Report (31%), Billing (22%), Account Access (18%), Feature Request (14%), Cancellation (9%), Other (6%). About 4% of records have noisy labels because agents tagged tickets inconsistently.
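To make the imbalance concrete, the sketch below converts the stated class shares into approximate ticket counts and then into inverse-frequency class weights. This is only one common way to handle imbalance, not something the brief prescribes. The names `CLASS_SHARE_PCT`, `class_counts`, and `balanced_class_weights` are hypothetical, and the counts assume the percentages apply exactly to all 180,000 tickets.

```python
# Class shares in percent, taken from the brief; they sum to 100.
CLASS_SHARE_PCT = {
    "Bug Report": 31,
    "Billing": 22,
    "Account Access": 18,
    "Feature Request": 14,
    "Cancellation": 9,
    "Other": 6,
}
TOTAL_TICKETS = 180_000


def class_counts(total, share_pct):
    """Approximate per-class ticket counts (assumes the shares are exact)."""
    return {name: total * pct // 100 for name, pct in share_pct.items()}


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * count_c).

    Rare classes get weights above 1, frequent classes below 1.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {name: n_samples / (n_classes * c) for name, c in counts.items()}


counts = class_counts(TOTAL_TICKETS, CLASS_SHARE_PCT)
weights = balanced_class_weights(counts)

for name in CLASS_SHARE_PCT:
    print(f"{name:16s} count={counts[name]:6d} weight={weights[name]:.3f}")
```

Weights like these could be passed to a weighted cross-entropy loss during fine-tuning. Whether that helps or hurts recall on Account Access and Cancellation is an empirical question to check on a validation split.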
A good solution should explain supervised fine-tuning clearly, identify when it is preferable, and implement a modern training pipeline that achieves at least 0.84 macro-F1 across all six classes and at least 0.92 recall on both Account Access and Cancellation tickets. Inference latency should remain below 120 ms per ticket in batch serving.
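The acceptance criteria can be checked with a small evaluation helper. The sketch below computes per-class precision, recall, and F1 in plain Python, averages F1 across the six classes (macro-F1), and tests both thresholds. It assumes the 0.92 recall target applies to Account Access and Cancellation individually. The function names `per_class_scores` and `meets_targets` are hypothetical.

```python
from collections import Counter

CLASSES = ["Billing", "Bug Report", "Account Access",
           "Feature Request", "Cancellation", "Other"]
# Assumption: the recall target applies to each of these classes separately.
CRITICAL = ("Account Access", "Cancellation")


def per_class_scores(y_true, y_pred, classes=CLASSES):
    """Return {class: {"precision", "recall", "f1"}} from label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for c in classes:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def meets_targets(y_true, y_pred, min_macro_f1=0.84, min_recall=0.92):
    """Return (macro_f1, passed) for the thresholds stated in the brief."""
    scores = per_class_scores(y_true, y_pred)
    macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)
    recall_ok = all(scores[c]["recall"] >= min_recall for c in CRITICAL)
    return macro_f1, (macro_f1 >= min_macro_f1 and recall_ok)


# Toy usage: one Cancellation ticket misrouted to Other fails the recall check.
y_true = list(CLASSES)
y_pred = ["Other" if c == "Cancellation" else c for c in CLASSES]
macro_f1, passed = meets_targets(y_true, y_pred)
print(f"macro-F1={macro_f1:.3f} passed={passed}")
```

Because macro-F1 weights every class equally, the 6% Other class counts as much as the 31% Bug Report class, so a model that does well only on frequent classes can still miss the 0.84 target.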