ShopFlow, an e-commerce platform, receives thousands of customer support tickets per day across email and chat. The operations team wants an NLP system that can automatically classify each ticket into the correct queue so agents can respond faster and with fewer manual triage errors.
You have 420,000 historical tickets collected over 18 months. Each ticket contains a subject line and message body. Text is primarily English (96%), with small amounts of Spanish and French. Ticket length ranges from 5 to 900 tokens with a median of 78 tokens. Labels are moderately imbalanced across 6 classes: Order Status (28%), Refund/Return (22%), Payment Issue (14%), Account Access (12%), Shipping Damage (9%), and Other (15%). Historical labels were assigned by agents and contain some noise, especially between Refund/Return and Shipping Damage.
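To make the stated imbalance concrete, the sketch below converts the class shares above into approximate ticket counts and inverse-frequency class weights. The weighting formula, w_c = 1 / (K * p_c), is a common heuristic and only one possible way to handle the imbalance; the brief does not prescribe it. The names CLASS_SHARE, approximate_counts, and balanced_class_weights are illustrative.

```python
# Class shares and total ticket count as stated in the brief.
CLASS_SHARE = {
    "Order Status": 0.28,
    "Refund/Return": 0.22,
    "Payment Issue": 0.14,
    "Account Access": 0.12,
    "Shipping Damage": 0.09,
    "Other": 0.15,
}
TOTAL_TICKETS = 420_000


def approximate_counts(shares, total):
    """Approximate per-class ticket counts implied by the stated shares."""
    return {label: round(share * total) for label, share in shares.items()}


def balanced_class_weights(shares):
    """Inverse-frequency weights: w_c = 1 / (K * p_c), K = number of classes.

    A class holding the uniform share 1/K gets weight 1.0; rarer classes
    get proportionally larger weights. This is an assumed heuristic, not
    a requirement from the brief.
    """
    k = len(shares)
    return {label: 1.0 / (k * share) for label, share in shares.items()}


counts = approximate_counts(CLASS_SHARE, TOTAL_TICKETS)
weights = balanced_class_weights(CLASS_SHARE)
for label in CLASS_SHARE:
    print(f"{label:16s} n~{counts[label]:>7,d}  weight={weights[label]:.2f}")
```

Under this heuristic the rarest class, Shipping Damage, receives the largest weight. Because that class is also one side of the noisiest label boundary, upweighting it may amplify label noise, so any such weights are worth validating on a cleaned sample first.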
A good solution should achieve macro-F1 ≥ 0.84, weighted-F1 ≥ 0.88, and recall ≥ 0.90 for Payment Issue and Account Access because those tickets are time-sensitive. Inference should support near-real-time routing with p95 latency under 150 ms per ticket.
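The targets above can be expressed as a single acceptance check. The following is a minimal sketch in plain Python, under a few assumptions: the recall floor is applied to Payment Issue and Account Access individually, p95 is computed with the nearest-rank method, and macro averaging runs over the labels present in the ground truth. The function names (per_class_scores, p95, meets_targets) are hypothetical.

```python
import math

# Thresholds taken from the brief.
MACRO_F1_MIN = 0.84
WEIGHTED_F1_MIN = 0.88
RECALL_MIN = 0.90
CRITICAL = ("Payment Issue", "Account Access")
P95_LATENCY_MS_MAX = 150.0


def per_class_scores(y_true, y_pred):
    """Return {label: (recall, f1, support)} for each label seen in y_true."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (recall, f1, tp + fn)
    return scores


def p95(values):
    """Nearest-rank 95th percentile (assumed definition of p95)."""
    ordered = sorted(values)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]


def meets_targets(y_true, y_pred, latencies_ms):
    """Return a dict of pass/fail flags, one per target in the brief."""
    scores = per_class_scores(y_true, y_pred)
    total = sum(support for _, _, support in scores.values())
    # Macro-F1: unweighted mean of per-class F1.
    macro_f1 = sum(f1 for _, f1, _ in scores.values()) / len(scores)
    # Weighted-F1: per-class F1 weighted by class support.
    weighted_f1 = sum(f1 * s for _, f1, s in scores.values()) / total
    checks = {
        "macro_f1": macro_f1 >= MACRO_F1_MIN,
        "weighted_f1": weighted_f1 >= WEIGHTED_F1_MIN,
        "p95_latency": p95(latencies_ms) < P95_LATENCY_MS_MAX,
    }
    for label in CRITICAL:
        # A critical class absent from y_true counts as a failed check.
        recall = scores.get(label, (0.0, 0.0, 0))[0]
        checks[f"recall[{label}]"] = recall >= RECALL_MIN
    return checks
```

A candidate model would pass only if every flag in the returned dict is true, for example `all(meets_targets(y_true, y_pred, latencies_ms).values())` on a held-out set. Given the noted label noise, scores computed against agent-assigned labels may understate or overstate true performance.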