Choose Fine-Tuning for Support Routing

Business Context

ZendeskPro wants to automate routing of inbound enterprise support tickets. The team is deciding whether to deploy a prompt-based large language model that relies on in-context learning or to fine-tune a smaller transformer for stable production classification.

Data

You have 420,000 historical English support tickets labeled into 6 routing queues: Billing, Technical Bug, Account Access, Feature Request, Compliance, and Sales. Ticket bodies range from 10 to 900 words (median 120) and often include pasted logs, order IDs, URLs, and product names. The class distribution is moderately imbalanced: Technical Bug 34%, Billing 22%, Account Access 18%, Feature Request 14%, Compliance 7%, Sales 5%. About 3% of labels are noisy because tickets were manually reassigned after first triage.
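To make the imbalance concrete, the stated shares can be turned into approximate per-queue counts and loss weights. The sketch below uses the common inverse-frequency heuristic, 1 / (n_classes * share); the heuristic and the function name `balanced_class_weights` are assumptions for illustration, not something the brief prescribes.

```python
# Class shares and totals as stated in the Data section.
CLASS_SHARE = {
    "Technical Bug": 0.34,
    "Billing": 0.22,
    "Account Access": 0.18,
    "Feature Request": 0.14,
    "Compliance": 0.07,
    "Sales": 0.05,
}
TOTAL_TICKETS = 420_000
NOISE_RATE = 0.03


def balanced_class_weights(shares):
    """Inverse-frequency weights: 1 / (n_classes * share).

    Assumed heuristic; rare queues get weights above 1, common ones below 1.
    """
    n_classes = len(shares)
    return {label: 1.0 / (n_classes * share) for label, share in shares.items()}


weights = balanced_class_weights(CLASS_SHARE)

# Approximate ticket counts per queue implied by the stated percentages.
expected_counts = {
    label: round(TOTAL_TICKETS * share) for label, share in CLASS_SHARE.items()
}

# Approximate number of tickets whose label may be wrong.
expected_noisy = round(TOTAL_TICKETS * NOISE_RATE)

if __name__ == "__main__":
    for label in CLASS_SHARE:
        print(f"{label:16s} n={expected_counts[label]:>7d}  weight={weights[label]:.2f}")
    print("possibly mislabeled:", expected_noisy)
```

Even the smallest queue still has roughly 21,000 examples under these shares, so weighting may matter less here than it would in a few-hundred-example setting; whether to apply it at all is a judgment call to validate on held-out macro-F1.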

Success Criteria

A production-ready solution should achieve at least 0.88 macro-F1, keep p95 inference latency below 150 ms per ticket, and produce consistent predictions across repeated runs. The write-up should also explain when in-context learning is still preferable despite lower consistency.
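The three criteria above can each be checked with a few lines of standard-library Python. This is a minimal sketch: the nearest-rank definition of p95 and the "identical label in every run" definition of consistency are assumptions, since the brief does not fix either, and all function names are hypothetical.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so small queues count as much as large ones."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples scores 0.0 here (assumed convention).
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def p95(latencies_ms):
    """Nearest-rank 95th percentile (assumed definition)."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def prediction_consistency(runs):
    """Fraction of tickets that receive the same label in every repeated run."""
    n_tickets = len(runs[0])
    stable = sum(1 for i in range(n_tickets) if len({run[i] for run in runs}) == 1)
    return stable / n_tickets


def meets_criteria(f1, p95_ms):
    """Thresholds taken from the Success Criteria section."""
    return f1 >= 0.88 and p95_ms < 150
```

A deterministic fine-tuned classifier would be expected to score 1.0 on `prediction_consistency` for identical inputs, while a sampled prompt-based model may not; measuring it on a fixed evaluation set is one way to make the "consistent predictions" criterion testable.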

Constraints

Daily volume: ~60,000 tickets
Deployment on a single A10G GPU or CPU batch inference
Customer data cannot be sent to third-party APIs
Weekly label refreshes are available
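A quick back-of-the-envelope calculation shows what the daily volume implies for the serving budget. The sketch assumes traffic is spread evenly over 24 hours unless a peak factor is supplied; the `peak_factor` parameter and its example value are hypothetical, since the brief gives no traffic profile.

```python
DAILY_TICKETS = 60_000        # from the Constraints section
SECONDS_PER_DAY = 24 * 60 * 60
P95_BUDGET_MS = 150           # from the Success Criteria section

# Average arrival rate if tickets were spread evenly over the day (≈0.69 tickets/s).
avg_rate = DAILY_TICKETS / SECONDS_PER_DAY


def peak_rate(average, peak_factor):
    """Arrival rate during a busy period; peak_factor is an assumed multiplier."""
    return average * peak_factor


def serial_capacity(latency_ms):
    """Tickets per second one worker can handle with no batching or concurrency."""
    return 1000 / latency_ms


# Worst case: every request uses the full 150 ms budget, processed one at a time.
headroom = serial_capacity(P95_BUDGET_MS) / avg_rate

if __name__ == "__main__":
    print(f"average rate: {avg_rate:.2f} tickets/s")
    print(f"serial capacity at 150 ms: {serial_capacity(P95_BUDGET_MS):.2f} tickets/s")
    print(f"headroom over average: {headroom:.1f}x")
    print(f"rate at an assumed 5x peak: {peak_rate(avg_rate, 5):.2f} tickets/s")
```

Under these assumptions a single serial worker that merely meets the latency target already has about 9.6x headroom over the average rate, which suggests the latency criterion, not raw throughput, is the binding constraint on one A10G or a CPU batch job.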

Requirements

Build a baseline in-context learning approach and a fine-tuned transformer classifier.
Define a realistic preprocessing pipeline for noisy support text.
Compare the two approaches on accuracy, latency, cost, and operational stability.
Explain when you would choose fine-tuning versus in-context learning for this task.
Provide Python code for preprocessing, training, inference, and evaluation.
Recommend a final production approach and justify trade-offs clearly.
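As one possible starting point for the preprocessing requirement, the sketch below masks URLs and order IDs, caps the number of pasted log lines, and truncates very long tickets. The order-ID pattern, the log-line pattern, the placeholder tokens, and the default limits are all assumptions for illustration; the brief does not specify any of them, and real ticket data would need to be inspected before settling on rules.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Hypothetical order-ID format (e.g. "ORD-123456"); the source does not define one.
ORDER_ID_RE = re.compile(r"\bORD-\d{4,}\b")
# Assumed log-line shape: starts with an ISO-like timestamp or a bracketed log level.
LOG_LINE_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}|\[(?:DEBUG|INFO|WARN|WARNING|ERROR)\])"
)


def preprocess_ticket(text, max_words=400, max_log_lines=3):
    """Normalize one ticket body for classification.

    Keeps the first few log lines (they may hint at Technical Bug), drops the
    rest, replaces URLs and order IDs with placeholder tokens, collapses
    whitespace, and truncates to max_words. Product names are left untouched
    because they may help distinguish queues.
    """
    kept, log_lines = [], 0
    for line in text.splitlines():
        if LOG_LINE_RE.match(line):
            log_lines += 1
            if log_lines > max_log_lines:
                continue  # drop log lines beyond the cap
        kept.append(line)
    cleaned = " ".join(kept)
    cleaned = URL_RE.sub("<URL>", cleaned)
    cleaned = ORDER_ID_RE.sub("<ORDER_ID>", cleaned)
    words = cleaned.split()  # also collapses repeated whitespace
    return " ".join(words[:max_words])


if __name__ == "__main__":
    sample = (
        "Cannot log in, see https://example.com/reset?x=1\n"
        "Order ORD-123456 was charged twice\n"
        "[ERROR] a\n[ERROR] b\n[ERROR] c\n[ERROR] d"
    )
    print(preprocess_ticket(sample))
```

Applying the same function before both the prompt-based baseline and the fine-tuned classifier keeps the comparison fair, since neither approach then benefits from cleaner input than the other.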
