Business Context
HelpDeskPro, a SaaS customer support platform, wants to automate first-draft email replies for inbound support tickets. You need to design and explain a sequence-to-sequence system that maps a customer message to a fluent, accurate response draft that agents can review before sending.
Data
- Volume: 850,000 historical ticket-response pairs collected over 18 months
- Text length: customer messages range from 20 to 900 tokens; agent replies range from 15 to 600 tokens
- Language: English only
- Label quality: responses are human-written but vary in style and completeness; about 12% contain boilerplate signatures that should be normalized
- Domain skew: billing and login issues make up 58% of tickets, while API and security issues are less frequent
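As a starting point for the data notes above, here is a minimal cleaning sketch using only the standard library. It is one possible approach, not a prescribed pipeline. The sign-off patterns in `SIGNATURE_RE`, the helper names, and the use of whitespace splitting in place of the model tokenizer are all assumptions made for illustration; real signature patterns would have to be mined from the 850,000 pairs.

```python
import re

# Hypothetical sign-off openers (assumption); everything from the sign-off
# line to the end of the reply is treated as boilerplate.
SIGNATURE_RE = re.compile(
    r"\n\s*(best regards|kind regards|regards|sincerely|thanks|thank you),?\s*\n.*\Z",
    re.IGNORECASE | re.DOTALL,
)


def strip_signature(reply):
    """Drop a trailing sign-off block and collapse runs of spaces or tabs."""
    body = SIGNATURE_RE.sub("", reply)
    return re.sub(r"[ \t]+", " ", body).strip()


def n_tokens(text):
    # Whitespace split is a rough stand-in for the model tokenizer (assumption).
    return len(text.split())


def keep_pair(ticket, reply, max_src=900, max_tgt=600):
    """Keep non-empty pairs that fit the length limits stated in the brief."""
    return 0 < n_tokens(ticket) <= max_src and 0 < n_tokens(reply) <= max_tgt


def preprocess(pairs):
    """Normalize, filter, and de-duplicate (ticket, reply) pairs."""
    cleaned, seen = [], set()
    for ticket, reply in pairs:
        ticket = " ".join(ticket.split())
        reply = strip_signature(reply)
        key = (ticket, reply)
        if key in seen or not keep_pair(ticket, reply):
            continue
        seen.add(key)
        cleaned.append({"source": ticket, "target": reply})
    return cleaned


raw = [
    ("I cannot   log in.", "Please reset your password.\n\nBest regards,\nJane Doe"),
    ("I cannot   log in.", "Please reset your password.\n\nBest regards,\nJane Doe"),
    ("", "Hi"),
]
print(preprocess(raw))
```

The regex removes everything after the first matching sign-off line, so a reply that says "Thanks," on its own line mid-body would be truncated; a production version would likely restrict the match to the last few lines. The domain skew is not addressed here and would need a separate step, such as stratified splitting by ticket category.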
Success Criteria
A good solution should generate responses that are contextually relevant, factually consistent with the ticket, and concise enough for agent review. Target ROUGE-L >= 0.42 on a held-out test set and human acceptability >= 80% in offline review, while keeping median inference latency under 250 ms per request on a single A10 GPU.
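To make the two numeric targets above concrete, the sketch below computes a ROUGE-L score from the longest common subsequence and measures median wall-clock latency. The brief does not say which ROUGE-L variant is intended; this sketch assumes the F1 form over lowercased whitespace tokens, and the function names and the `generate` callable are hypothetical.

```python
import statistics
import time


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def median_latency_ms(generate, tickets):
    """Median wall-clock time per call to `generate`, in milliseconds."""
    timings = []
    for ticket in tickets:
        start = time.perf_counter()
        generate(ticket)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


print(round(rouge_l_f1("a b c d", "a c"), 4))
print(median_latency_ms(lambda t: t.upper(), ["one", "two", "three"]) < 250)
```

On a GPU, kernels run asynchronously, so a real latency measurement would need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock, and would usually discard a few warm-up requests first.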
Constraints
- No external API calls at inference time
- Must run in a secure internal environment
- Responses must avoid hallucinating refunds, account actions, or policy exceptions
- Maximum model size: roughly 300M parameters
Requirements
- Explain how sequence-to-sequence models work for input-to-output text generation.
- Build a modern Python implementation for support-reply generation.
- Define a realistic preprocessing pipeline for ticket-response pairs.
- Fine-tune an encoder-decoder transformer and justify your architecture choice.
- Evaluate both generation quality and business safety risks.
- Describe failure modes and how you would reduce hallucinations and repetitive output.
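For the repetitive-output requirement, one common decode-time mitigation is n-gram blocking: before each step, any token that would complete an n-gram already present in the output is removed from the candidates. The sketch below shows the idea with a toy greedy decoder. `next_scores` is a hypothetical stand-in for the fine-tuned model's next-token scores, and the function names are illustrative. The same idea is exposed in Hugging Face Transformers as the `no_repeat_ngram_size` generation parameter.

```python
def banned_next_tokens(generated, n):
    """Tokens that would complete an n-gram already present in `generated`."""
    if n < 1 or len(generated) + 1 < n:
        return set()
    start = len(generated) - (n - 1)
    prefix = tuple(generated[start:])  # last n-1 tokens of the output so far
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned


def greedy_decode(next_scores, max_len, n=3, eos="</s>"):
    """Greedy decoding that skips tokens which would repeat an n-gram."""
    out = []
    for _ in range(max_len):
        scores = dict(next_scores(out))
        for tok in banned_next_tokens(out, n):
            scores.pop(tok, None)
        if not scores:
            break
        tok = max(scores, key=scores.get)
        if tok == eos:
            break
        out.append(tok)
    return out


def toy_scores(prefix):
    # Toy scorer (assumption): always prefers "sorry", which would loop
    # forever under plain greedy decoding.
    return {"sorry": 0.9, "about": 0.5, "</s>": 0.1}


print(greedy_decode(toy_scores, max_len=4, n=2))
```

Blocking removes verbatim loops but does nothing for factual errors, so it addresses only the repetition half of the requirement; hallucinated commitments need separate handling in training data, prompts, or post-generation checks.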