Business Context
ShopSphere, an e-commerce marketplace, receives customer feedback from app reviews, support chats, post-purchase surveys, and seller complaints. The operations team wants an NLP system that converts unstructured text into actionable business categories so product, support, and logistics teams can prioritize work and measure impact.
Data
- Volume: 850,000 historical feedback records over 18 months; ~12,000 new records per day
- Text length: 8-700 words (median 64 words)
- Language: English only for the first version
- Labels: 5 business use-case classes: Delivery Issue (28%), Product Quality (22%), Billing/Refund (14%), App/Website UX (18%), General Praise/Other (18%)
- Data quality: Duplicates, HTML fragments, emojis, repeated punctuation, and agent signatures are common
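
The noise types listed above could be handled with a small normalization step before modeling. The following is a minimal sketch, not a prescribed pipeline: the function names and the sign-off pattern used to detect agent signatures are hypothetical, and real signatures at ShopSphere may look different. Emojis are deliberately left in place here, on the assumption that they may carry sentiment signal.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")            # HTML fragments such as <p>, <br/>
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")  # runs like "!!!" or "???"
WHITESPACE_RE = re.compile(r"\s+")
# Hypothetical signature pattern: a sign-off line and everything after it.
SIGNATURE_RE = re.compile(
    r"\n\s*(?:best regards|kind regards|regards),?\s*\n.*\Z",
    re.IGNORECASE | re.DOTALL,
)


def clean_feedback(text: str) -> str:
    """Normalize one feedback record (emojis are intentionally kept)."""
    text = html.unescape(text)               # "&amp;" -> "&"
    text = TAG_RE.sub(" ", text)             # drop HTML tags
    text = SIGNATURE_RE.sub("", text)        # drop trailing agent signature
    text = REPEAT_PUNCT_RE.sub(r"\1", text)  # "!!!" -> "!"
    return WHITESPACE_RE.sub(" ", text).strip()


def dedupe(texts):
    """Clean every record, drop empties, and keep the first copy of each text."""
    cleaned = (clean_feedback(t) for t in texts)
    return list(dict.fromkeys(c for c in cleaned if c))
```

Deduplicating on the cleaned text rather than the raw text also catches records that differ only in markup or punctuation runs. Duplicates should be removed before the train/test split so the same feedback cannot appear on both sides.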
Success Criteria
A production-ready classifier should achieve macro-F1 >= 0.84 and Billing/Refund recall >= 0.90, and it should support batch or near-real-time routing with p95 inference latency under 120 ms per record.
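
These thresholds can be turned into an automated acceptance check. The sketch below uses only the standard library and assumes that labels are plain strings and that latencies are measured per record in milliseconds. The function names are illustrative, and the nearest-rank percentile is one of several reasonable p95 definitions.

```python
from collections import Counter

CLASSES = [
    "Delivery Issue", "Product Quality", "Billing/Refund",
    "App/Website UX", "General Praise/Other",
]


def per_class_scores(y_true, y_pred, classes=CLASSES):
    """Return precision, recall, and F1 for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    scores = {}
    for c in classes:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[c] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def p95(latencies_ms):
    """Nearest-rank 95th percentile (integer arithmetic avoids float rounding)."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def meets_success_criteria(y_true, y_pred, latencies_ms):
    """Check each stated threshold and report pass/fail per criterion."""
    scores = per_class_scores(y_true, y_pred)
    macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)
    return {
        "macro_f1": macro_f1 >= 0.84,
        "billing_recall": scores["Billing/Refund"]["recall"] >= 0.90,
        "p95_latency": p95(latencies_ms) < 120.0,
    }
```

Macro-F1 averages the per-class F1 scores without weighting by class frequency, so the 14% Billing/Refund class counts as much as the 28% Delivery Issue class.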
Constraints
- Must run in ShopSphere's AWS environment on a single T4 GPU, with a CPU fallback
- Predictions should be explainable enough for business stakeholders to trust routing decisions
- Weekly retraining is allowed; online learning is not required
Requirements
- Build a multi-class NLP pipeline that maps feedback text to the 5 business categories.
- Define a realistic preprocessing pipeline for noisy customer text.
- Implement and compare a strong baseline and a transformer-based model in Python.
- Handle class imbalance and justify your loss function or sampling strategy.
- Evaluate the system with metrics appropriate for business routing.
- Describe how you would use model outputs to support business workflows such as refund escalation, seller monitoring, and product backlog prioritization.
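
For the class-imbalance requirement, one common option is inverse-frequency class weighting. The sketch below derives weights from the class shares stated in the Data section using the "balanced" heuristic, w_c = 1 / (K * p_c). It is an illustration of one possible strategy rather than a recommended answer, and the function name is hypothetical.

```python
# Class shares taken from the Data section of this brief.
CLASS_PRIORS = {
    "Delivery Issue": 0.28,
    "Product Quality": 0.22,
    "Billing/Refund": 0.14,
    "App/Website UX": 0.18,
    "General Praise/Other": 0.18,
}


def balanced_class_weights(priors):
    """Inverse-frequency weights: rarer classes get proportionally larger weights."""
    k = len(priors)
    return {label: 1.0 / (k * share) for label, share in priors.items()}


weights = balanced_class_weights(CLASS_PRIORS)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Weights of this form could be passed as a `class_weight` dictionary to a scikit-learn linear baseline, or as the `weight` tensor of a weighted cross-entropy loss for the transformer. Because the recall target applies to Billing/Refund specifically, a per-class decision threshold tuned on a validation set may still be needed on top of weighting.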