Business Context
ShopSphere, an ecommerce marketplace, collects product reviews, post-purchase survey comments, and support feedback. The customer insights team wants a sentiment analysis system to automatically classify feedback so product and operations teams can detect issues faster.
Data
You have 420,000 labeled feedback records from the last 18 months.
- Task: classify each feedback item as positive, neutral, or negative
- Text sources: product reviews, app feedback, delivery comments, return reasons
- Text length: 5-300 words, median 38 words
- Language: English only
- Label distribution: positive 61%, neutral 17%, negative 22%
- Noise: emojis, repeated punctuation, misspellings, SKU codes, order IDs, and occasional HTML fragments
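The noise sources listed above suggest a small normalization step before vectorization. The following is a minimal sketch, not a prescribed pipeline. The order-ID and SKU patterns and the ORDERID and SKUCODE placeholder tokens are hypothetical, since the brief does not specify the real formats. Keeping emojis is also an assumption, on the grounds that they often carry sentiment.

```python
import html
import re

# Hypothetical patterns: the brief does not give the real order-ID or SKU formats.
HTML_TAG = re.compile(r"<[^>]+>")
ORDER_ID = re.compile(r"\b(?:order|ord)[\s#:-]*\d{5,}\b", re.IGNORECASE)
SKU_CODE = re.compile(r"\bSKU[-_#][A-Z0-9]{4,}\b")
REPEATED_PUNCT = re.compile(r"([!?.])\1+")
WHITESPACE = re.compile(r"\s+")


def clean_feedback(text: str) -> str:
    """Normalize one feedback record while keeping sentiment-bearing cues."""
    text = html.unescape(text)                # "&amp;" -> "&"
    text = HTML_TAG.sub(" ", text)            # drop stray HTML fragments
    text = ORDER_ID.sub(" ORDERID ", text)    # mask identifiers with a placeholder
    text = SKU_CODE.sub(" SKUCODE ", text)
    text = REPEATED_PUNCT.sub(r"\1\1", text)  # "!!!!" -> "!!" (emphasis is kept)
    # Emojis are deliberately left in place (assumption: they signal sentiment).
    return WHITESPACE.sub(" ", text).strip()


print(clean_feedback("<p>Terrible!!!! Order #1234567 arrived broken &amp; late</p>"))
```

Misspellings are left untouched here; character n-gram features or a subword tokenizer are common ways to absorb them without a spell-checker.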
Success Criteria
A production-ready solution should achieve:
- Macro-F1 >= 0.82 on a held-out test set
- Negative-class recall >= 0.88 so critical complaints are not missed
- Inference latency < 50 ms per record in batch scoring
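The two quality targets can be checked together with a small acceptance function. This is a sketch in plain Python so the arithmetic is visible. The function names are illustrative, and scikit-learn's f1_score with average="macro" computes the same macro-F1 as the unweighted mean of per-class F1.

```python
LABELS = ("positive", "neutral", "negative")


def per_class_prf(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, treating it as one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def meets_quality_targets(y_true, y_pred, macro_f1_min=0.82, neg_recall_min=0.88):
    """Compare macro-F1 and negative-class recall against the stated thresholds."""
    scores = {label: per_class_prf(y_true, y_pred, label) for label in LABELS}
    macro_f1 = sum(f1 for _, _, f1 in scores.values()) / len(LABELS)
    negative_recall = scores["negative"][1]
    return {
        "macro_f1": macro_f1,
        "negative_recall": negative_recall,
        "passes": macro_f1 >= macro_f1_min and negative_recall >= neg_recall_min,
    }
```

Because macro-F1 weights the three classes equally, the 17% neutral class affects the score as much as the 61% positive class, which is why it is a reasonable headline metric for this label distribution.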
Constraints
- The baseline deployment must run on a single CPU service
- Predictions should be explainable enough for business users to inspect common drivers of negative sentiment
- The pipeline should support weekly retraining with newly labeled feedback
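The 50 ms latency target and the CPU constraint can be checked with a simple timing harness. The sketch below assumes "per record in batch scoring" means mean wall-clock time per record across a batch call; the brief does not say whether a mean or a percentile is intended. The helper name and the predict callable are illustrative.

```python
import time
from typing import Callable, Sequence


def mean_latency_ms(
    predict: Callable[[Sequence[str]], Sequence],
    batch: Sequence[str],
    repeats: int = 5,
) -> float:
    """Average wall-clock milliseconds per record over several batch calls."""
    if not batch:
        raise ValueError("batch must contain at least one record")
    predict(batch)  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        predict(batch)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / (repeats * len(batch))
```

In practice, predict would be the fitted model's batch prediction method, and the measurement would be taken on hardware matching the deployment service.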
Requirements
- Build a 3-class sentiment classifier for product feedback.
- Design a realistic preprocessing pipeline for noisy ecommerce text.
- Implement a strong baseline in modern Python using scikit-learn and compare it with a lightweight transformer approach.
- Address class imbalance and justify your modeling choices.
- Define evaluation metrics, validation strategy, and error analysis steps.
- Explain how you would package the model for batch and near-real-time inference.