Business Context
BrightWave, a consumer fintech brand, ran a multi-channel marketing campaign and collected customer feedback from surveys, app reviews, social comments, and email replies. The marketing analytics team wants an NLP pipeline that uses sentiment analysis to measure campaign reception and identify which messages drove positive or negative reactions.
Data
- Volume: 180,000 feedback records from a 6-week campaign
- Text length: 5-280 words (median: 34 words)
- Language: English only
- Labels: Positive (52%), Neutral (28%), Negative (20%)
- Sources: Survey free text, Instagram comments, X posts, support emails, app store reviews
- Noise: Emojis, hashtags, URLs, duplicated comments, campaign-specific slang, misspellings
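The noise sources listed above suggest a light normalization pass before any modeling. The following is a minimal sketch, not a prescribed pipeline: it assumes each record is a dict with hypothetical `source` and `text` keys, keeps emojis because they often carry sentiment, replaces URLs with a `<url>` placeholder, and treats two records as duplicates only when they share a source and normalize to the same text. Slang and misspellings are left untouched here.

```python
import re
import unicodedata

# Patterns for the noise types named in the Data section.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
WHITESPACE_RE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Return a lowercased, whitespace-collapsed version of the text."""
    text = unicodedata.normalize("NFKC", text)   # unify look-alike characters
    text = URL_RE.sub(" <url> ", text)           # URLs rarely carry sentiment
    text = HASHTAG_RE.sub(r"\1", text)           # keep the tag word, drop '#'
    return WHITESPACE_RE.sub(" ", text).strip().lower()


def deduplicate(records):
    """Keep the first record for each (source, normalized text) pair.

    Assumption: `records` is an iterable of dicts with 'source' and 'text'.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["source"], normalize(rec["text"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


if __name__ == "__main__":
    print(normalize("Love it!! #BrightWave https://t.co/abc 😍"))
```

Deduplicating before the train/validation split, rather than after, helps keep near-identical comments from landing on both sides of the split.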
Success Criteria
A good solution should achieve macro-F1 >= 0.82 on held-out feedback, maintain negative-class recall >= 0.85, and produce source-level sentiment summaries that marketing stakeholders can use to compare campaign channels and creative variants.
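The two numeric thresholds can be checked directly from held-out labels and predictions. The sketch below computes per-class precision, recall, and F1 in plain Python, then averages F1 across the three classes without weighting (the usual meaning of macro-F1). The function names and the lowercase label strings are assumptions for illustration; a library such as scikit-learn provides equivalent metrics.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Compute precision, recall, and F1 for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def check_success_criteria(y_true, y_pred,
                           min_macro_f1=0.82, min_negative_recall=0.85):
    """Compare held-out results against the thresholds in Success Criteria."""
    scores = per_class_scores(y_true, y_pred)
    macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)
    negative_recall = scores["negative"]["recall"]
    return {
        "macro_f1": macro_f1,
        "negative_recall": negative_recall,
        "passed": macro_f1 >= min_macro_f1
        and negative_recall >= min_negative_recall,
    }
```

Because negatives are only 20% of the labels, accuracy alone could look acceptable while negative recall falls short, which is why the two thresholds are checked separately.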
Constraints
- Daily batch scoring of new feedback must complete within 30 minutes
- The solution must be explainable to non-technical marketing teams
- Training must run on standard Python infrastructure with optional single-GPU acceleration
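A rough sizing of the batch-scoring constraint follows from the figures in the Data section. This back-of-the-envelope sketch assumes daily volume stays near the campaign average (180,000 records over 6 weeks); `peak_factor` is a hypothetical multiplier for busier days, not a number from the brief.

```python
def required_throughput(total_records, campaign_days, window_minutes,
                        peak_factor=1.0):
    """Return (records per day, records per second needed in the window)."""
    daily = total_records / campaign_days * peak_factor
    per_second = daily / (window_minutes * 60)
    return daily, per_second


if __name__ == "__main__":
    daily, per_second = required_throughput(180_000, 6 * 7, 30)
    print(f"{daily:.0f} records/day, {per_second:.2f} records/s")
```

At the average rate this works out to roughly 4,300 records per day, or under 3 records per second across the 30-minute window.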
Requirements
- Build a sentiment analysis system that classifies feedback into positive, neutral, or negative.
- Design a preprocessing pipeline for noisy campaign text, including emojis, hashtags, URLs, and duplicate feedback.
- Implement a modern Python solution for training, validation, and batch inference.
- Compare campaign sentiment across channels, dates, or creative variants.
- Explain how you would evaluate model quality, especially for negative feedback detection.
- Describe how you would surface common themes in negative feedback for campaign improvement.
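The channel comparison and negative-theme requirements can both start from simple counts over scored records. The sketch below is one possible starting point, assuming each scored record is a dict with hypothetical `source`, `label`, and `text` keys. It reports per-group shares plus a net sentiment score (positive share minus negative share), and surfaces candidate themes by counting how many negative records mention each term. The stopword list is a tiny illustrative sample, and term counting is a stand-in for heavier approaches such as topic modeling or clustering.

```python
import re
from collections import Counter, defaultdict

# Illustrative stopword sample only; a real list would be much longer.
STOPWORDS = frozenset({
    "the", "and", "are", "was", "were", "not", "too", "for", "with",
    "this", "that", "have", "has", "you", "your", "but", "does",
})


def sentiment_by_group(records, key="source"):
    """Summarize label shares per group (channel, date, creative variant)."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[key]][rec["label"]] += 1
    summary = {}
    for group, c in counts.items():
        total = sum(c.values())
        summary[group] = {
            "n": total,
            "positive_share": c["positive"] / total,
            "negative_share": c["negative"] / total,
            "net_sentiment": (c["positive"] - c["negative"]) / total,
        }
    return summary


def top_negative_terms(records, n=10):
    """Count, per term, how many negative records mention it."""
    terms = Counter()
    for rec in records:
        if rec["label"] != "negative":
            continue
        tokens = re.findall(r"[a-z']+", rec["text"].lower())
        # A set counts each term once per record (document frequency).
        terms.update({t for t in tokens if len(t) > 2 and t not in STOPWORDS})
    return terms.most_common(n)
```

Passing `key="date"` or a creative-variant field to `sentiment_by_group` gives the same summary along a different dimension, provided that field exists on the records.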