You are building an NLP pipeline for a consumer goods company that wants to analyze sentiment in product reviews, retailer feedback, and social posts about oral care products such as toothpaste, toothbrushes, and mouthwash. You have roughly 800,000 historical English-language texts, but only 60,000 are labeled for sentiment, and many examples contain emojis, misspellings, sarcasm, mixed sentiment, and product-specific language like whitening, sensitivity relief, flavor, and packaging. Some inputs are short social comments, while others are multi-sentence retailer reviews that mention multiple product attributes in one document. The business wants a system that can reliably classify sentiment and surface the main drivers behind positive and negative feedback.
How would you design and implement this sentiment analysis system end to end, including preprocessing, model choice, training strategy, evaluation, and the handling of mixed or aspect-specific sentiment in production?