You are working on an enterprise planning platform and have been asked to analyze roughly 800,000 customer feedback records collected from support cases, survey comments, implementation notes, and account reviews over the last 18 months. The text is noisy and mixed-format, with short comments, long free-text complaints, duplicated boilerplate, product terminology, and a small amount of non-English content. Product, support, and operations leaders want actionable insights such as major pain points, sentiment trends, and recurring themes by product area, but only about 12,000 records have reliable manual labels. You need an NLP approach that can turn unstructured feedback into themes and prioritizable insights that can be consumed by business stakeholders.
How would you design an end-to-end NLP solution to process this feedback, extract meaningful themes and sentiment, and translate the output into actionable recommendations for product and customer-facing teams? Explain your modeling choices and preprocessing pipeline, and how you would evaluate whether the insights are trustworthy and useful.