DONE by NONE receives a steady stream of short user comments from product feedback forms, support tickets, and app store reviews. The data team wants a simple NLP pipeline that turns raw text into structured categories so product and support teams can triage issues faster.
You are given roughly 80,000 English feedback messages collected over 12 months from DONE by NONE surfaces. Messages range from 5 to 180 words (median ~28 words). Each message is manually assigned one of four classes: Bug Report (30%), Feature Request (25%), Billing/Account (15%), and General Praise/Other (30%). Text is noisy: typos, emojis, URLs, repeated punctuation, and product-specific terms.
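To make the noise concrete, here is a minimal normalization sketch using only the Python standard library. The function name `normalize_feedback`, the `<url>` placeholder token, and the specific rules (lowercasing, collapsing repeated punctuation, shortening stretched letters) are illustrative assumptions, not a prescribed preprocessing step. Emojis and product-specific terms are deliberately left untouched, since they may carry signal.

```python
import re

# Hypothetical cleanup rules; tune or drop them based on validation results.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")    # "!!!" -> "!"
REPEAT_LETTER_RE = re.compile(r"([a-z])\1{2,}")  # "soooo" -> "soo"
WHITESPACE_RE = re.compile(r"\s+")


def normalize_feedback(text: str) -> str:
    """Lightly normalize one feedback message without removing emojis."""
    text = text.lower()
    # Replace URLs first so later rules do not alter them.
    text = URL_RE.sub(" <url> ", text)
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    # Letters only, so numbers such as "1000" are not changed.
    text = REPEAT_LETTER_RE.sub(r"\1\1", text)
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(normalize_feedback("App CRASHES!!!  see https://example.com/x  soooo bad..."))
```

Typos are not corrected here; character n-gram features are one common way to tolerate them without a spell-checker, though whether that helps on this data is something to verify empirically.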
A good solution should achieve at least 0.80 macro-F1 on a held-out test set, with precision and recall each above 0.75 for the Bug Report and Billing/Account classes. The approach should be interpretable enough that non-ML stakeholders can see which tokens most often drive predictions.
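The acceptance criteria can be checked with a short script. The sketch below computes per-class precision, recall, and F1 from true and predicted labels, averages F1 across classes without weighting (macro-F1), and compares the results to the thresholds stated above. The function names (`per_class_scores`, `macro_f1`, `meets_targets`) are hypothetical, and classes with no predictions or no true examples are assumed to score 0.0.

```python
from collections import Counter

LABELS = ["Bug Report", "Feature Request", "Billing/Account", "General Praise/Other"]


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Return {label: {"precision", "recall", "f1"}} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so small classes count equally."""
    return sum(s["f1"] for s in scores.values()) / len(scores)


def meets_targets(scores, min_macro_f1=0.80, min_pr=0.75,
                  critical=("Bug Report", "Billing/Account")):
    """Check macro-F1 >= 0.80 and precision/recall > 0.75 on the critical classes."""
    if macro_f1(scores) < min_macro_f1:
        return False
    return all(
        scores[c]["precision"] > min_pr and scores[c]["recall"] > min_pr
        for c in critical
    )
```

Because macro-F1 weights every class equally, the 15% Billing/Account class affects the score as much as either 30% class, which is presumably why it is paired with explicit precision and recall floors.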