At NovaMind, the applied AI team tracks fast-moving generative AI updates from research blogs, model release notes, newsletters, and technical forums. Recruiters use this exercise to test whether candidates can turn an open-ended question about “staying updated” into a practical NLP system for organizing and prioritizing information.
You are given a corpus of 180,000 English documents collected over 18 months from arXiv abstracts, vendor blogs, GitHub release notes, benchmark reports, and curated newsletters. Documents range from 40 to 1,200 words (median: 220). Each document is labeled with one of five update types: Model Release (28%), Research Paper (24%), Tooling/Framework (18%), Safety/Policy (12%), and Application Case Study (18%). Roughly 6% of documents contain boilerplate, duplicated snippets, or malformed HTML.
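As a minimal sketch of how the noisy 6% might be handled before modeling (assuming a hypothetical updates.csv with "text" and "label" columns; the file name and column names are not given in the brief), one could strip residual HTML, deduplicate on a normalized hash, and enforce the stated length range:

```python
import hashlib
import re

import pandas as pd

# Hypothetical input: one row per document with "text" and "label" columns.
df = pd.read_csv("updates.csv")

TAG_RE = re.compile(r"<[^>]+>")  # crude removal of malformed HTML tags
WS_RE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Strip HTML remnants and collapse whitespace for hashing and cleaning."""
    text = TAG_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip().lower()


df["clean_text"] = df["text"].astype(str).map(normalize)

# Drop exact duplicates of the normalized body (boilerplate / copied snippets).
df["hash"] = df["clean_text"].map(lambda t: hashlib.sha1(t.encode()).hexdigest())
df = df.drop_duplicates(subset="hash")

# Keep documents inside the stated 40-1,200 word range after cleaning.
word_counts = df["clean_text"].str.split().str.len()
df = df[(word_counts >= 40) & (word_counts <= 1200)]
```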
A good solution should achieve macro-F1 ≥ 0.84, F1 ≥ 0.88 on Safety/Policy documents, and support batch inference at under 150 ms per document on a single T4 GPU. The system should be robust to noisy formatting and should handle domain-specific terminology such as RAG, LoRA, quantization, eval harnesses, and synthetic data.
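To make the acceptance criteria concrete, here is a small evaluation sketch using scikit-learn, assuming hypothetical y_true and y_pred arrays of string labels drawn from the five classes above:

```python
from sklearn.metrics import f1_score

LABELS = [
    "Model Release",
    "Research Paper",
    "Tooling/Framework",
    "Safety/Policy",
    "Application Case Study",
]


def meets_targets(y_true, y_pred) -> bool:
    """Check macro-F1 >= 0.84 and Safety/Policy F1 >= 0.88."""
    macro_f1 = f1_score(y_true, y_pred, labels=LABELS, average="macro")
    per_class = f1_score(y_true, y_pred, labels=LABELS, average=None)
    safety_f1 = per_class[LABELS.index("Safety/Policy")]
    print(f"macro-F1: {macro_f1:.3f}  Safety/Policy F1: {safety_f1:.3f}")
    return macro_f1 >= 0.84 and safety_f1 >= 0.88
```

The latency target would be checked separately by timing batched inference on a T4 and amortizing wall-clock time over the number of documents in the batch.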