Business Context
LearnLoop, an online technical education platform, wants an NLP system that explains how large language models (LLMs) work in language appropriate for different audiences, such as middle-school students, non-technical professionals, and junior engineers. The goal is to generate accurate, readable explanations and to label each explanation by concept coverage so curriculum teams can audit quality.
Data
- Volume: 180,000 instructional text snippets, FAQ entries, model cards, and human-written explanations
- Text length: 40-900 words per document; generated outputs should be 120-300 words
- Language: English only
- Labels: 6 concept tags (multi-label): tokenization, embeddings, attention, training, inference, limitations
- Class balance: Uneven; attention and tokenization appear frequently, while limitations appears much less often
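Since the labels above are multi-label (a document can cover several concepts at once), the classifier target is naturally a multi-hot vector over the six tags. A minimal sketch, where the tag names come from the spec but the helper name and example data are illustrative:

```python
# Encode the 6 concept tags as multi-hot vectors for multi-label
# classification. Tag order here is an assumption; any fixed order works.
CONCEPT_TAGS = ["tokenization", "embeddings", "attention",
                "training", "inference", "limitations"]
TAG_TO_IDX = {tag: i for i, tag in enumerate(CONCEPT_TAGS)}

def encode_tags(tags):
    """Map a list of concept tags to a multi-hot vector of length 6."""
    vec = [0] * len(CONCEPT_TAGS)
    for tag in tags:
        vec[TAG_TO_IDX[tag]] = 1
    return vec

# A document explaining attention and its limitations:
print(encode_tags(["attention", "limitations"]))  # [0, 0, 1, 0, 0, 1]
```

This representation pairs with a sigmoid output layer (one independent probability per tag) rather than a softmax, since tags are not mutually exclusive.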
Success Criteria
- Generated explanations must be factually correct and readable for the target audience
- Multi-label concept classifier should achieve macro-F1 >= 0.84
- Explanation generation should receive high human ratings for clarity and correctness
- End-to-end response latency should stay under 1.5 seconds for a single request
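Macro-F1 is the right target here precisely because the classes are uneven: it averages per-label F1 scores with equal weight, so the rare limitations tag counts as much as the frequent attention tag. A small pure-Python illustration with hypothetical per-label counts (all numbers below are invented for the example):

```python
def f1(tp, fp, fn):
    """Per-label F1 from true-positive, false-positive, false-negative counts."""
    # Equivalent to 2*tp / (2*tp + fp + fn).
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Hypothetical counts for the 6 concept tags (not real results):
counts = {
    "tokenization": (90, 10, 8),
    "embeddings":   (70, 12, 10),
    "attention":    (95, 5, 6),
    "training":     (60, 15, 12),
    "inference":    (55, 14, 11),
    "limitations":  (20, 6, 9),   # rare label drags macro-F1 down
}

macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)
print(round(macro_f1, 3))  # 0.846 with these counts (just above the 0.84 bar)
```

In practice this is `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` over the multi-hot label matrix; the manual version above just makes the equal-weight averaging explicit.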
Constraints
- Must run on a single A10G GPU in a private environment
- No external API calls at inference time
- Explanations must avoid unsupported claims and clearly state limitations of LLMs
Requirements
- Build a system that generates audience-specific explanations of LLM mechanics
- Add a multi-label classifier to verify which concepts are covered in each explanation
- Define a preprocessing pipeline for educational and technical text
- Implement a modern Python solution using Hugging Face Transformers
- Propose evaluation for factuality, readability, concept coverage, and latency
- Describe trade-offs between a single generative model and a generate-then-classify pipeline
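The generate-then-classify variant named in the last requirement can be sketched structurally. The stub functions below stand in for Hugging Face models (a causal or seq2seq LM behind `model.generate()` for the explanation, a fine-tuned encoder with a sigmoid head for the six tags); the function names, threshold, and stub logic are illustrative assumptions, not a prescribed implementation:

```python
# Structural sketch of the generate-then-classify pipeline.
CONCEPT_TAGS = ["tokenization", "embeddings", "attention",
                "training", "inference", "limitations"]

def generate_explanation(topic: str, audience: str) -> str:
    """Stub generator; a real system would call a local LM's generate()."""
    return f"For a {audience} audience: {topic} is how an LLM ..."

def classify_concepts(text: str, threshold: float = 0.5) -> list[str]:
    """Stub classifier; a real system would apply a sigmoid over six
    logits and keep tags whose probability exceeds the threshold."""
    return [tag for tag in CONCEPT_TAGS if tag in text.lower()]

def explain(topic: str, audience: str) -> dict:
    """Generate an explanation, then label its concept coverage."""
    text = generate_explanation(topic, audience)
    return {"audience": audience,
            "explanation": text,
            "concepts": classify_concepts(text)}

result = explain("attention", "middle-school")
print(result["concepts"])  # ['attention'] for this stub
```

The structural point is the trade-off the requirement asks about: a single generative model is simpler and cheaper on one A10G, but the separate classifier gives an independent, auditable signal of concept coverage rather than trusting the generator to self-report, at the cost of a second forward pass inside the 1.5 s latency budget.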