Business Context
ShopSphere, a large e-commerce marketplace, wants to improve search ranking and navigation by identifying the intent behind each user query. The goal is to classify each query into one of four intent categories (transactional, informational, navigational, or support) so that search results and downstream experiences can be tailored to that intent.
Data
- Volume: 8M historical search queries, including 1.6M human-labeled examples
- Text length: 1-20 tokens per query (median: 4 tokens)
- Language: English only for the first version
- Label distribution: Transactional 46%, Informational 24%, Navigational 18%, Support 12%
- Noise: Misspellings, abbreviations, brand names, SKU-like strings, and incomplete phrases are common
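The label distribution above is moderately skewed, so one common starting point is to weight the training loss by inverse class frequency. The sketch below is illustrative only: the function name `balanced_class_weights` is hypothetical, and the formula is the usual "balanced" heuristic (the same form as scikit-learn's `class_weight="balanced"`), written in terms of class shares rather than counts.

```python
# Label shares copied from the Data section; everything else is illustrative.
LABEL_SHARE = {
    "transactional": 0.46,
    "informational": 0.24,
    "navigational": 0.18,
    "support": 0.12,
}


def balanced_class_weights(share):
    """Inverse-frequency weights: weight_c = 1 / (n_classes * share_c).

    Equivalent to n_samples / (n_classes * count_c) when share_c = count_c / n_samples.
    """
    n_classes = len(share)
    return {label: 1.0 / (n_classes * p) for label, p in share.items()}


weights = balanced_class_weights(LABEL_SHARE)
for label, weight in weights.items():
    print(f"{label:14s} share={LABEL_SHARE[label]:.2f} weight={weight:.3f}")
```

With these shares, the rarest class (support) gets the largest weight and the most common class (transactional) the smallest. Whether weighting helps or hurts the transactional precision target is an empirical question to check on a validation set.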
Success Criteria
A good solution should achieve ≥88% macro-F1 across the four classes and ≥92% precision on transactional queries; the precision target exists because misrouting high-purchase-intent traffic hurts revenue. The model must also support online inference with p95 latency under 50 ms.
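The three targets above can be turned into a single acceptance check. The following is a minimal, standard-library sketch: the helper names (`per_class_scores`, `p95`, `meets_targets`) and the toy data are hypothetical, and the percentile uses the nearest-rank definition, which is one of several conventions.

```python
LABELS = ["transactional", "informational", "navigational", "support"]


def per_class_scores(y_true, y_pred, labels):
    """Precision, recall, and F1 for each label, computed from raw counts."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(s["f1"] for s in scores.values()) / len(scores)


def p95(latencies_ms):
    """Nearest-rank 95th percentile; integer arithmetic avoids float rounding."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def meets_targets(y_true, y_pred, latencies_ms):
    """Compare measured values against the thresholds in Success Criteria."""
    scores = per_class_scores(y_true, y_pred, LABELS)
    return {
        "macro_f1": macro_f1(scores) >= 0.88,
        "transactional_precision": scores["transactional"]["precision"] >= 0.92,
        "p95_latency": p95(latencies_ms) < 50.0,
    }


# Toy data for illustration only; not real evaluation results.
toy_true = ["transactional", "transactional", "informational", "navigational", "support"]
toy_pred = ["transactional", "informational", "informational", "navigational", "support"]
toy_latencies = [10.0] * 95 + [80.0] * 5
print(meets_targets(toy_true, toy_pred, toy_latencies))
```

In the toy data, one transactional query is predicted as informational. Transactional precision is unaffected (no false positives), but macro-F1 drops because both transactional recall and informational precision fall, which illustrates why the two targets need to be tracked separately.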
Constraints
- Must run in a real-time search stack on CPU-backed inference nodes
- Query text is short and often ambiguous without session context
- The first production version can only use the raw query text and lightweight derived features
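As an illustration of what "raw query text and lightweight derived features" could mean in practice, here is a small standard-library sketch. Everything in it is an assumption rather than part of the brief: the feature set, the question-word list, and the SKU-shape heuristic (`is_sku_like`) are hypothetical choices that would need tuning against real ShopSphere queries.

```python
import re
import unicodedata

# Hypothetical heuristics; tune against real query logs.
QUESTION_WORDS = {"how", "what", "why", "when", "where", "which", "who", "can", "does", "is"}
SKU_SHAPE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def normalize_query(query):
    """Unicode-normalize, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", query).lower().strip()
    return re.sub(r"\s+", " ", text)


def is_sku_like(token):
    """Rough guess: a token of 5+ chars that mixes letters and digits."""
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    return len(token) >= 5 and has_digit and has_alpha and bool(SKU_SHAPE.fullmatch(token))


def derive_features(query):
    """Cheap per-query features that need no session context."""
    text = normalize_query(query)
    tokens = text.split()
    return {
        "normalized": text,
        "n_tokens": len(tokens),
        "has_digit": any(ch.isdigit() for ch in text),
        "has_sku_like_token": any(is_sku_like(tok) for tok in tokens),
        "starts_with_question_word": bool(tokens) and tokens[0] in QUESTION_WORDS,
    }


print(derive_features("  How do I  return ORDER "))
print(derive_features("Nike AIR max B07XJ8C8F5"))
```

The normalization deliberately leaves misspellings and brand names untouched: aggressive spell correction can rewrite brand names or SKU-like strings into unrelated words, so any correction step is better evaluated separately.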
Requirements
- Build a multi-class NLP system to classify each search query into one of the four intent classes.
- Design a preprocessing pipeline for short, noisy search text.
- Implement a modern Python solution using a transformer-based model and compare it to a lightweight baseline.
- Explain how you would handle class imbalance, ambiguous queries, and out-of-vocabulary terms.
- Define an evaluation plan, including offline metrics and error analysis for confusing intent pairs.
- Describe what you would deploy to production given the latency constraint.
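For the out-of-vocabulary requirement, one option for the lightweight baseline is to hash character n-grams into a fixed number of buckets, so that misspellings and unseen brand names still share most features with their known neighbors. The sketch below is one possible approach under stated assumptions, not a prescribed design: the function names, the 2 to 4 character n-gram range, and the bucket count are all hypothetical. A transformer model would typically address the same problem through its subword tokenizer instead.

```python
import math
import zlib
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Yield character n-grams, with < and > marking the query boundaries."""
    padded = f"<{text}>"
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]


def hashed_ngram_vector(text, n_buckets=2 ** 18):
    """Sparse count vector {bucket: count} via the hashing trick.

    zlib.crc32 is used because it is deterministic across processes,
    unlike Python's built-in hash() for strings.
    """
    counts = Counter()
    for gram in char_ngrams(text):
        counts[zlib.crc32(gram.encode("utf-8")) % n_buckets] += 1
    return dict(counts)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


reference = hashed_ngram_vector("adidas shoes")
misspelled = hashed_ngram_vector("addidas shoes")
unrelated = hashed_ngram_vector("return policy")

print(f"misspelled vs reference: {cosine(reference, misspelled):.2f}")
print(f"unrelated  vs reference: {cosine(reference, unrelated):.2f}")
```

The misspelled query stays close to the correctly spelled one, while the unrelated query does not. These sparse vectors can feed a linear classifier as the baseline to compare against the transformer, both on the offline metrics and on CPU latency.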