Business Context
ShopSphere, a large e-commerce marketplace, wants to improve search relevance for short, ambiguous user queries such as "apple charger fast", "running shoes flat feet", and "couch under 500". The current keyword-based system misses semantic intent, synonyms, and attribute constraints, so the search team wants an NLP pipeline that uses embeddings or transformers to better understand queries before retrieval and ranking.
Data
- Volume: 8M historical search queries, 120M product titles/descriptions, and 35M query-click pairs
- Text length: Queries are short (2-12 tokens, median 4); product text ranges from 5 to 300 tokens
- Language: English only for the first release
- Labels: Weak supervision from clicks, add-to-cart events, and purchases; query intent labels are available for 250K manually reviewed queries
- Class distribution: Head queries are frequent, but 60% of traffic is long-tail or reformulated queries
Success Criteria
A good solution should improve query understanding enough to increase offline Recall@20 and NDCG@10 over the keyword baseline, while reducing zero-result and low-engagement searches. Inference should support near-real-time search traffic.
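As a reference point for the two offline metrics named above, the following is a minimal sketch of Recall@K and NDCG@K for a single query. It assumes graded relevance derived from the weak labels (for example click = 1, add-to-cart = 2, purchase = 3, which is a hypothetical mapping and not part of the brief) and uses the linear-gain form of DCG. The function names are illustrative.

```python
import math
from typing import Mapping, Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 20) -> float:
    """Fraction of the relevant products that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gains and a log2 position discount."""
    return sum(gain / math.log2(pos + 2) for pos, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids: Sequence[str], gain_by_id: Mapping[str, float], k: int = 10) -> float:
    """DCG of the returned ranking divided by the DCG of the ideal ranking."""
    gains = [gain_by_id.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal = sorted(gain_by_id.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


# Hypothetical example: one query, four returned products, two with engagement.
ranked = ["p3", "p1", "p9", "p2"]
gains = {"p1": 3.0, "p2": 1.0}  # assumed mapping: purchase = 3, click = 1
print(recall_at_k(ranked, set(gains), k=20), ndcg_at_k(ranked, gains, k=10))
```

Per-query values would typically be averaged over an evaluation set, and reporting head and long-tail queries separately may be more informative than a single average given the traffic split described under Data.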
Constraints
- P95 online query understanding latency must remain under 60ms
- The model must run on a single A10 GPU or on an equivalent CPU fallback path
- The product catalog updates hourly, so document embeddings must support incremental refresh
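One possible way to meet the incremental-refresh constraint is to re-embed only products whose text has changed since the last run. The sketch below illustrates that idea with content hashing. `EmbeddingStore`, `embed_batch`, and `fake_embed` are hypothetical names: the in-memory dictionaries stand in for a real vector index, and the fake encoder stands in for whatever embedding model is chosen.

```python
import hashlib
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

Vector = List[float]


def content_hash(text: str) -> str:
    """Stable fingerprint of a product's text, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class EmbeddingStore:
    """In-memory stand-in for a vector index that supports upserts and deletes."""

    def __init__(self, embed_batch: Callable[[Sequence[str]], List[Vector]]):
        self._embed_batch = embed_batch  # assumed: maps texts to vectors, same order
        self._hashes: Dict[str, str] = {}
        self._vectors: Dict[str, Vector] = {}

    def refresh(self, catalog: Mapping[str, str]) -> Tuple[int, int]:
        """Sync with a catalog snapshot; returns (num_embedded, num_deleted)."""
        # New or modified products: hash missing or different from the stored one.
        changed = [
            pid for pid, text in catalog.items()
            if self._hashes.get(pid) != content_hash(text)
        ]
        if changed:
            vectors = self._embed_batch([catalog[pid] for pid in changed])
            for pid, vec in zip(changed, vectors):
                self._vectors[pid] = vec
                self._hashes[pid] = content_hash(catalog[pid])
        # Products that disappeared from the catalog are dropped from the index.
        removed = [pid for pid in self._hashes if pid not in catalog]
        for pid in removed:
            del self._hashes[pid]
            del self._vectors[pid]
        return len(changed), len(removed)


def fake_embed(texts: Sequence[str]) -> List[Vector]:
    """Placeholder encoder for the demo only; a real model would go here."""
    return [[float(len(t))] for t in texts]


store = EmbeddingStore(fake_embed)
print(store.refresh({"a": "usb-c charger", "b": "running shoes"}))
print(store.refresh({"a": "usb-c fast charger", "b": "running shoes"}))
```

At the catalog size listed under Data, diffing a full snapshot every hour may be too slow, so a change feed that lists only updated product IDs would likely replace the full scan. The hashing step would still help skip updates that do not alter the embedded text.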
Requirements
- Design a query understanding system using embeddings or transformers for semantic retrieval and/or intent extraction.
- Build a preprocessing pipeline for noisy, short e-commerce queries.
- Implement a modern Python solution for training, inference, and evaluation.
- Explain how you would combine semantic signals with lexical search.
- Define offline and online evaluation metrics, failure modes, and rollout criteria.
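To make the preprocessing and hybrid-search requirements more concrete, here is a minimal sketch under stated assumptions. It normalizes a noisy query, pulls out a simple price-cap constraint such as "under 500", and merges a lexical ranking with a semantic ranking using reciprocal rank fusion (RRF). RRF is only one of several fusion options and is not prescribed by the brief. All function names are illustrative, the price pattern covers only a few phrasings, and k = 60 is a commonly used default rather than a tuned value.

```python
import re
import unicodedata
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed phrasings for a price ceiling; a real system would need many more.
_PRICE_CAP = re.compile(r"\b(?:under|below|less than)\s*\$?\s*(\d+(?:\.\d+)?)\b")


def normalize_query(raw: str) -> str:
    """Unicode-normalize, lowercase, drop stray punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw).lower()
    text = re.sub(r"[^\w\s$.\-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def extract_price_cap(query: str) -> Tuple[str, Optional[float]]:
    """Split a normalized query into (remaining text, maximum price or None)."""
    match = _PRICE_CAP.search(query)
    if match is None:
        return query, None
    remainder = query[:match.start()] + " " + query[match.end():]
    return re.sub(r"\s+", " ", remainder).strip(), float(match.group(1))


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each list contributes 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


text, max_price = extract_price_cap(normalize_query("Couch  under $500!"))
lexical_hits = ["p1", "p2", "p3"]   # hypothetical keyword (e.g. BM25) results
semantic_hits = ["p3", "p1", "p4"]  # hypothetical embedding-search results
print(text, max_price, reciprocal_rank_fusion([lexical_hits, semantic_hits]))
```

In this sketch the extracted price cap would be applied as a structured filter on the candidate set, while the remaining text goes to both retrievers. Rank-based fusion avoids having to calibrate lexical scores against embedding similarities, at the cost of ignoring score magnitudes.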