StreamCart uses a binary relevance model to rank and filter product search results for its mobile app. The team recently tightened the decision threshold to reduce irrelevant results, but user complaints about “missing obvious items” increased while support tickets about “spammy results” fell slightly.

| Metric | Previous Model | Current Model | Change |
|---|---|---|---|
| Precision | 0.68 | 0.84 | +0.16 |
| Recall | 0.81 | 0.52 | -0.29 |
| F1 Score | 0.74 | 0.64 | -0.10 |
| Accuracy | 0.89 | 0.91 | +0.02 |
| False Positive Rate | 0.09 | 0.04 | -0.05 |
| False Negative Rate | 0.19 | 0.48 | +0.29 |
| Search reformulation rate | 12.5% | 18.9% | +6.4 pts |
| Result click-through rate | 31.2% | 27.4% | -3.8 pts |

The product manager wants to understand what precision and recall mean for the user experience, and whether the current model is actually better given that accuracy and precision rose while recall fell sharply. You need to explain what these metrics mean for shoppers, diagnose the tradeoff, and recommend what to optimize next.
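
As a starting point for the diagnosis, the sketch below recomputes the derived rows of the table (F1 and false negative rate) from the reported precision and recall, then adds a recall-weighted F-beta score. The helper names `f_beta` and `summarize`, and the choice of `beta = 2`, are illustrative assumptions and do not come from the StreamCart team. A beta above 1 simply encodes the view that a missed relevant product hurts shoppers more than an irrelevant one.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Precision and recall copied from the metrics table.
MODELS = {
    "previous": {"precision": 0.68, "recall": 0.81},
    "current": {"precision": 0.84, "recall": 0.52},
}


def summarize(beta: float = 2.0) -> dict:
    """Derive FNR, F1, and F-beta per model (beta is an assumed weighting)."""
    summary = {}
    for name, m in MODELS.items():
        p, r = m["precision"], m["recall"]
        summary[name] = {
            "fnr": round(1 - r, 2),               # false negative rate = 1 - recall
            "f1": round(f_beta(p, r), 2),         # harmonic mean of precision and recall
            "f_beta": round(f_beta(p, r, beta), 2),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize().items():
        print(name, row)
```

With these inputs the F1 and false negative rate values match the table (0.74 and 0.19 for the previous model, 0.64 and 0.48 for the current one). Under the assumed `beta = 2`, the previous model scores about 0.78 against about 0.56 for the current one, which is one way to quantify why shoppers feel the loss of recall more than the gain in precision.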