You are working on a recommendation system for a digital product with personalized content feeds. The team wants a clear framework to judge model quality before launch and business impact after launch, across the full pipeline from candidate generation to final ranking.
How would you evaluate a recommendation system end to end?
Stage-wise evaluation for retrieval, ranking, and re-ranking
Offline metrics versus online experiment design
Guardrails and business impact measurement
Feature drift and training-serving skew awareness
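
As a starting point for the offline side of these topics, here is a minimal sketch of one metric per concern: recall@k for the retrieval stage, NDCG@k for the ranking stage, and a Population Stability Index (PSI) check for feature drift. The function names, the input shapes (item-id lists, a gain dictionary, pre-binned proportions), and the choice of these particular metrics are illustrative assumptions, not something the prompt prescribes.

```python
import math
from typing import Dict, Sequence, Set


def recall_at_k(candidates: Sequence[str], relevant: Set[str], k: int) -> float:
    """Retrieval stage: share of relevant items that appear in the top-k candidates."""
    if not relevant:
        return 0.0
    hits = len(set(candidates[:k]) & relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], gains: Dict[str, float], k: int) -> float:
    """Ranking stage: DCG of the served order divided by DCG of the ideal order."""

    def dcg(values: Sequence[float]) -> float:
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(item, 0.0) for item in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Drift check: PSI between two binned distributions given as per-bin proportions.

    `expected` is assumed to come from training data and `actual` from serving logs,
    using the same bin edges. `eps` avoids log(0) for empty bins.
    """
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


if __name__ == "__main__":
    # Hypothetical data for a single user and a single feature.
    print(recall_at_k(["a", "b", "c", "d"], {"a", "d", "z"}, k=3))
    print(ndcg_at_k(["b", "a"], {"a": 1.0, "b": 0.0}, k=2))
    print(psi([0.5, 0.5], [0.7, 0.3]))
```

In practice these would be averaged over users or requests and reported per stage, so that a drop in final ranking quality can be traced back to either a retrieval miss or a ranking error. Online experiment design and guardrail metrics depend on the product's logging and experimentation setup and are not covered by this sketch.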