

A

You are responsible for a model that is already live in production. The team wants a clear way to judge whether it is still performing well, whether the current threshold is right, and how to compare offline results with what is happening on real traffic.
How do you handle model evaluation in production?
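
One way to make the scenario concrete is to compute the same metrics at the same threshold on the offline evaluation set and on a labeled sample of production traffic, then look at the gap. The sketch below is illustrative only: it assumes a binary classifier that outputs scores, that delayed ground-truth labels are available for some production requests, and that precision and recall are the metrics the team cares about. The names `metrics_at_threshold`, `compare_offline_online`, and the `tolerance` value are hypothetical, not taken from any particular library.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Metrics:
    precision: float
    recall: float
    n: int  # number of labeled examples the metrics were computed on


def metrics_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Metrics:
    """Precision and recall when predicting positive for score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive and not label:
            fp += 1
        elif not predicted_positive and label:
            fn += 1
    # Fall back to 0.0 when a denominator is empty (assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(precision, recall, len(labels))


def compare_offline_online(
    offline: Tuple[Sequence[float], Sequence[int]],
    online: Tuple[Sequence[float], Sequence[int]],
    threshold: float,
    tolerance: float = 0.05,  # hypothetical acceptable drop, set per use case
) -> dict:
    """Compare offline metrics with metrics on labeled production traffic."""
    off = metrics_at_threshold(offline[0], offline[1], threshold)
    on = metrics_at_threshold(online[0], online[1], threshold)
    precision_gap = off.precision - on.precision
    recall_gap = off.recall - on.recall
    return {
        "offline": off,
        "online": on,
        "precision_gap": precision_gap,
        "recall_gap": recall_gap,
        # Flag only when production is worse than offline by more than tolerance.
        "degraded": precision_gap > tolerance or recall_gap > tolerance,
    }
```

Running `metrics_at_threshold` over a range of thresholds on the production sample gives a rough view of whether the current threshold still sits at an acceptable precision/recall trade-off. Small production samples make these estimates noisy, so the `n` field is worth checking before acting on a gap.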