You've trained a model and the team wants to know whether its current behavior points to overfitting or underfitting. You need a clear way to evaluate generalization, not just training performance.
How would you evaluate whether a model is overfitting or underfitting?
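One common starting point is to compare the score on the training data with the score on held-out validation data. The sketch below is only an illustration of that comparison. The function name `diagnose_fit` and the thresholds `min_acceptable` and `max_gap` are hypothetical and would need tuning for the task and metric. It assumes a metric where higher is better, such as accuracy or R².

```python
def diagnose_fit(train_score, val_score, min_acceptable=0.8, max_gap=0.1):
    """Rough fit diagnosis from training and validation scores.

    Assumes higher scores are better. Both thresholds are illustrative
    assumptions, not standard values.
    """
    gap = train_score - val_score

    # Poor training score: the model has not captured the signal even on
    # the data it was fit to.
    if train_score < min_acceptable:
        return "underfitting"

    # Good training score but a much worse validation score: the model
    # fits the training data and fails to generalize.
    if gap > max_gap:
        return "overfitting"

    return "reasonable fit"


if __name__ == "__main__":
    print(diagnose_fit(0.99, 0.70))
    print(diagnose_fit(0.60, 0.58))
    print(diagnose_fit(0.90, 0.88))
```

A single train/validation split can be noisy. Repeating the comparison across cross-validation folds, or across increasing training-set sizes (a learning curve), usually gives a steadier picture than one pair of numbers.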