You own a gradient-boosted regression model that predicts final sale price for a real estate marketplace, and the output is used to set the initial listing guidance shown to sellers. The model was validated offline before launch, but regional sales teams now report that the guidance is consistently off for high-value homes even though aggregate dashboard metrics still look acceptable. You are asked to review whether the model is actually performing well enough for production use and whether the current evaluation approach is hiding important failure modes.

| Metric | Validation Set | Last 30 Days Production |
|---|---|---|
| RMSE | $24,800 | $31,900 |
| MAE | $14,200 | $15,100 |
| Median Absolute Error | $9,100 | $9,400 |
| R-squared | 0.91 | 0.84 |
| Mean Error (Prediction - Actual) | +$1,200 | -$6,800 |
| MAPE | 6.8% | 8.9% |
| MAE for homes ≤ $500K | $11,300 | $11,900 |
| MAE for homes ≥ $1.5M | $52,000 | $96,000 |

How would you evaluate this regression model given these results, and what would you conclude about whether it is production-ready and what should be improved first?
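
As a starting point for the review, the table's rows can be recomputed per price band from raw predictions rather than read only in aggregate. The sketch below is a minimal illustration, not the marketplace's actual evaluation code: the function names `regression_report` and `report_by_price_band`, the band labels, and the toy prices are all hypothetical, and the band cutoffs assume the table's segments are "at or below $500K" and "at or above $1.5M".

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Compute the table's headline metrics for one slice of data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true            # same sign convention as the table
    abs_err = np.abs(err)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "n": int(y_true.size),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(abs_err.mean()),
        "median_ae": float(np.median(abs_err)),
        # R-squared is undefined when the slice has no variance in y_true
        "r2": float(1.0 - ss_res / ss_tot) if ss_tot > 0 else float("nan"),
        "mean_error": float(err.mean()),
        "mape_pct": float(100.0 * np.mean(abs_err / y_true)),
    }

def report_by_price_band(y_true, y_pred, low=500_000, high=1_500_000):
    """Report metrics overall and per price band (cutoffs are assumptions)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    bands = {
        "all": np.ones(y_true.shape, dtype=bool),
        "low": y_true <= low,
        "mid": (y_true > low) & (y_true < high),
        "high": y_true >= high,
    }
    # Skip empty bands so the means are always well defined
    return {
        name: regression_report(y_true[mask], y_pred[mask])
        for name, mask in bands.items()
        if mask.any()
    }

# Toy, made-up prices purely to exercise the functions
toy_true = [300_000, 450_000, 800_000, 2_000_000]
toy_pred = [310_000, 440_000, 790_000, 1_900_000]
toy_report = report_by_price_band(toy_true, toy_pred)
for band, metrics in toy_report.items():
    print(band, metrics["n"], round(metrics["mae"]), round(metrics["mean_error"]))
```

Reporting the sample count and the signed mean error next to MAE for each band makes it easier to see whether a segment is both small and systematically under- or over-predicted, which an aggregate MAE or median can mask.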