Business Context
RetailCorp, a mid-sized retail chain with 200 stores and $300M in annual revenue, wants to improve its sales forecasting accuracy in order to optimize inventory and reduce stockouts. The company has observed demand fluctuations driven by seasonality and by external factors such as promotions and economic indicators. The data science team is tasked with improving the forecasting model's performance through effective feature engineering.
Dataset
| Feature Group | Count | Examples |
|---|---|---|
| Historical Sales | 50 | daily_sales, returns |
| External Factors | 10 | promotions, holidays, competitor_price |
| Store Features | 5 | store_size, region, store_type |
| Temporal Features | 7 | day_of_week, month, quarter |
- Size: 50K daily records spanning 3 years, 72 features
- Target: Continuous (total sales for the next day)
- Class balance: N/A (regression task)
- Missing data: 10% missing in promotions and competitor_price features
Requirements
- Propose a feature engineering strategy to improve sales forecasting.
- Identify and implement at least three new features derived from existing data.
- Evaluate the impact of your feature engineering on model performance using RMSE and R-squared metrics.
- Discuss how you would handle missing data in the dataset.
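As a starting point for the requirements above, the following minimal sketch derives three candidate features and implements the two evaluation metrics using only the Python standard library. It assumes the input is a date-sorted list of records for a single store. The helper names (`add_features`, `rmse`, `r_squared`) and the derived feature names (`competitor_price_missing`, `sales_lag_7`, `sales_roll_mean_7`) are hypothetical and chosen for illustration. Forward-filling `competitor_price` is only one possible imputation choice, not a prescribed one.

```python
import math
from statistics import mean


def add_features(rows):
    """Derive three illustrative features for ONE store.

    Assumes `rows` is a list of dicts sorted by date, each with
    'daily_sales' (float) and 'competitor_price' (float or None).
    """
    out = []
    last_price = None
    for i, row in enumerate(rows):
        feat = dict(row)
        # Feature 1: flag the gap before imputing, so the model can
        # still see that the value was missing.
        feat["competitor_price_missing"] = int(row["competitor_price"] is None)
        # Imputation (assumption): carry the last observed price forward.
        if row["competitor_price"] is not None:
            last_price = row["competitor_price"]
        feat["competitor_price"] = last_price
        # Feature 2: sales seven days before the current row.
        feat["sales_lag_7"] = rows[i - 7]["daily_sales"] if i >= 7 else None
        # Feature 3: mean sales over the (up to) 7 days ending today.
        # Only past and current values are used, so the next-day target
        # does not leak into the feature.
        window = [r["daily_sales"] for r in rows[max(0, i - 6): i + 1]]
        feat["sales_roll_mean_7"] = mean(window)
        out.append(feat)
    return out


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data: 10 days of sales with one missing competitor price on day 3.
rows = [
    {"daily_sales": float(100 + d), "competitor_price": None if d == 3 else 9.99}
    for d in range(10)
]
feats = add_features(rows)
```

To measure the impact of the new features, one option is to train the same model with and without them and compare `rmse` and `r_squared` on a time-ordered holdout (for example, the most recent weeks), since a random split would let future information leak into training.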
Constraints
- The model must be retrained weekly with new sales data.
- Inference for daily sales predictions should be completed within 1 hour of data collection.
- The solution should be interpretable for stakeholders to understand the impact of features on sales predictions.