You’re a data engineer supporting the analytics team at a large e-commerce marketplace (~5M daily active users, ~10M order items/day). Finance uses daily GMV (gross merchandise value) to forecast revenue and detect demand shocks. However, daily GMV is noisy due to promotions and weekend effects, so stakeholders want a 7-day moving average to smooth the trend.
The source of truth is an orders table where each row is an order (not an item). Orders can be paid, refunded, or canceled. Only paid orders should contribute to GMV.
Write a SQL query that returns daily paid GMV and its 7-day trailing moving average (including the current day) for a given date range.
order_total_usd for orders where status = 'paid'.order_date (UTC date) within 2024-01-01 to 2024-01-10 (inclusive).gmv_7d_moving_avg as the average of daily_gmv_usd over the window current day and previous 6 days.order_date ascending.orders| column | type | description |
|---|---|---|
| order_id | BIGINT | Primary key |
| user_id | BIGINT | Buyer user id |
| order_ts | TIMESTAMP | Order creation timestamp in UTC |
| status | VARCHAR(20) | Order status: paid, refunded, canceled |
| order_total_usd | DECIMAL(10,2) | Total order amount in USD |
orders| order_id | user_id | order_ts | status | order_total_usd |
|---|---|---|---|---|
| 9001 | 101 | 2024-01-01 10:15:00 | paid | 120.00 |
| 9002 | 102 | 2024-01-02 08:05:00 | paid | 80.00 |
| 9003 | 103 | 2024-01-03 12:30:00 | canceled | 200.00 |
| 9004 | 101 | 2024-01-03 18:10:00 | paid | 100.00 |
| 9005 | 104 | 2024-01-04 09:00:00 | paid | 60.00 |
| order_date | daily_gmv_usd | gmv_7d_moving_avg |
|---|---|---|
| 2024-01-01 | 120.00 | 120.00 |
| 2024-01-02 | 80.00 | 100.00 |
| 2024-01-03 | 100.00 | 100.00 |
| 2024-01-04 | 60.00 | 90.00 |
Note: In real data, you may have multiple paid orders per day; the query should aggregate them before computing the moving average.