Business Context
You’re a data engineer supporting the analytics team at a large e-commerce marketplace (~5M daily active users, ~10M order items/day). Finance uses daily GMV (gross merchandise value) to forecast revenue and detect demand shocks. However, daily GMV is noisy due to promotions and weekend effects, so stakeholders want a 7-day moving average to smooth the trend.
The source of truth is an orders table where each row is an order (not an item). Orders can be paid, refunded, or canceled. Only paid orders should contribute to GMV.
Task
Write a SQL query that returns daily paid GMV and its 7-day trailing moving average (including the current day) for a given date range.
Requirements
- Aggregate daily GMV as the sum of
order_total_usd for orders where status = 'paid'.
- Return one row per
order_date (UTC date) within 2024-01-01 to 2024-01-10 (inclusive).
- Compute
gmv_7d_moving_avg as the average of daily_gmv_usd over the window current day and previous 6 days.
- If fewer than 7 prior days exist (early in the range), compute the average over the available days.
- Order results by
order_date ascending.
Table Definitions
orders
| column | type | description |
|---|
| order_id | BIGINT | Primary key |
| user_id | BIGINT | Buyer user id |
| order_ts | TIMESTAMP | Order creation timestamp in UTC |
| status | VARCHAR(20) | Order status: paid, refunded, canceled |
| order_total_usd | DECIMAL(10,2) | Total order amount in USD |
Sample Data
orders
| order_id | user_id | order_ts | status | order_total_usd |
|---|
| 9001 | 101 | 2024-01-01 10:15:00 | paid | 120.00 |
| 9002 | 102 | 2024-01-02 08:05:00 | paid | 80.00 |
| 9003 | 103 | 2024-01-03 12:30:00 | canceled | 200.00 |
| 9004 | 101 | 2024-01-03 18:10:00 | paid | 100.00 |
| 9005 | 104 | 2024-01-04 09:00:00 | paid | 60.00 |
Expected Output (for the sample data)
| order_date | daily_gmv_usd | gmv_7d_moving_avg |
|---|
| 2024-01-01 | 120.00 | 120.00 |
| 2024-01-02 | 80.00 | 100.00 |
| 2024-01-03 | 100.00 | 100.00 |
| 2024-01-04 | 60.00 | 90.00 |
Note: In real data, you may have multiple paid orders per day; the query should aggregate them before computing the moving average.