
Apple Retail transaction data supports reporting on sales, discounts, stores, products, and customer behavior. Interviewers ask this to evaluate whether you can translate operational retail data into an analytics-friendly warehouse design.
Explain how you would design a data warehouse schema for Apple Retail transactions. Your answer should cover:
Keep the discussion practical and warehouse-focused rather than application-focused. The interviewer expects a clear dimensional model, tradeoffs between star and snowflake approaches, and how the schema enables common SQL analysis such as revenue by store, product performance, and customer purchase trends.
The most important design decision is the grain of the fact table. For Apple Retail, a strong default is one row per transaction line item, because it supports product-level sales, discounts, quantities, and returns without losing detail.
SELECT store_key, product_key, SUM(net_sales_amount) AS revenue
FROM fact_sales
GROUP BY store_key, product_key;
Dimensions such as date, store, product, and customer should be reusable across multiple fact tables. This allows analysts to compare sales, returns, inventory, and promotions using the same business definitions.
SELECT d.calendar_month, p.product_family, SUM(f.net_sales_amount)
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.calendar_month, p.product_family;
A star schema keeps dimensions denormalized enough for straightforward joins and fast analytics. This is usually preferred for BI workloads because it reduces query complexity compared with highly normalized transactional schemas.
SELECT s.store_name, SUM(f.units_sold)
FROM fact_sales f
JOIN dim_store s ON f.store_key = s.store_key
GROUP BY s.store_name;
Some dimension attributes change over time, such as store region assignments or product pricing bands. Type 2 slowly changing dimensions preserve historical context by creating a new dimension row when tracked attributes change.
SELECT product_key, product_sku, effective_start_date, effective_end_date
FROM dim_product;
Sales, returns, inventory snapshots, and employee performance may represent different business processes and should often be modeled as separate fact tables. This avoids mixing incompatible grains and makes metrics more reliable.
WITH monthly_sales AS (
SELECT date_key, store_key, SUM(net_sales_amount) AS revenue
FROM fact_sales
GROUP BY date_key, store_key
)
SELECT *
FROM monthly_sales;