Meta data engineers often model analytics data for products like Facebook Feed, Instagram Reels, and Ads Manager. The schema choice affects query simplicity, storage efficiency, and how easy it is to maintain dimensions over time.
Compare star schema and snowflake schema in a warehouse or analytics environment. Explain:
Keep the answer practical. The interviewer is looking for a clear comparison, not just textbook definitions. Discuss how the choice affects SQL query complexity, BI/reporting use cases, and dimension management at scale. Where helpful, reference fact tables, dimension tables, and common analytical joins.
A star schema has a central fact table joined directly to denormalized dimension tables, so every descriptive attribute is one join away from the fact table. That suits analytics: queries are usually simpler and need fewer joins. For example, total impressions by device type:
SELECT d.device_type, SUM(f.impressions) AS total_impressions
FROM ads_delivery_fact f
JOIN device_dim d ON f.device_key = d.device_key
GROUP BY d.device_type;
A snowflake schema normalizes dimensions into multiple related tables, such as splitting geography into city, state, and country tables. This reduces redundancy but increases join depth and query complexity. Reaching the country name from the fact table takes three joins:
SELECT c.country_name, SUM(f.impressions) AS total_impressions
FROM ads_delivery_fact f
JOIN city_dim ci ON f.city_key = ci.city_key
JOIN state_dim s ON ci.state_key = s.state_key
JOIN country_dim c ON s.country_key = c.country_key
GROUP BY c.country_name;
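To make the join-depth difference concrete, here is a minimal sketch that runs the same country rollup against both layouts in an in-memory SQLite database. The flattened `geo_dim` table, the column names beyond those in the queries above, and all sample rows are assumptions made for illustration; they are not a real warehouse schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Snowflake layout: geography normalized into three tables.
    CREATE TABLE country_dim (country_key INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE state_dim (state_key INTEGER PRIMARY KEY, state_name TEXT, country_key INTEGER);
    CREATE TABLE city_dim (city_key INTEGER PRIMARY KEY, city_name TEXT, state_key INTEGER);

    -- Star layout: one flattened dimension (hypothetical name) with the same attributes.
    CREATE TABLE geo_dim (city_key INTEGER PRIMARY KEY, city_name TEXT,
                          state_name TEXT, country_name TEXT);

    CREATE TABLE ads_delivery_fact (city_key INTEGER, impressions INTEGER);

    -- Made-up sample data.
    INSERT INTO country_dim VALUES (1, 'United States'), (2, 'Canada');
    INSERT INTO state_dim VALUES (10, 'California', 1), (11, 'Texas', 1), (12, 'Ontario', 2);
    INSERT INTO city_dim VALUES (100, 'San Jose', 10), (101, 'Austin', 11), (102, 'Toronto', 12);
    INSERT INTO ads_delivery_fact VALUES (100, 500), (101, 300), (102, 200), (100, 50);

    -- Build the flattened dimension once, from the normalized tables.
    INSERT INTO geo_dim
    SELECT ci.city_key, ci.city_name, s.state_name, c.country_name
    FROM city_dim ci
    JOIN state_dim s ON ci.state_key = s.state_key
    JOIN country_dim c ON s.country_key = c.country_key;
""")

# Snowflake: three joins to reach country_name.
SNOWFLAKE_SQL = """
    SELECT c.country_name, SUM(f.impressions) AS total_impressions
    FROM ads_delivery_fact f
    JOIN city_dim ci ON f.city_key = ci.city_key
    JOIN state_dim s ON ci.state_key = s.state_key
    JOIN country_dim c ON s.country_key = c.country_key
    GROUP BY c.country_name
    ORDER BY c.country_name;
"""

# Star: one join, because country_name lives on the flattened dimension.
STAR_SQL = """
    SELECT g.country_name, SUM(f.impressions) AS total_impressions
    FROM ads_delivery_fact f
    JOIN geo_dim g ON f.city_key = g.city_key
    GROUP BY g.country_name
    ORDER BY g.country_name;
"""

snowflake_rows = conn.execute(SNOWFLAKE_SQL).fetchall()
star_rows = conn.execute(STAR_SQL).fetchall()
print(snowflake_rows)
print(star_rows)
```

Both queries return the same totals. What changes is how many joins the analyst writes and how many places the geography attributes are stored.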
Fact tables store measurable events such as impressions, clicks, or watch time, while dimension tables provide descriptive attributes like device, region, or campaign. The schema design determines how those attributes are organized and queried.
SELECT campaign_key, event_date, impressions, clicks
FROM ads_delivery_fact;
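Star schemas reduce the number of joins analysts must write, which improves readability and lowers the chance of query errors. Snowflake schemas can be easier to maintain centrally when dimension attributes are reused across many domains, because a shared attribute (a country name, for instance) is stored in one place rather than repeated on every dimension row.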
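The maintenance side of that trade-off can be sketched the same way. The example below renames a country in both layouts and counts the rows each update touches. The table shapes are simplified (the state level is omitted), and `geo_dim` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Snowflake layout (simplified to two levels for brevity).
    CREATE TABLE country_dim (country_key INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE city_dim (city_key INTEGER PRIMARY KEY, city_name TEXT, country_key INTEGER);

    -- Star layout: country_name is repeated on every city row.
    CREATE TABLE geo_dim (city_key INTEGER PRIMARY KEY, city_name TEXT, country_name TEXT);

    -- Made-up sample data.
    INSERT INTO country_dim VALUES (1, 'Czech Republic');
    INSERT INTO city_dim VALUES (100, 'Prague', 1), (101, 'Brno', 1), (102, 'Ostrava', 1);
    INSERT INTO geo_dim VALUES (100, 'Prague', 'Czech Republic'),
                               (101, 'Brno', 'Czech Republic'),
                               (102, 'Ostrava', 'Czech Republic');
""")

# Snowflake: the name is stored once, so the rename is a single-row update.
snowflake_touched = conn.execute(
    "UPDATE country_dim SET country_name = 'Czechia' WHERE country_key = 1"
).rowcount

# Star: every dimension row carrying the old name must be rewritten.
star_touched = conn.execute(
    "UPDATE geo_dim SET country_name = 'Czechia' WHERE country_name = 'Czech Republic'"
).rowcount

print(snowflake_touched, star_touched)
```

With three cities the gap is small. On a dimension with millions of rows, or one copied into several domain-specific star schemas, the denormalized update becomes a backfill that has to be coordinated across all of them.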
Star schemas intentionally denormalize dimensions for analytical convenience, while snowflake schemas normalize them to reduce duplication and enforce cleaner hierarchical relationships. Neither is universally better; the right choice depends on workload and governance needs.