

Data teams at companies like Spotify work with both relational tables and raw content such as text, images, and logs. Interviewers ask this question to check whether you understand how data shape affects storage, querying, and analytics workflows.
Explain the difference between structured and unstructured data. In your answer, cover:
Keep the answer practical and database-focused. You do not need to go deep into machine learning or distributed systems, but you should clearly explain why structured data is easier to query with SQL and why unstructured data often requires preprocessing, extraction, or transformation before analysis.
Structured data follows a predefined schema with clearly defined columns, data types, and relationships. It is typically stored in relational databases and is straightforward to query with SQL using filters, joins, and aggregations.
SELECT customer_id, SUM(order_amount) AS total_spend
FROM orders
GROUP BY customer_id;
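To see the query above run end to end, here is a minimal Python sketch using the standard-library `sqlite3` module. Only the aggregation query comes from the example above; the `order_id` column, the column types, and the sample rows are assumptions made for illustration, and SQLite stands in for whatever relational database you would actually use.

```python
import sqlite3


def total_spend_per_customer(rows):
    """Load (order_id, customer_id, order_amount) rows and sum spend per customer."""
    conn = sqlite3.connect(":memory:")
    # The schema is declared up front: every row must fit these typed columns.
    conn.execute(
        """
        CREATE TABLE orders (
            order_id     INTEGER PRIMARY KEY,
            customer_id  INTEGER NOT NULL,
            order_amount REAL    NOT NULL
        )
        """
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    # Same aggregation as the query above; ORDER BY makes the output deterministic.
    result = conn.execute(
        """
        SELECT customer_id, SUM(order_amount) AS total_spend
        FROM orders
        GROUP BY customer_id
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (order_id, customer_id, order_amount)
sample_orders = [(1, 101, 20.0), (2, 102, 15.5), (3, 101, 30.0)]
print(total_spend_per_customer(sample_orders))
```

Because the table has a fixed schema, no parsing step is needed before the aggregation: the database already knows which values are customer IDs and which are amounts.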
Unstructured data does not fit neatly into rows and columns and usually lacks a fixed schema. Common examples include free-form text, images, audio, video, PDFs, and raw application logs.
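The main difference is not whether data is valuable, but whether it is organized in a predictable format. Structured data can be queried directly with SQL, while unstructured data often needs parsing, labeling, or feature extraction before it becomes analytically useful. Semi-structured data such as JSON sits in between: in PostgreSQL, for example, the `->>` operator pulls a single field out of a JSON payload as text.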
SELECT event_id, payload->>'user_id' AS user_id
FROM raw_events;
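The same extraction can be sketched outside the database. The example below is a rough Python equivalent of `payload->>'user_id'`, assuming each payload arrives as a JSON string; the function name `extract_user_id` and the sample events are hypothetical. Like `->>`, it returns the value as text, and it returns `None` when the field is missing or the payload is not valid JSON.

```python
import json


def extract_user_id(payload_text):
    """Return user_id from a JSON payload as text, or None if it cannot be extracted."""
    try:
        payload = json.loads(payload_text)
    except json.JSONDecodeError:
        # The payload is not valid JSON, so there is no field to read.
        return None
    if not isinstance(payload, dict):
        # Valid JSON, but not an object (for example a list or a bare number).
        return None
    value = payload.get("user_id")
    return None if value is None else str(value)


# Hypothetical raw events: (event_id, payload)
raw_events = [
    (1, '{"user_id": 42, "action": "play"}'),
    (2, '{"action": "pause"}'),
    (3, "not json"),
]

rows = [(event_id, extract_user_id(payload)) for event_id, payload in raw_events]
print(rows)
```

The missing and malformed cases are the practical point: with loosely structured input, the extraction step has to decide what to do when a field is absent, whereas a typed column would have rejected the row at write time.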
Structured data is commonly stored in relational systems such as PostgreSQL, where constraints and types enforce consistency. Unstructured data is often stored in object storage, document stores, or data lakes, then processed into structured tables for reporting.
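The "process into structured tables" step can be sketched as a small pipeline. This is an illustration under stated assumptions, not a production design: the log line format, the `app_events` table, and the function names are all hypothetical, and SQLite stands in for a relational system such as PostgreSQL.

```python
import re
import sqlite3

# Assumed log format: "<timestamp> <LEVEL> user_id=<digits> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<level>INFO|WARN|ERROR) user_id=(?P<user_id>\d+) (?P<message>.+)$"
)


def load_logs(conn, raw_lines):
    """Parse raw log lines into a typed table; return the lines that did not parse."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS app_events (
            event_time TEXT    NOT NULL,
            level      TEXT    NOT NULL CHECK (level IN ('INFO', 'WARN', 'ERROR')),
            user_id    INTEGER NOT NULL,
            message    TEXT    NOT NULL
        )
        """
    )
    rejected = []
    for line in raw_lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            # Keep unparseable lines aside instead of silently dropping them.
            rejected.append(line)
            continue
        conn.execute(
            "INSERT INTO app_events VALUES (?, ?, ?, ?)",
            (match["ts"], match["level"], int(match["user_id"]), match["message"]),
        )
    return rejected


def events_per_user(conn):
    """Report query that only becomes possible once the logs are structured."""
    return conn.execute(
        "SELECT user_id, COUNT(*) FROM app_events GROUP BY user_id ORDER BY user_id"
    ).fetchall()


# Hypothetical raw log lines
raw_lines = [
    "2024-01-05T10:00:00Z INFO user_id=42 played track",
    "2024-01-05T10:00:05Z ERROR user_id=42 playback failed",
    "corrupted line",
    "2024-01-05T10:01:00Z WARN user_id=7 slow response",
]

conn = sqlite3.connect(":memory:")
rejected = load_logs(conn, raw_lines)
print(events_per_user(conn))
print(rejected)
```

Returning the rejected lines separately is a deliberate choice in this sketch: raw text rarely conforms perfectly, so a pipeline usually needs somewhere to put the input that fails to parse.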