
Query performance matters in production systems because slow SQL can increase latency, raise infrastructure costs, and block downstream workloads. Interviewers often ask this to assess whether you understand both query-writing habits and database behavior.
Explain how you optimize SQL queries for performance in PostgreSQL. Your answer should cover:
Keep the discussion at an easy, practical level. Focus on foundational techniques such as selecting only needed columns, filtering early, avoiding unnecessary work, using appropriate indexes, and validating changes with EXPLAIN or EXPLAIN ANALYZE. You do not need to cover advanced internals, but you should show a clear, structured approach to diagnosing and improving query speed.
The first step in optimization is understanding how PostgreSQL executes the query. EXPLAIN shows the planned operations, and EXPLAIN ANALYZE shows actual runtime behavior, helping you find expensive scans, sorts, and aggregations.
EXPLAIN ANALYZE
SELECT customer_id, SUM(amount)
FROM orders
WHERE order_date >= DATE '2024-01-01'
GROUP BY customer_id;
Indexes help PostgreSQL find rows faster without scanning the entire table. They are most useful on columns used frequently in WHERE, JOIN, and ORDER BY, but they add storage cost and can slow down inserts and updates.
CREATE INDEX idx_orders_order_date ON orders(order_date);
Queries often become slow because they process more data than needed. Selecting only required columns, filtering early, and avoiding unnecessary sorting or aggregation can reduce CPU, memory, and I/O usage.
SELECT customer_id, amount
FROM orders
WHERE status = 'completed';
How you write filters affects whether PostgreSQL can use indexes efficiently. Simple, direct predicates usually perform better than wrapping indexed columns in functions or using broad conditions that force full scans.
SELECT *
FROM orders
WHERE order_date >= DATE '2024-01-01'
AND order_date < DATE '2024-02-01';
Optimization is not guessing; it requires measurement. After making a change, compare execution plans and runtimes to confirm that the query actually improved and did not just shift cost elsewhere.
EXPLAIN ANALYZE
SELECT order_id, customer_id
FROM orders
WHERE status = 'completed';