





Slow queries affect application latency, reporting jobs, and database cost. In PostgreSQL interviews, this question tests whether you can diagnose performance issues methodically instead of guessing.
Explain how you would optimize a slow-running query in PostgreSQL. Your answer should cover:
EXPLAIN or EXPLAIN ANALYZEThe interviewer expects a practical, structured explanation rather than a list of random tips. Focus on a repeatable workflow: inspect the execution plan, identify the expensive step, choose an appropriate optimization, and verify the impact. Mention PostgreSQL-specific considerations such as sequential scans, index scans, join strategies, and the trade-offs of adding indexes.
The first step is to inspect how PostgreSQL actually executes the query. EXPLAIN ANALYZE shows estimated cost, actual timing, row counts, and whether the planner chose sequential scans, index scans, sorts, hash joins, or nested loops.
EXPLAIN ANALYZE
SELECT o.customer_id, SUM(o.total_amount)
FROM orders o
WHERE o.order_date >= DATE '2024-01-01'
GROUP BY o.customer_id;
Indexes help when queries frequently filter, join, or sort on specific columns. In PostgreSQL, the right index can reduce full-table scans, but unnecessary indexes increase write cost and storage usage.
CREATE INDEX idx_orders_order_date_customer_id
ON orders (order_date, customer_id);
A slow query is not always fixed by adding an index. Rewriting predicates, reducing selected columns, pre-aggregating data, or filtering earlier in a CTE or subquery can significantly reduce work.
WITH recent_orders AS (
SELECT customer_id, total_amount
FROM orders
WHERE order_date >= DATE '2024-01-01'
)
SELECT customer_id, SUM(total_amount)
FROM recent_orders
GROUP BY customer_id;
PostgreSQL may choose nested loop, hash join, or merge join depending on table size, indexes, and row estimates. Understanding why a join strategy was chosen helps determine whether the issue is missing indexes, stale statistics, or an inefficient query shape.
EXPLAIN ANALYZE
SELECT c.customer_name, o.order_id
FROM customers c
JOIN orders o
ON c.customer_id = o.customer_id;
Optimization is only successful if the query remains correct and measurably faster. You should compare execution time, buffers, row counts, and result sets before and after the change.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, SUM(total_amount)
FROM orders
WHERE order_date >= DATE '2024-01-01'
GROUP BY customer_id;