Interview Guides

Diagnose Heavy Database Query Load | Dataford Interview Questions - Dataford - Ace your Interview

Diagnose Heavy Database Query Load

Hard

SQL & Data Manipulation

JoinsAggregationsData Wrangling

Problem

Context

High database load is often caused by a small number of expensive queries, inefficient joins, or poor access patterns. In an engineering management interview, you are expected to explain both how you would diagnose the issue and how you would drive remediation.

Question

You are told that a transactional database backing American Express servicing and account activity workloads is under heavy load. Explain how you would identify which queries, users, applications, or jobs are generating the load, how you would use SQL and database metadata to isolate the worst offenders, and how you would optimize the system once the source is known.

Scope guidance

Focus on a practical, SQL-centered approach rather than generic infrastructure advice. The interviewer expects you to cover query-level diagnosis, execution-plan analysis, indexing and schema considerations, workload segmentation, and the trade-offs between quick fixes and durable improvements.

Key Concepts

Workload Attribution

The first step is to attribute load to specific query patterns, application users, services, or scheduled jobs. In practice, you group similar SQL statements, measure execution frequency and total resource usage, and identify whether the issue is driven by many small queries or a few very expensive ones.

SELECT queryid, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Execution Plan Analysis

An execution plan shows how PostgreSQL reads data, joins tables, and applies filters. You use EXPLAIN and EXPLAIN ANALYZE to determine whether the optimizer is choosing sequential scans, inefficient join strategies, or large sorts and hashes that drive CPU and I/O.

EXPLAIN ANALYZE
SELECT a.account_id, SUM(t.amount)
FROM account_activity t
JOIN accounts a ON a.account_id = t.account_id
WHERE t.posted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY a.account_id;

Index and Access Path Optimization

Many heavy queries come from filtering or joining on columns without appropriate indexes, or from indexes that do not match the actual predicates. Good indexing reduces scanned rows, but too many indexes also increase write cost, so the design must reflect the workload mix.

CREATE INDEX idx_account_activity_account_posted_at
ON account_activity (account_id, posted_at);

Query Shape Improvement

Sometimes the problem is not missing infrastructure but poor SQL shape: unnecessary DISTINCTs, non-sargable predicates, repeated scans, or joining large tables before filtering. Rewriting the query with CTEs, pre-aggregation, or better predicate placement can materially reduce work.

WITH recent_activity AS (
  SELECT account_id, amount
  FROM account_activity
  WHERE posted_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT account_id, SUM(amount)
FROM recent_activity
GROUP BY account_id;

Systemic Remediation

After fixing individual queries, you still need durable controls such as query timeouts, connection pooling, workload isolation, and monitoring dashboards. This prevents one application path or batch job from degrading the entire platform.

Problem

Context

Question

Scope guidance

Key Concepts

Workload Attribution

SELECT queryid, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Execution Plan Analysis

EXPLAIN ANALYZE
SELECT a.account_id, SUM(t.amount)
FROM account_activity t
JOIN accounts a ON a.account_id = t.account_id
WHERE t.posted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY a.account_id;

Index and Access Path Optimization

CREATE INDEX idx_account_activity_account_posted_at
ON account_activity (account_id, posted_at);

Query Shape Improvement

WITH recent_activity AS (
  SELECT account_id, amount
  FROM account_activity
  WHERE posted_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT account_id, SUM(amount)
FROM recent_activity
GROUP BY account_id;

Systemic Remediation

Your answer

Try one AI text evaluation on us

Get structured feedback, scored against a 4-axis rubric. Premium unlocks unlimited.

0 wordstarget ~200

Up next

Identify and Stop Heavy Database UsersMedium

Optimizing Large Analytical SQL QueriesHard AOptimizing Slow Queries at ScaleHard

Next question

Diagnose Heavy Database Query Load

Hard

SQL & Data Manipulation

JoinsAggregationsData Wrangling

Problem

Context

Question

Scope guidance

Key Concepts

Workload Attribution

SELECT queryid, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Execution Plan Analysis

EXPLAIN ANALYZE
SELECT a.account_id, SUM(t.amount)
FROM account_activity t
JOIN accounts a ON a.account_id = t.account_id
WHERE t.posted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY a.account_id;

Index and Access Path Optimization

CREATE INDEX idx_account_activity_account_posted_at
ON account_activity (account_id, posted_at);

Query Shape Improvement

WITH recent_activity AS (
  SELECT account_id, amount
  FROM account_activity
  WHERE posted_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT account_id, SUM(amount)
FROM recent_activity
GROUP BY account_id;

Systemic Remediation

Problem

Context

Question

Scope guidance

Key Concepts

Workload Attribution

SELECT queryid, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Execution Plan Analysis

EXPLAIN ANALYZE
SELECT a.account_id, SUM(t.amount)
FROM account_activity t
JOIN accounts a ON a.account_id = t.account_id
WHERE t.posted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY a.account_id;

Index and Access Path Optimization

CREATE INDEX idx_account_activity_account_posted_at
ON account_activity (account_id, posted_at);

Query Shape Improvement

WITH recent_activity AS (
  SELECT account_id, amount
  FROM account_activity
  WHERE posted_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT account_id, SUM(amount)
FROM recent_activity
GROUP BY account_id;

Systemic Remediation

Your answer

Try one AI text evaluation on us

Get structured feedback, scored against a 4-axis rubric. Premium unlocks unlimited.

0 wordstarget ~200

Up next

Identify and Stop Heavy Database UsersMedium

Optimizing Large Analytical SQL QueriesHard AOptimizing Slow Queries at ScaleHard

Next question