Interview Guides

Analyze Uber Supply with Joins | Dataford Interview Questions - Dataford - Ace your Interview

Analyze Uber Supply with Joins

Easy

SQL & Data Manipulation

Asked at 1 company1JoinsAggregationsData Wrangling

Also asked at

Problem

Context

At Uber, Operations Managers often need to evaluate supply performance across cities by combining trip activity with driver and city attributes. This requires joining a fact table with dimension tables in a way that preserves the right level of detail.

Core question

Explain how you would join trips, drivers, and cities tables to analyze supply performance in Uber. In your answer, discuss:

Which keys you would join on
Which join types you would choose and why
What supply metrics you would aggregate
How you would avoid common mistakes like double counting or dropping unmatched records

Scope guidance

The interviewer is looking for a practical SQL-oriented explanation, not just definitions of joins. You should describe the grain of each table, how to structure the query, and how to produce city-level supply metrics such as active drivers, completed trips, and utilization.

Key Concepts

Fact vs Dimension Tables

In this setup, trips is typically the fact table because it stores event-level ride records, while drivers and cities are dimension tables that provide descriptive attributes. Understanding table grain is critical because aggregations should be built from the fact table and enriched with dimensions only after confirming the join keys.

SELECT t.trip_id, d.driver_id, c.city_name
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id;

Choosing the Correct Join Type

An INNER JOIN keeps only rows with matches in both tables, while a LEFT JOIN preserves all rows from the left table. For supply analysis, you often start from trips if you only care about trip activity, but start from drivers with LEFT JOIN if you need to include drivers who had zero trips.

SELECT d.driver_id, c.city_name, t.trip_id
FROM drivers d
LEFT JOIN trips t ON d.driver_id = t.driver_id
LEFT JOIN cities c ON d.city_id = c.city_id;

Aggregation at the Right Grain

Supply metrics should be computed at a clear reporting grain, such as city-day or city-week. If you aggregate after joining tables with mismatched grain, you can inflate counts, so it is important to group by the reporting dimensions and use distinct counts when needed.

SELECT c.city_name, COUNT(DISTINCT t.driver_id) AS active_drivers, COUNT(*) AS trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Conditional Aggregation for Supply Metrics

Many operational metrics require counting subsets of rows, such as completed trips, canceled trips, or online drivers. CASE WHEN inside aggregate functions lets you compute multiple KPIs in one grouped query.

SELECT c.city_name,
       COUNT(*) AS total_trips,
       SUM(CASE WHEN t.status = 'completed' THEN 1 ELSE 0 END) AS completed_trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Avoiding Double Counting

Double counting happens when a join multiplies rows, usually because one side is not unique at the join key. Before joining, confirm whether drivers has one row per driver and cities has one row per city, and use COUNT(DISTINCT ...) when the metric is entity-based rather than row-based.

SELECT c.city_name, COUNT(DISTINCT d.driver_id) AS drivers_seen
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Problem

Context

Core question

Explain how you would join trips, drivers, and cities tables to analyze supply performance in Uber. In your answer, discuss:

Which keys you would join on
Which join types you would choose and why
What supply metrics you would aggregate
How you would avoid common mistakes like double counting or dropping unmatched records

Scope guidance

Key Concepts

Fact vs Dimension Tables

SELECT t.trip_id, d.driver_id, c.city_name
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id;

Choosing the Correct Join Type

SELECT d.driver_id, c.city_name, t.trip_id
FROM drivers d
LEFT JOIN trips t ON d.driver_id = t.driver_id
LEFT JOIN cities c ON d.city_id = c.city_id;

Aggregation at the Right Grain

SELECT c.city_name, COUNT(DISTINCT t.driver_id) AS active_drivers, COUNT(*) AS trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Conditional Aggregation for Supply Metrics

SELECT c.city_name,
       COUNT(*) AS total_trips,
       SUM(CASE WHEN t.status = 'completed' THEN 1 ELSE 0 END) AS completed_trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Avoiding Double Counting

SELECT c.city_name, COUNT(DISTINCT d.driver_id) AS drivers_seen
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Your answer

Try one AI text evaluation on us

Get structured feedback, scored against a 4-axis rubric. Premium unlocks unlimited.

0 wordstarget ~200

Up next

IJoining Orders, Shoppers, and RetailersEasy

Join vs Union in AnalyticsEasy

Troubleshooting Missing Rows with JoinsEasy

Next question

Analyze Uber Supply with Joins

Easy

SQL & Data Manipulation

Asked at 1 company1JoinsAggregationsData Wrangling

Also asked at

Problem

Context

Core question

Explain how you would join trips, drivers, and cities tables to analyze supply performance in Uber. In your answer, discuss:

Which keys you would join on
Which join types you would choose and why
What supply metrics you would aggregate
How you would avoid common mistakes like double counting or dropping unmatched records

Scope guidance

Key Concepts

Fact vs Dimension Tables

SELECT t.trip_id, d.driver_id, c.city_name
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id;

Choosing the Correct Join Type

SELECT d.driver_id, c.city_name, t.trip_id
FROM drivers d
LEFT JOIN trips t ON d.driver_id = t.driver_id
LEFT JOIN cities c ON d.city_id = c.city_id;

Aggregation at the Right Grain

SELECT c.city_name, COUNT(DISTINCT t.driver_id) AS active_drivers, COUNT(*) AS trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Conditional Aggregation for Supply Metrics

SELECT c.city_name,
       COUNT(*) AS total_trips,
       SUM(CASE WHEN t.status = 'completed' THEN 1 ELSE 0 END) AS completed_trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Avoiding Double Counting

SELECT c.city_name, COUNT(DISTINCT d.driver_id) AS drivers_seen
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Problem

Context

Core question

Explain how you would join trips, drivers, and cities tables to analyze supply performance in Uber. In your answer, discuss:

Which keys you would join on
Which join types you would choose and why
What supply metrics you would aggregate
How you would avoid common mistakes like double counting or dropping unmatched records

Scope guidance

Key Concepts

Fact vs Dimension Tables

SELECT t.trip_id, d.driver_id, c.city_name
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id;

Choosing the Correct Join Type

SELECT d.driver_id, c.city_name, t.trip_id
FROM drivers d
LEFT JOIN trips t ON d.driver_id = t.driver_id
LEFT JOIN cities c ON d.city_id = c.city_id;

Aggregation at the Right Grain

SELECT c.city_name, COUNT(DISTINCT t.driver_id) AS active_drivers, COUNT(*) AS trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Conditional Aggregation for Supply Metrics

SELECT c.city_name,
       COUNT(*) AS total_trips,
       SUM(CASE WHEN t.status = 'completed' THEN 1 ELSE 0 END) AS completed_trips
FROM trips t
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Avoiding Double Counting

SELECT c.city_name, COUNT(DISTINCT d.driver_id) AS drivers_seen
FROM trips t
JOIN drivers d ON t.driver_id = d.driver_id
JOIN cities c ON t.city_id = c.city_id
GROUP BY c.city_name;

Your answer

Try one AI text evaluation on us

Get structured feedback, scored against a 4-axis rubric. Premium unlocks unlimited.

0 wordstarget ~200

Up next

IJoining Orders, Shoppers, and RetailersEasy

Join vs Union in AnalyticsEasy

Troubleshooting Missing Rows with JoinsEasy

Next question