

At Didi Chuxing, product performance can be measured from several sources such as Didi App event logs, order transaction tables, and dashboard rollups. These sources often disagree because they differ in latency, granularity, and business logic.
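Explain how you would prioritize data sources when analyzing product performance. Your answer should cover how the metric definition drives the choice of source, the trade-off between freshness and reliability, how you would validate sources against each other, how you would handle event grain and duplicates, and how you would communicate the caveats of the number you report.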
The interviewer is not looking for a single “correct” source. They want a structured framework for choosing among sources, plus a practical explanation of how you would validate that choice with SQL checks and basic aggregations.
The right source depends on the metric definition. For example, ride requests may come from Didi App event logs, while completed GMV should usually come from finalized order or payment tables because they reflect business-confirmed transactions.
-- Daily count of ride-request events recorded by the app
SELECT event_date, COUNT(*) AS request_events
FROM app_event_logs
WHERE event_name = 'ride_request_submitted'
GROUP BY event_date
ORDER BY event_date;
Low-latency sources are useful for fast reads, but they may be incomplete or subject to late-arriving data. Curated tables usually refresh less often, but they are more reliable for decision-making and external reporting.
-- Freshness check: timestamp of the most recent event currently loaded
SELECT MAX(event_time) AS latest_event_time
FROM app_event_logs;
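The latest timestamp alone does not show whether earlier days are still filling in. A minimal sketch of a late-arrival check follows, assuming the log table also carries a load timestamp. The `ingested_at` column and the one-hour threshold are hypothetical, and the interval syntax follows standard SQL, so it may need adjusting for your engine.

```sql
-- Assumption: app_event_logs has a hypothetical ingested_at column (load timestamp).
-- The one-hour lateness threshold is illustrative only.
SELECT
    event_date,
    COUNT(*) AS total_events,
    -- Events that landed more than one hour after they occurred
    SUM(CASE WHEN ingested_at > event_time + INTERVAL '1' HOUR THEN 1 ELSE 0 END) AS late_events,
    MAX(ingested_at) AS last_ingested_at
FROM app_event_logs
GROUP BY event_date
ORDER BY event_date;
```

If recent dates show a high share of late events, numbers for those dates are better treated as directional until the table settles.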
You can use SQL aggregations to compare counts, sums, and null rates across sources over the same time window. This helps identify whether differences are expected due to business logic or indicate a data quality issue.
-- Daily completed orders and total fare from the order table
SELECT order_date, COUNT(*) AS completed_orders, SUM(fare_amount) AS total_fare
FROM trip_orders
WHERE order_status = 'completed'
GROUP BY order_date
ORDER BY order_date;
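To compare two sources over the same window, one option is to aggregate each to the same grain and join them by date. The sketch below assumes a dashboard rollup table exists. The names `dashboard_daily_rollup`, `metric_date`, and its `completed_orders` column are hypothetical. `FULL OUTER JOIN` is not available in every engine, so treat this as an illustration rather than a drop-in query.

```sql
-- Assumption: dashboard_daily_rollup(metric_date, completed_orders) is a hypothetical table.
WITH orders_daily AS (
    SELECT
        order_date,
        COUNT(*) AS completed_orders,
        -- Null check on the value column used for fare totals
        SUM(CASE WHEN fare_amount IS NULL THEN 1 ELSE 0 END) AS null_fare_rows
    FROM trip_orders
    WHERE order_status = 'completed'
    GROUP BY order_date
)
SELECT
    COALESCE(o.order_date, d.metric_date) AS report_date,
    o.completed_orders AS orders_from_table,
    d.completed_orders AS orders_from_dashboard,
    o.completed_orders - d.completed_orders AS count_gap,
    o.null_fare_rows
FROM orders_daily AS o
FULL OUTER JOIN dashboard_daily_rollup AS d
    ON o.order_date = d.metric_date
ORDER BY report_date;
```

A small, stable gap usually points to a difference in business logic, such as different status filters or time zones. A gap that appears suddenly or only on certain dates is more likely a data quality issue. Dates present in only one source show up with a null on the other side.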
Event logs may contain multiple records for one user action, as well as retries and instrumentation noise. Before using them for product performance analysis, confirm the unit of analysis and whether deduplication is required.
-- event_id values that appear more than once in the log
SELECT event_id, COUNT(*) AS duplicate_count
FROM app_event_logs
GROUP BY event_id
HAVING COUNT(*) > 1;
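Once duplicates are confirmed, the daily request count can be recomputed on deduplicated rows. The sketch below keeps the earliest row per `event_id`. Treating `event_id` as the deduplication key is an assumption, and it relies on window function support in your SQL engine.

```sql
-- Assumption: event_id uniquely identifies one logged event, so repeated
-- event_id values are duplicates rather than distinct user actions.
WITH ranked AS (
    SELECT
        event_id,
        event_date,
        ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY event_time) AS rn
    FROM app_event_logs
    WHERE event_name = 'ride_request_submitted'
)
SELECT
    event_date,
    COUNT(*) AS deduplicated_request_events
FROM ranked
WHERE rn = 1  -- keep only the earliest row for each event_id
GROUP BY event_date
ORDER BY event_date;
```

This only removes exact re-deliveries of the same event. If a client retry generates a new `event_id`, you would need a business-level key instead, which depends on fields the logging schema actually provides.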
Choosing a source is not just a technical decision. You should explain what the source includes, what it excludes, how fresh it is, and whether the number is directional or finance-grade.