Data quality issues often show up as duplicate rows and missing values. In analytics and operational systems, both can distort counts, aggregations, and downstream reporting if they are not handled carefully.
Explain how you would handle duplicate records and NULL values in a dataset using SQL. Your answer should cover the trade-offs between deleting data, deduplicating in queries, and preventing bad data at ingestion.
The interviewer expects a practical explanation, not just definitions. Use simple SQL examples and mention common mistakes, especially around COUNT, GROUP BY, and comparisons with NULL.
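One way to ground an answer is a small runnable sketch. The example below is not a reference solution. It uses Python's built-in sqlite3 module only as a convenient way to execute the SQL, and the `customers` table, its columns, its rows, and the index name `ux_customers_email` are all hypothetical. It assumes SQLite 3.25 or newer for `ROW_NUMBER()`, and some of the NULL behaviour shown (for example, a UNIQUE index allowing several NULLs) differs in other database engines.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, email TEXT, city TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com', 'Oslo'),
        (2, 'a@example.com', 'Oslo'),
        (3, 'b@example.com', NULL),
        (4, NULL,            NULL);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# COUNT(*) counts rows, COUNT(col) skips NULLs,
# and COUNT(DISTINCT col) also collapses duplicates.
total_rows = scalar("SELECT COUNT(*) FROM customers")                    # 4
non_null_emails = scalar("SELECT COUNT(email) FROM customers")           # 3
distinct_emails = scalar("SELECT COUNT(DISTINCT email) FROM customers")  # 2

# `city = NULL` evaluates to NULL (unknown), so it matches no rows.
# IS NULL is the correct test.
eq_null = scalar("SELECT COUNT(*) FROM customers WHERE city = NULL")     # 0
is_null = scalar("SELECT COUNT(*) FROM customers WHERE city IS NULL")    # 2

# GROUP BY places all NULLs into a single group.
groups = dict(conn.execute(
    "SELECT city, COUNT(*) FROM customers GROUP BY city"
).fetchall())

# Option 1: deduplicate at query time, leaving the stored data untouched.
# Keeps the lowest id per email; the NULL email forms its own partition.
deduped = conn.execute("""
    SELECT id, email FROM (
        SELECT id, email,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
        FROM customers
    )
    WHERE rn = 1
    ORDER BY id
""").fetchall()

# Option 2: delete duplicates permanently (destructive, so back up first).
conn.execute("""
    DELETE FROM customers
    WHERE id NOT IN (SELECT MIN(id) FROM customers GROUP BY email)
""")
remaining_ids = [row[0] for row in
                 conn.execute("SELECT id FROM customers ORDER BY id")]

# Option 3: prevent new duplicates at ingestion with a UNIQUE index.
conn.execute("CREATE UNIQUE INDEX ux_customers_email ON customers(email)")
try:
    conn.execute(
        "INSERT INTO customers VALUES (5, 'a@example.com', 'Rome')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The three options map onto the trade-offs in the question: query-time deduplication is reversible but has to be repeated in every query, deletion is permanent, and a constraint stops duplicates from arriving but rejects the offending inserts, which the loading process then has to handle.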