
You inherit a messy spreadsheet or reporting file with inconsistent formats, blank cells, duplicate rows, and unclear column meanings. Before you can trust it for analysis, you need a quick but disciplined way to assess its quality.
How do you approach cleaning and validating the file so the final output is reliable? In your answer, explain how you identify obvious data issues, decide what to standardize versus flag for review, and verify that the cleaned data still matches the original totals or counts where appropriate.
Keep your answer practical and SQL-oriented. Focus on the checks you would run, the kinds of transformations you would apply, and how you would confirm the file is ready for reporting without silently changing the underlying meaning of the data.