Handling Missing Data in Pipelines

Medium · Pipelines

Problem

Scenario

You're building a training data pipeline and need a consistent way to deal with incomplete records before they reach downstream models. Some fields are optional, some are critical, and missing values can come from source gaps, late data, or parsing failures.

Question

How would you handle missing data in a dataset?
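
One way to make the scenario concrete is a small sketch that treats critical and optional fields differently. Everything named here is an assumption for illustration and not part of the problem statement: the field names, the default values, and the choice to reject records with missing critical fields instead of imputing them. A real answer would also need to cover late data and parsing failures, which this sketch does not attempt.

```python
from typing import Any, Iterable

# Hypothetical field policy; the problem statement does not name any fields.
CRITICAL_FIELDS = ("user_id", "label")
OPTIONAL_DEFAULTS = {"country": "unknown", "session_count": 0}


def is_missing(value: Any) -> bool:
    """Treat None, blank strings, and NaN floats as missing."""
    if value is None:
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    if isinstance(value, float) and value != value:  # NaN never equals itself
        return True
    return False


def clean_records(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into cleaned rows and rejected rows.

    Records missing any critical field are rejected along with the reason.
    Missing optional fields get a default plus a "<field>_was_missing" flag,
    so a downstream model can still see that the value was imputed.
    """
    kept, rejected = [], []
    for record in records:
        missing_critical = [f for f in CRITICAL_FIELDS if is_missing(record.get(f))]
        if missing_critical:
            rejected.append({"record": record, "missing": missing_critical})
            continue
        cleaned = dict(record)  # copy so the input record is not mutated
        for field, default in OPTIONAL_DEFAULTS.items():
            was_missing = is_missing(record.get(field))
            cleaned[f"{field}_was_missing"] = was_missing
            if was_missing:
                cleaned[field] = default
        kept.append(cleaned)
    return kept, rejected
```

Keeping the rejected records, with the list of fields that were missing, makes it possible to measure how often each source produces gaps before deciding whether dropping, imputing, or fixing upstream is the better policy.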