Product Context
LabelFlow is a managed data labeling platform used by enterprise ML teams to annotate text, images, audio, video, and document/PDF data. Customers upload tasks, and the platform must route each item to the right annotator queue, pre-label model, and quality-control workflow while keeping turnaround time and cost low.
Scale
| Signal | Value |
|---|---|
| Enterprise customers | 4,000 |
| Daily active annotators | 120,000 |
| Tasks created per day | 45M |
| Peak task-ingest QPS | 18K |
| Peak labeling-assignment QPS | 35K |
| Active task backlog | 600M items |
| Input modalities | text, image, audio, video, PDF/document |
| p99 latency budget for assignment API | 250ms |
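A quick back-of-envelope check on the Scale table shows how bursty ingest is relative to the daily average:

```python
# Burstiness check from the Scale table numbers.
SECONDS_PER_DAY = 86_400
tasks_per_day = 45_000_000
peak_ingest_qps = 18_000

avg_ingest_qps = tasks_per_day / SECONDS_PER_DAY
peak_to_avg = peak_ingest_qps / avg_ingest_qps

print(f"average ingest QPS ~ {avg_ingest_qps:.0f}")  # ~ 521
print(f"peak/average ratio ~ {peak_to_avg:.1f}x")    # ~ 34.6x
```

A roughly 35x peak-to-average ratio argues for elastic ingest capacity and a queue that absorbs spikes, rather than provisioning synchronous inference for peak load.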
Task
Design an end-to-end ML system for a multimodal labeling platform. Your design should address:
- How to represent heterogeneous inputs and build a multi-stage decision system for task routing, annotator matching, and optional pre-label generation.
- The online and offline architecture, including feature stores, model training, batch vs real-time inference, and feedback logging.
- How to support cold-start for new customers, new task types, and new annotators while meeting latency and cost constraints.
- How to evaluate the system offline and online across quality, throughput, cost, and fairness.
- The main failure modes, especially feature drift, training-serving skew, and quality regressions by modality or customer segment.
Constraints
- Some customers require data residency and cannot send raw data across regions.
- Human-in-the-loop quality is the product: misrouting drives up rework cost and causes SLA misses.
- Raw video/audio inference is expensive; heavy multimodal models cannot run synchronously on every request.
- New task schemas appear weekly, and label taxonomies can differ by customer.
- The platform must provide auditable assignment decisions for enterprise compliance.
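The auditability constraint suggests persisting every assignment as an immutable, self-describing record. A minimal sketch, in which the field names and the checksum scheme are assumptions:

```python
import hashlib
import json
import time

def audit_record(task_id: str, decision: dict, model_versions: dict) -> dict:
    """Build an append-only audit entry for one assignment decision.

    The SHA-256 checksum over the canonicalized body lets a compliance
    reviewer verify the record was not altered after it was written.
    """
    body = {
        "task_id": task_id,
        "decision": decision,               # queue, annotator, prelabel source
        "model_versions": model_versions,   # which models/rules fired, and at what version
        "ts": time.time(),
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["checksum"] = hashlib.sha256(payload).hexdigest()
    return body

rec = audit_record("t-1", {"queue": "image:acme"}, {"router": "v12"})
print(rec["task_id"], len(rec["checksum"]))  # t-1 64
```

Logging the model versions alongside the decision also gives the offline pipeline exactly what it needs to detect training-serving skew: the features and versions that actually served the request.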