Business Context
ApexBank wants a central orchestration layer for enterprise LLM applications used by support, legal, and internal search teams. The platform must route each incoming request to the right model and toolchain while enforcing cost, latency, and compliance constraints.
Data Characteristics
- Volume: ~2M historical prompts and responses, plus 80K new requests per day
- Input types: user prompts, conversation history, metadata, retrieval context, tool outputs
- Text length: 10-2,000 tokens per request; median 220 tokens
- Language: primarily English; roughly 12% of traffic is in other languages
- Labels available: routing target, task type, escalation outcome, user feedback, policy violations
- Class distribution: highly imbalanced; FAQ/search tasks dominate, while legal review and high-risk compliance drafting are rare
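To make the data characteristics above concrete, the following sketch models one request record with the listed inputs and labels as a dataclass. All field names are illustrative assumptions, not a mandated schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RoutingRequest:
    """One incoming request; labels are present only on historical data."""
    prompt: str
    history: list = field(default_factory=list)            # prior conversation turns
    metadata: dict = field(default_factory=dict)           # e.g. team, channel, user tier
    retrieval_context: list = field(default_factory=list)  # retrieved documents
    tool_outputs: list = field(default_factory=list)
    # Labels (historical data only)
    routing_target: Optional[str] = None   # e.g. "small", "large", "legal-domain"
    task_type: Optional[str] = None        # e.g. "faq", "legal_review"
    escalation_outcome: Optional[str] = None
    user_feedback: Optional[int] = None
    policy_violation: bool = False
```

Keeping labels optional on the same record type lets the identical schema serve both training (labeled history) and live inference (unlabeled traffic).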
Success Criteria
A good solution should:
- Achieve at least 90% routing accuracy on known task classes
- Reduce average inference cost by 25% versus always calling the largest model
- Keep p95 end-to-end latency under 2 seconds for standard requests
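The three success criteria can be scored directly from routing logs. This is a minimal sketch assuming each logged record carries illustrative fields `pred`, `true`, `cost`, and `latency_ms`, and that `largest_model_cost` is the average per-request cost of always calling the largest model:

```python
import math

def evaluate_routing(records, largest_model_cost):
    """Return (routing accuracy, cost reduction vs. largest-model baseline,
    p95 latency in ms) for a sample of routed traffic."""
    records = list(records)
    n = len(records)
    accuracy = sum(r["pred"] == r["true"] for r in records) / n
    avg_cost = sum(r["cost"] for r in records) / n
    cost_reduction = 1.0 - avg_cost / largest_model_cost
    # Nearest-rank p95: smallest latency >= 95% of the sample
    latencies = sorted(r["latency_ms"] for r in records)
    p95_latency = latencies[min(n - 1, math.ceil(0.95 * n) - 1)]
    return accuracy, cost_reduction, p95_latency
```

Run against a held-out slice of the ~2M historical requests, these numbers map one-to-one onto the 90% / 25% / 2s targets.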
Constraints
- No sensitive data may leave the approved VPC
- All prompts and outputs must be logged for auditability
- High-risk requests must be sent to approved models only
- The system must degrade gracefully if a model or retrieval service is unavailable
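The last two constraints interact: a policy gate must hold even when backends are down. A minimal sketch of that selection logic, with hypothetical model names and fallback actions (the approved-model set and `queue_for_human_review` outcome are assumptions for illustration):

```python
# Illustrative allow-list; in practice this comes from a governed config store.
APPROVED_HIGH_RISK_MODELS = {"legal-llm-vpc", "compliance-llm-vpc"}

def select_backend(risk: str, preferred: str, available: set) -> str:
    """Pick a backend that satisfies the policy constraints above.

    High-risk traffic may only reach approved models; if none are up,
    degrade gracefully to human review rather than routing off-policy.
    """
    if risk == "high":
        candidates = sorted(m for m in APPROVED_HIGH_RISK_MODELS if m in available)
        return candidates[0] if candidates else "queue_for_human_review"
    if preferred in available:
        return preferred
    # Preferred backend is down: fall back to a cheaper model if possible.
    return "small-fallback" if "small-fallback" in available else "queue_for_human_review"
```

Note the ordering: the risk check runs before any availability-based fallback, so an outage can never cause a high-risk request to leak to an unapproved model.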
Requirements
- Design an NLP pipeline that classifies each request by intent, risk, and orchestration path.
- Describe preprocessing for prompts, chat history, metadata, and retrieved documents.
- Build a routing model that selects among small, large, and domain-specific LLM backends.
- Include fallback logic, policy checks, and retrieval augmentation in the orchestration flow.
- Provide a Python implementation for preprocessing, training, inference, and evaluation.
- Explain how you would monitor drift, routing quality, latency, and policy failures in production.
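As a starting point for the preprocessing and routing-model requirements, here is a deliberately small baseline sketch: flatten prompt plus recent history into tokens, then route with a multinomial Naive Bayes classifier. This is a stdlib-only illustration of the pipeline shape, not the production model (which would likely use learned embeddings and the full label set):

```python
import math
import re
from collections import Counter, defaultdict

def preprocess(prompt, history=(), max_history_turns=3):
    """Flatten prompt + recent chat history into lowercase tokens.
    Truncating history bounds input length for long conversations."""
    text = " ".join(list(history)[-max_history_turns:] + [prompt])
    return re.findall(r"[a-z0-9]+", text.lower())

class NaiveBayesRouter:
    """Multinomial Naive Bayes over bag-of-words features: a cheap
    intent/route baseline that handles class imbalance via priors."""

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_lp = None, float("-inf")
        for label, count in self.class_counts.items():
            lp = math.log(count / total)  # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:  # Laplace-smoothed token likelihoods
                lp += math.log((self.word_counts[label][t] + 1) / denom)
            if lp > best_lp:
                best_label, best_lp = label, lp
        return best_label
```

The same `preprocess` function would feed both training on the historical corpus and live inference, keeping the two paths consistent, which is also where drift monitoring hooks in: log the token distributions it emits and compare them over time.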