Design Voya Guidance Recommendation Engine

Product Context

Design an AI architecture for Voya Learn, myVoyage, and related Voya Financial digital surfaces that recommends the next best piece of guidance, educational content, calculator, or product action for retirement plan participants and individual customers. The goal is to improve engagement and help users take relevant financial actions while meeting strict compliance and latency requirements.

Scale

Signal	Value
DAU across digital surfaces	4.5M
Peak recommendation QPS	18K
Monthly active users	16M
Content + action catalog	1.2M items
New/updated items per day	25K
Per-request latency budget (p99)	250ms
Logged user events per day	180M
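
The figures above imply a few derived rates that are useful for sizing. The following is a back-of-the-envelope sketch only: the variable names are illustrative, and the averages assume load spread evenly across the day, which real traffic will not follow. The 15-minute window comes from the freshness constraint stated in this prompt.

```python
# Back-of-the-envelope rates derived from the scale table.
# Assumption: uniform load over 24 hours (real traffic is burstier).
SECONDS_PER_DAY = 86_400

dau = 4_500_000
mau = 16_000_000
catalog_items = 1_200_000
item_updates_per_day = 25_000
events_per_day = 180_000_000

avg_events_per_sec = events_per_day / SECONDS_PER_DAY    # event-ingestion rate
events_per_dau = events_per_day / dau                    # history depth per active user
dau_to_mau = dau / mau                                   # share of monthly users active daily
daily_catalog_churn = item_updates_per_day / catalog_items
updates_per_15_min = item_updates_per_day / (24 * 4)     # items per freshness window

print(f"avg events/sec:        {avg_events_per_sec:,.0f}")
print(f"events per DAU:        {events_per_dau:.0f}")
print(f"DAU/MAU:               {dau_to_mau:.1%}")
print(f"daily catalog churn:   {daily_catalog_churn:.1%}")
print(f"item updates / 15 min: {updates_per_15_min:.0f}")
```

Roughly 40 events per daily active user and a DAU/MAU ratio near 28% suggest that many users will have sparse recent histories, which is relevant to the cold-start question in the task list.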

Task

Clarify the product objective, user segments, and what an "effective AI architecture" means for this recommendation problem.
Design an end-to-end ML system from data ingestion through candidate retrieval, ranking, re-ranking, and serving.
Choose models for each stage and explain tradeoffs across quality, interpretability, freshness, and cost.
Define the online vs batch feature strategy, including how you would handle cold-start users, sparse histories, and newly published content.
Propose an evaluation plan covering offline metrics, online experiments, compliance guardrails, and monitoring for drift and training-serving skew.
Identify likely failure modes and how the system should degrade safely.
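
As one illustration of the "degrade safely" item, a serving path might wrap the personalized ranker and fall back to a precomputed list when it fails or returns nothing. This is a minimal sketch under stated assumptions: `recommend_with_fallback`, `personalized`, and `fallback_items` are hypothetical names, and the fallback list is assumed to have already been filtered for eligibility and compliance.

```python
from typing import Callable, Sequence


def recommend_with_fallback(
    personalized: Callable[[str], Sequence[str]],
    fallback_items: Sequence[str],
    user_id: str,
    k: int = 5,
) -> tuple[list[str], str]:
    """Return (item_ids, source), where source names the path that served."""
    try:
        items = list(personalized(user_id))[:k]
        if items:
            return items, "personalized"
    except Exception:
        # Any ranker failure (timeout, missing features, model error) degrades
        # to the fallback list instead of failing the whole request.
        pass
    return list(fallback_items)[:k], "fallback"


def broken_ranker(user_id: str) -> Sequence[str]:
    raise TimeoutError("ranker exceeded its latency budget")


print(recommend_with_fallback(broken_ranker, ["401k-basics", "budget-calculator"], "u1"))
```

Returning the source alongside the items lets monitoring track how often the fallback path is serving traffic.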

Constraints

Recommendations must exclude ineligible products or actions based on plan rules, customer profile, and compliance policies.
Some user features (income, contribution rate, risk profile) are sensitive and require strict access control and auditability.
Freshness matters: newly published guidance content should become eligible within 15 minutes.
The system must support both personalized logged-in traffic and partially personalized sessions with limited identity.
Cost matters: GPU-heavy serving is allowed only for a small final-stage model, not broad candidate scoring.
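
The eligibility constraint can be treated as a hard filter applied before any scoring, rather than as a signal the ranker learns. A minimal sketch, assuming hypothetical `Item` and `UserContext` types and a deliberately simple rule set (real plan rules and compliance policies would be richer):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    required_plan_features: frozenset = frozenset()
    compliance_approved: bool = False  # fail closed: unapproved by default


@dataclass(frozen=True)
class UserContext:
    plan_features: frozenset  # empty for sessions with limited identity


def is_eligible(item: Item, user: UserContext) -> bool:
    """Hard eligibility check; every rule must pass."""
    if not item.compliance_approved:
        return False
    # The item is eligible only if the user's plan offers every required feature.
    return item.required_plan_features <= user.plan_features


def filter_eligible(candidates: list, user: UserContext) -> list:
    """Drop ineligible candidates before ranking, preserving order."""
    return [c for c in candidates if is_eligible(c, user)]


catalog = [
    Item("retirement-basics", compliance_approved=True),
    Item("roth-conversion", frozenset({"roth"}), compliance_approved=True),
    Item("draft-article"),
]
logged_in = UserContext(plan_features=frozenset({"roth"}))
limited = UserContext(plan_features=frozenset())

print([i.item_id for i in filter_eligible(catalog, logged_in)])
print([i.item_id for i in filter_eligible(catalog, limited)])
```

With these rules, a limited-identity session sees only approved items that require no plan features, so missing identity information reduces what is shown instead of exposing ineligible actions.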
