Design Voya Content Recommendation Platform

Product Context

Voya Financial wants to personalize educational content, plan insights, and next-best actions inside the Voya Retire experience for retirement plan participants. Design an end-to-end ML system that ranks the most relevant cards or recommendations when a user opens the app or web dashboard.

Scale

Signal	Value
Monthly active users	6M
Daily active users	900K
Peak recommendation QPS	2,500
Eligible content/action catalog	1.2M items
New/updated items per day	25K
Homepage recommendations per request	top 10
End-to-end p99 latency budget	180ms
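
A quick back-of-envelope check on these figures helps show why scoring the whole catalog per request is impractical. The sketch below uses only the numbers in the table, except for CANDIDATES_PER_REQUEST, which is a hypothetical retrieval size chosen for illustration and is not given in the problem.

```python
# Figures taken from the Scale table.
PEAK_QPS = 2_500
CATALOG_SIZE = 1_200_000
UPDATED_PER_DAY = 25_000
DAU = 900_000
MAU = 6_000_000

# Assumption for illustration only: retrieval narrows the catalog to 500 candidates.
CANDIDATES_PER_REQUEST = 500

# Item scorings per second if every request scored the full catalog.
brute_force_scores_per_sec = PEAK_QPS * CATALOG_SIZE

# Item scorings per second if the heavy model only sees retrieved candidates.
funnel_scores_per_sec = PEAK_QPS * CANDIDATES_PER_REQUEST

# Share of the catalog that is new or updated each day.
daily_churn = UPDATED_PER_DAY / CATALOG_SIZE

# Share of monthly users who are active on a given day.
dau_mau_ratio = DAU / MAU

print(f"brute force: {brute_force_scores_per_sec:,} scorings/s")
print(f"with retrieval: {funnel_scores_per_sec:,} scorings/s")
print(f"daily catalog churn: {daily_churn:.2%}")
print(f"DAU/MAU: {dau_mau_ratio:.0%}")
```

Under that assumed candidate count, the ranking model's peak load drops from 3 billion to 1.25 million item scorings per second, and roughly 2% of the catalog changes every day, so item indexes need refreshing well inside a day.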

User actions include opening an article, clicking a contribution recommendation, increasing deferral rate, starting a rollover flow, or dismissing a card. Some labels are immediate, while others are delayed by days or weeks.
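
One way to handle the mix of immediate and delayed labels is a per-action attribution window, with a label left undecided until its window closes. The following is a minimal sketch under stated assumptions: the window lengths, the action names in ATTRIBUTION_WINDOW, and the label_impression helper are all hypothetical and not part of the problem statement.

```python
from datetime import datetime, timedelta

# Hypothetical attribution windows; real values would come from product analysis.
ATTRIBUTION_WINDOW = {
    "open_article": timedelta(hours=1),
    "increase_deferral": timedelta(days=14),
}

def label_impression(shown_at, action_type, event_times, now):
    """Label one impression.

    Returns 1 if a matching event fell inside the window, 0 if the window
    has closed with no event, and None if the window is still open (the
    label is not yet mature and should be kept out of training data).
    """
    deadline = shown_at + ATTRIBUTION_WINDOW[action_type]
    if any(shown_at <= t <= deadline for t in event_times):
        return 1
    if now < deadline:
        return None
    return 0

shown = datetime(2024, 1, 1, 9, 0)
converted = label_impression(
    shown, "increase_deferral", [datetime(2024, 1, 5)], now=datetime(2024, 1, 6)
)
pending = label_impression(shown, "increase_deferral", [], now=datetime(2024, 1, 3))
negative = label_impression(shown, "increase_deferral", [], now=datetime(2024, 1, 20))
late = label_impression(
    shown, "increase_deferral", [datetime(2024, 1, 20)], now=datetime(2024, 1, 25)
)
```

Treating an open window as None rather than 0 avoids training on false negatives for slow actions such as a deferral increase.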

Task

Design the ML architecture for this recommendation system. Address the following:

Clarify the product objective, prediction target, and key constraints for Voya Retire.
Propose a multi-stage system (retrieval → ranking → re-ranking) and explain why each stage is needed at this scale.
Define the offline and online data pipelines, including feature computation, label generation, and training cadence.
Describe the serving architecture, including online vs batch inference, latency budget allocation, caching, and fallbacks.
Explain how you would evaluate the system offline and online, and how you would monitor drift, training-serving skew, and business guardrails.
Identify major failure modes, especially around stale features, compliance-sensitive recommendations, and cold-start users or items.
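
As a starting point for the multi-stage question, the retrieval → ranking → re-ranking funnel can be sketched in a few lines. Everything here is illustrative: the Item fields, the stage functions, the candidate count of 500, the per-category cap, and the per-stage millisecond split in LATENCY_BUDGET_MS are assumptions, with only the top-10 output size and the 180ms total taken from the problem.

```python
from dataclasses import dataclass

# Hypothetical split of the 180ms p99 budget across stages.
LATENCY_BUDGET_MS = {
    "feature_fetch": 30,
    "retrieval": 40,
    "ranking": 70,
    "rerank_and_policy": 20,
    "network_overhead": 20,
}
assert sum(LATENCY_BUDGET_MS.values()) <= 180

@dataclass(frozen=True)
class Item:
    item_id: str
    category: str      # e.g. "education", "contribution", "rollover"
    popularity: float  # stand-in for a cheap retrieval score
    relevance: float   # stand-in for a heavier ranking-model score

def retrieve(catalog, k):
    """Stage 1: cheap, high-recall cut from the full catalog to k candidates."""
    return sorted(catalog, key=lambda it: it.popularity, reverse=True)[:k]

def rank(candidates):
    """Stage 2: order only the retrieved candidates by the costlier score."""
    return sorted(candidates, key=lambda it: it.relevance, reverse=True)

def rerank(ranked, n, max_per_category):
    """Stage 3: apply a business rule, here a per-category cap for diversity."""
    out, counts = [], {}
    for it in ranked:
        if counts.get(it.category, 0) < max_per_category:
            out.append(it)
            counts[it.category] = counts.get(it.category, 0) + 1
        if len(out) == n:
            break
    return out

def recommend(catalog, k=500, n=10, max_per_category=4):
    return rerank(rank(retrieve(catalog, k)), n, max_per_category)

# Small synthetic catalog with deterministic scores, for demonstration only.
CATEGORIES = ["education", "contribution", "rollover"]
catalog = [
    Item(f"item-{i}", CATEGORIES[i % 3], float((i * 37) % 101), float((i * 53) % 97))
    for i in range(1000)
]
top = recommend(catalog)
```

In a real design the retrieval stage would more likely be an approximate nearest-neighbor lookup or a set of candidate generators, and the ranking score would come from a trained model; the sorted calls above only stand in for those components.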

Constraints

Recommendations must be explainable enough for internal review and auditability.
Certain actions may be regulated or suitability-sensitive; the system must never recommend an action to a user segment that is not eligible for it.
User profile and account data are privacy-sensitive and cannot be broadly copied into ad hoc systems.
Freshness matters: contribution changes, market events, and newly published education content should be reflected within hours, not days.
Serving cost should stay low enough to support broad rollout across mobile and web surfaces.
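
The eligibility and auditability constraints suggest a hard, deny-by-default filter that runs after ranking and records a reason for every decision. The sketch below is one possible shape for it; the Action and Decision types, the filter_eligible function, and the segment and action names are hypothetical placeholders, not real Voya rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    action_id: str
    allowed_segments: frozenset  # segments explicitly approved for this action

@dataclass(frozen=True)
class Decision:
    action_id: str
    shown: bool
    reason: str

def filter_eligible(user_segment, ranked_actions):
    """Keep ranked order, drop ineligible actions, and log every decision.

    Deny by default: an action with no approved segments is never shown.
    """
    shown, audit_log = [], []
    for action in ranked_actions:
        ok = user_segment in action.allowed_segments
        reason = "segment_allowed" if ok else f"segment_not_allowed:{user_segment}"
        audit_log.append(Decision(action.action_id, ok, reason))
        if ok:
            shown.append(action)
    return shown, audit_log

# Placeholder actions and segments, for illustration only.
ranked = [
    Action("start_rollover", frozenset({"segment_b"})),
    Action("read_article_101", frozenset({"segment_a", "segment_b"})),
    Action("unreviewed_action", frozenset()),
]
visible, audit = filter_eligible("segment_a", ranked)
```

Keeping this check as a rule outside the model means a ranking error cannot surface an ineligible action, and the audit log gives reviewers a per-decision record.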
