Product Context
Voya Financial wants to personalize educational content, plan insights, and next-best actions inside the Voya Retire experience for retirement plan participants. Design an end-to-end ML system that ranks the most relevant cards or recommendations when a user opens the app or web dashboard.
Scale
| Signal | Value |
|---|---|
| Monthly active users | 6M |
| Daily active users | 900K |
| Peak recommendation QPS | 2,500 |
| Eligible content/action catalog | 1.2M items |
| New/updated items per day | 25K |
| Homepage recommendations per request | top 10 |
| End-to-end p99 latency budget | 180ms |
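A few derived quantities help make the table concrete. The following is a back-of-envelope sketch, not part of the original prompt: the constants are copied from the table, while the function name `scale_summary` and the "brute force" framing (scoring every catalog item on every request) are illustrative assumptions.

```python
# Figures copied from the Scale table above.
MAU = 6_000_000
DAU = 900_000
PEAK_QPS = 2_500
CATALOG_SIZE = 1_200_000
ITEMS_CHANGED_PER_DAY = 25_000
TOP_K = 10
P99_BUDGET_MS = 180


def scale_summary() -> dict:
    """Derive rough sizing numbers from the table (illustrative only)."""
    return {
        # Fraction of monthly users active on a given day.
        "dau_to_mau_ratio": DAU / MAU,
        # Fraction of the catalog that is new or updated each day.
        "daily_catalog_churn": ITEMS_CHANGED_PER_DAY / CATALOG_SIZE,
        # Item scorings per second if every request scored the full catalog.
        "brute_force_scores_per_sec": PEAK_QPS * CATALOG_SIZE,
        # Items actually returned per second at peak.
        "returned_items_per_sec": PEAK_QPS * TOP_K,
        # Catalog items competing for each returned slot.
        "catalog_items_per_slot": CATALOG_SIZE // TOP_K,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value}")
```

Under these numbers, scoring the full catalog per request would mean about 3 billion item scorings per second at peak within a 180ms budget, which is the usual argument for narrowing candidates before applying an expensive ranker.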
User actions include opening an article, clicking a contribution recommendation, increasing deferral rate, starting a rollover flow, or dismissing a card. Some labels are immediate, while others are delayed by days or weeks.
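One way to handle the mix of immediate and delayed labels is a per-action attribution window, sketched below. Everything here is an assumption for illustration: the action names, the window lengths in `ATTRIBUTION_WINDOWS`, and the `Impression` and `Event` record shapes are hypothetical and do not come from the prompt. The sketch shows only the core idea that an impression with no matching event stays unlabeled until its window closes, rather than being counted as a negative too early.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Hypothetical attribution windows per action type (assumed values).
ATTRIBUTION_WINDOWS = {
    "article_open": timedelta(hours=1),
    "contribution_click": timedelta(hours=1),
    "dismiss": timedelta(hours=1),
    "deferral_increase": timedelta(days=14),
    "rollover_start": timedelta(days=30),
}


@dataclass(frozen=True)
class Impression:
    user_id: str
    item_id: str
    shown_at: datetime


@dataclass(frozen=True)
class Event:
    user_id: str
    item_id: str
    action: str
    occurred_at: datetime


def label_impression(
    impression: Impression,
    events: Iterable[Event],
    action: str,
    now: datetime,
) -> Optional[int]:
    """Return 1 (positive), 0 (negative), or None (window still open)."""
    deadline = impression.shown_at + ATTRIBUTION_WINDOWS[action]
    for event in events:
        same_pair = (
            event.user_id == impression.user_id
            and event.item_id == impression.item_id
        )
        in_window = impression.shown_at <= event.occurred_at <= deadline
        if same_pair and event.action == action and in_window:
            return 1
    # No matching event yet: only call it a negative once the window closes.
    if now < deadline:
        return None
    return 0
```

With these assumed windows, an impression shown yesterday already has a final `article_open` label but still returns `None` for `deferral_increase`, so training sets for delayed actions lag behind those for immediate ones.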
Task
Design the ML architecture for this recommendation system. Address the following:
- Clarify the product objective, prediction target, and key constraints for Voya Retire.
- Propose a multi-stage system (retrieval → ranking → re-ranking) and explain why each stage is needed at this scale.
- Define the offline and online data pipelines, including feature computation, label generation, and training cadence.
- Describe the serving architecture, including online vs batch inference, latency budget allocation, caching, and fallbacks.
- Explain how you would evaluate the system offline and online, and how you would monitor drift, training-serving skew, and business guardrails.
- Identify major failure modes, especially around stale features, compliance-sensitive recommendations, and cold-start users or items.
Constraints
- Recommendations must be explainable enough for internal review and auditability.
- Certain actions may be regulated or suitability-sensitive; the system cannot recommend ineligible actions to the wrong user segment.
- User profile and account data are privacy-sensitive and cannot be broadly copied into ad hoc systems.
- Freshness matters: contribution changes, market events, and newly published education content should be reflected within hours, not days.
- Serving cost should stay low enough to support broad rollout across mobile and web surfaces.
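The eligibility and auditability constraints above suggest treating eligibility as a hard filter that runs independently of model scores and records a reason for every decision. The sketch below is one possible shape for that filter, not a prescribed design: the `Candidate` fields, the segment names used in the example, and the reason strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Candidate:
    item_id: str
    score: float
    # Segments allowed to see this item; empty means unrestricted (assumed convention).
    allowed_segments: FrozenSet[str] = frozenset()


def filter_and_rank(
    candidates: List[Candidate],
    user_segments: FrozenSet[str],
    top_k: int = 10,
) -> Tuple[List[Candidate], List[Tuple[str, str]]]:
    """Drop ineligible candidates, then return the top_k by score plus an audit log."""
    kept: List[Candidate] = []
    audit: List[Tuple[str, str]] = []
    for cand in candidates:
        restricted = bool(cand.allowed_segments)
        if restricted and not (cand.allowed_segments & user_segments):
            # A high score never overrides an eligibility rule.
            audit.append((cand.item_id, "blocked:segment_ineligible"))
            continue
        audit.append((cand.item_id, "eligible"))
        kept.append(cand)
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k], audit


if __name__ == "__main__":
    pool = [
        Candidate("rollover_offer", 0.95, frozenset({"has_outside_assets"})),
        Candidate("intro_article", 0.40),
        Candidate("raise_deferral", 0.70, frozenset({"active_contributor"})),
    ]
    ranked, log = filter_and_rank(pool, frozenset({"active_contributor"}))
    print([c.item_id for c in ranked])
    print(log)
```

Keeping the filter separate from the ranker means a reviewer can verify eligibility rules without reading model internals, and the audit log gives a per-request record of why each candidate was shown or blocked.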