Business Context
SoFi wants to improve cross-sell recommendations across products such as SoFi Checking and Savings, SoFi Credit Card, Personal Loans, Invest, and Relay. You are asked to build a recommendation system that ranks the next most relevant product for each active member using historical product adoption and engagement data.
Dataset
You are given a member-product interaction dataset built from 18 months of SoFi activity logs and account snapshots.
| Feature Group | Count | Examples |
|---|---|---|
| Member profile | 12 | age_band, income_band, employment_status, tenure_days, state |
| Product holdings | 8 | has_checking, has_invest, has_personal_loan, num_products_owned |
| Engagement | 14 | app_sessions_30d, card_swipes_30d, direct_deposit_flag, relay_linked_accounts |
| Financial behavior | 10 | avg_balance_90d, deposit_trend_30d, credit_score_band, debt_to_income_band |
| Interaction labels | 1 | adopted_target_product_within_30d |
- Rows: 2.4M member-product candidate pairs across 620K members and 5 recommendable SoFi products
- Target: Binary label indicating whether the member adopted the recommended product within 30 days
- Class balance: Highly imbalanced; about 2.7% positive labels
- Missing data: 10-18% missing in income, credit, and linked-account features; some missingness is informative
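Because some missingness is informative, one option is to keep an explicit missing flag next to each nullable feature rather than imputing silently. The sketch below is only an illustration: the feature list is an assumption drawn from the table's examples, and `add_missing_indicators` is a hypothetical helper, not part of any provided codebase.

```python
# Assumed nullable features (income, credit, linked accounts); the real schema may differ.
NULLABLE_FEATURES = ("income_band", "credit_score_band", "relay_linked_accounts")


def add_missing_indicators(row, features=NULLABLE_FEATURES):
    """Return a copy of `row` with a `<name>_missing` flag per nullable feature.

    The original value is left as None so that a model with native
    missing-value handling, or a later imputation step, can decide what to do.
    """
    out = dict(row)
    for name in features:
        out[f"{name}_missing"] = int(row.get(name) is None)
    return out


example = {
    "income_band": None,
    "credit_score_band": "700-749",
    "relay_linked_accounts": None,
}
flagged = add_missing_indicators(example)
```

Keeping the raw value and the flag separate lets the model learn from the fact that a field is absent (for example, no linked accounts in Relay) independently of any imputed value.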
Success Criteria
A good solution should improve ranking quality enough to support production use in SoFi’s app and CRM surfaces. The targets, measured on a holdout period, are PR-AUC > 0.18, Recall@3 > 0.70, and Lift@10% > 3.5.
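The three metrics can be defined in more than one way, so it helps to pin the definitions down before comparing models. The sketch below assumes Recall@3 is the share of positive pairs that fall in their member's top 3 by score, Lift@10% is the positive rate in the top 10% of all scored pairs divided by the overall positive rate, and PR-AUC is estimated by average precision. These definitions and the function names are assumptions, not something the brief specifies. For reference, a random ranker has an expected average precision close to the positive rate (about 0.027 here), and with roughly 3.9 candidate pairs per member on average (2.4M / 620K) a top-3 cut covers most of a typical list, so Recall@3 is worth reading against a random-ordering baseline.

```python
from collections import defaultdict


def recall_at_k(rows, k=3):
    """rows: iterable of (member_id, score, label).

    Share of all positive pairs that land in their member's top-k by score.
    """
    by_member = defaultdict(list)
    for member_id, score, label in rows:
        by_member[member_id].append((score, label))
    hits = total = 0
    for pairs in by_member.values():
        pairs.sort(key=lambda p: p[0], reverse=True)
        total += sum(label for _, label in pairs)
        hits += sum(label for _, label in pairs[:k])
    return hits / total if total else float("nan")


def lift_at_fraction(rows, fraction=0.10):
    """Positive rate in the top `fraction` of pairs by score over the overall rate."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top_rate = sum(r[2] for r in ranked[:n_top]) / n_top
    base_rate = sum(r[2] for r in ranked) / len(ranked)
    return top_rate / base_rate if base_rate else float("nan")


def average_precision(rows):
    """Average precision, a common estimator of PR-AUC (ties keep input order)."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    positives_seen, precision_sum = 0, 0.0
    for rank, (_, _, label) in enumerate(ranked, start=1):
        if label:
            positives_seen += 1
            precision_sum += positives_seen / rank
    return precision_sum / positives_seen if positives_seen else float("nan")


# Toy data for illustration only: (member_id, model score, adopted-within-30d label).
toy_rows = [
    ("m1", 0.90, 1), ("m1", 0.50, 0), ("m1", 0.20, 0), ("m1", 0.10, 0),
    ("m2", 0.80, 0), ("m2", 0.70, 0), ("m2", 0.60, 0), ("m2", 0.30, 1),
    ("m3", 0.40, 0), ("m3", 0.05, 0),
]
toy_recall = recall_at_k(toy_rows, k=3)
toy_lift = lift_at_fraction(toy_rows, fraction=0.10)
toy_ap = average_precision(toy_rows)
```

Whichever definitions are chosen, they should be computed on the same holdout pairs for every model being compared.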
Constraints
- Recommendations must be generated nightly for ~600K active members
- Inference should stay under 100 ms per member for online re-ranking
- The solution should be explainable enough for marketing/compliance review
- Cold-start members and newly launched products must be handled explicitly
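One possible way to handle cold start explicitly is a popularity prior that backs off from segment level to product level to the overall adoption rate. The sketch below is a hypothetical fallback scorer rather than a prescribed design: the segment key (an age band here), the `smoothing` strength, and the function name `fit_popularity_prior` are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_popularity_prior(history, smoothing=50.0):
    """history: iterable of (segment, product, adopted) with adopted in {0, 1}.

    Returns score(segment, product): the segment-level adoption rate shrunk
    toward the product's global rate. Unseen segments get the product's
    global rate; unseen products get the overall adoption rate.
    """
    seg_counts = defaultdict(lambda: [0, 0])   # (segment, product) -> [adoptions, trials]
    prod_counts = defaultdict(lambda: [0, 0])  # product -> [adoptions, trials]
    for segment, product, adopted in history:
        for counts in (seg_counts[(segment, product)], prod_counts[product]):
            counts[0] += adopted
            counts[1] += 1
    total_pos = sum(c[0] for c in prod_counts.values())
    total_n = sum(c[1] for c in prod_counts.values())
    overall_rate = total_pos / total_n if total_n else 0.0

    def score(segment, product):
        p_pos, p_n = prod_counts.get(product, (0, 0))
        global_rate = p_pos / p_n if p_n else overall_rate
        s_pos, s_n = seg_counts.get((segment, product), (0, 0))
        return (s_pos + smoothing * global_rate) / (s_n + smoothing)

    return score


# Toy history for illustration only.
toy_history = [
    ("18-24", "invest", 1),
    ("18-24", "invest", 0),
    ("18-24", "credit_card", 0),
    ("35-44", "credit_card", 1),
]
prior = fit_popularity_prior(toy_history, smoothing=1.0)
cold_start_ranking = sorted(
    ["invest", "credit_card"],
    key=lambda product: prior("18-24", product),
    reverse=True,
)
```

A prior like this is also easy to explain in a marketing or compliance review, since each score is a smoothed historical adoption rate. Newly launched products would likely still need extra treatment, such as an exploration budget, because the overall rate says nothing specific about them.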
Deliverables
- Define the recommendation framing (candidate generation + ranking, or direct ranking approach)
- Build a training dataset from member-product pairs without label leakage
- Train and evaluate a model that ranks SoFi products per member
- Describe feature engineering, negative sampling, and cold-start handling
- Propose an offline evaluation plan and a production deployment approach
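For the leakage requirement, a common pattern is to compute features as of a snapshot date, take the label only from the 30 days after it, and split train and holdout by time so that no training label window overlaps the holdout period. The following is a minimal sketch under those assumptions. The field name `snapshot_date` and both helper functions are hypothetical.

```python
from datetime import date, timedelta

LABEL_WINDOW = timedelta(days=30)  # matches the 30-day adoption target


def label_pair(snapshot_date, adoption_date):
    """1 if adoption falls in (snapshot_date, snapshot_date + 30 days], else 0."""
    if adoption_date is None:
        return 0
    return int(snapshot_date < adoption_date <= snapshot_date + LABEL_WINDOW)


def temporal_split(pairs, holdout_start):
    """Split member-product pairs by snapshot date.

    A pair goes to train only if its whole label window ends before
    `holdout_start`; pairs snapshotted on or after `holdout_start` go to
    holdout; pairs whose label window straddles the boundary are dropped.
    """
    train, holdout = [], []
    for pair in pairs:
        if pair["snapshot_date"] + LABEL_WINDOW < holdout_start:
            train.append(pair)
        elif pair["snapshot_date"] >= holdout_start:
            holdout.append(pair)
    return train, holdout


# Toy pairs for illustration only.
toy_pairs = [
    {"member_id": "m1", "snapshot_date": date(2024, 1, 1)},
    {"member_id": "m2", "snapshot_date": date(2024, 2, 15)},
    {"member_id": "m3", "snapshot_date": date(2024, 3, 1)},
]
train_pairs, holdout_pairs = temporal_split(toy_pairs, holdout_start=date(2024, 3, 1))
```

Features such as `app_sessions_30d` or `avg_balance_90d` would need the same discipline: each should be computed only from events on or before the pair's snapshot date.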