Product Context
PayFlow is a consumer payments app used for P2P transfers, bill pay, and merchant checkout. The growth team wants an ML system that decides which growth action to show a user at payment time or shortly after: cashback offer, referral prompt, autopay setup, card-on-file incentive, installment offer, or no intervention.
Scale
| Signal | Value |
|---|---|
| DAU | 28M |
| Monthly active payers | 65M |
| Peak payment-related QPS | 180K |
| Growth-decision QPS | 45K online, plus 120M batch decisions/day |
| Eligible actions | ~40 campaigns/policies active at once |
| User feature count | ~1,200 raw features, ~180 serving features |
| p99 latency budget | 120ms end-to-end, 40ms for modeling stack |
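
The figures in the table translate into throughput and spend envelopes that are useful to have on hand before choosing an architecture. The following is a back-of-envelope sketch, not part of the original brief. The function names are hypothetical, the 4-hour batch window is an assumed example value, and the online figure treats 45K QPS as sustained all day, which makes it an upper bound rather than an estimate. The cost targets are the ones stated under Constraints.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def batch_scores_per_second(decisions_per_day: int, window_hours: float = 24.0) -> float:
    """Average scoring rate needed to finish the daily batch inside the window."""
    return decisions_per_day / (window_hours * 3600)


def batch_daily_cost_ceiling(decisions_per_day: int, usd_per_million: float) -> float:
    """Maximum daily batch spend implied by a per-million-scores cost target."""
    return decisions_per_day / 1_000_000 * usd_per_million


def online_daily_cost_upper_bound(qps: int, usd_per_decision: float) -> float:
    """Daily online spend if the given QPS were sustained for 24 hours (assumption)."""
    return qps * SECONDS_PER_DAY * usd_per_decision


if __name__ == "__main__":
    batch_per_day = 120_000_000  # from the Scale table
    online_qps = 45_000          # from the Scale table

    print(f"batch avg rate, 24h window: {batch_scores_per_second(batch_per_day):,.0f}/s")
    # The 4-hour window is an assumed example, not a stated requirement.
    print(f"batch avg rate, 4h window:  {batch_scores_per_second(batch_per_day, 4):,.0f}/s")
    print(f"batch cost ceiling:         ${batch_daily_cost_ceiling(batch_per_day, 25.0):,.0f}/day")
    print(f"online cost upper bound:    ${online_daily_cost_upper_bound(online_qps, 0.001):,.0f}/day")
```

Spread evenly over a day, 120M batch decisions is about 1,389 scores per second, and the $25 per million target caps batch spend at $3,000 per day. The online number is far larger in absolute terms, so the per-decision cost target matters more for the real-time path than for the batch path.
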
Task
Design an end-to-end ML system for this growth model, with special focus on online versus batch serving.
- Define the prediction target(s), decision surface, and success metrics for a payments growth model.
- Propose an architecture for batch scoring versus real-time scoring, including which use cases belong in each path.
- Design the feature, training, and serving system so that payment-context features are fresh while avoiding training-serving skew.
- Choose models for candidate selection, ranking, and final policy/re-ranking, and explain tradeoffs under the latency budget.
- Define offline evaluation, online experimentation, and monitoring for drift, calibration, and business guardrails.
- Identify likely failure modes in payments, including compliance, stale features, and over-targeting.
Constraints
- Payment authorization flow cannot be blocked; if the model is slow or unavailable, checkout must proceed.
- Some actions are compliance-gated by geography, KYC status, credit eligibility, and merchant category.
- Conversion labels are delayed: some outcomes arrive in-session, others after 7-30 days.
- The system must support both triggered decisions (during/after payment) and precomputed daily audiences for CRM channels.
- Cost target: average inference cost under $0.001 per online decision and under $25 per million batch scores.
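
The first two constraints can be sketched together: compliance gates run as a hard filter before any model is called, and the model call is wrapped in a timeout that fails open to "no intervention" so checkout is never blocked. This is a minimal illustration under stated assumptions, not a prescribed design. All names here (`UserContext`, `Action`, `is_eligible`, `decide`) are hypothetical, the gating fields are simplified stand-ins for real compliance rules, and the 40 ms default mirrors the modeling-stack budget from the Scale table.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from dataclasses import dataclass
from typing import Callable, Sequence

NO_INTERVENTION = "no_intervention"
MODEL_BUDGET_SECONDS = 0.040  # modeling-stack share of the p99 budget


@dataclass(frozen=True)
class UserContext:  # hypothetical, simplified request context
    country: str
    kyc_verified: bool
    credit_eligible: bool
    merchant_category: str


@dataclass(frozen=True)
class Action:  # hypothetical campaign definition with its compliance gates
    name: str
    allowed_countries: frozenset
    requires_kyc: bool = False
    requires_credit: bool = False
    blocked_merchant_categories: frozenset = frozenset()


def is_eligible(action: Action, ctx: UserContext) -> bool:
    """Hard compliance gate: evaluated before, and independently of, any model."""
    if ctx.country not in action.allowed_countries:
        return False
    if action.requires_kyc and not ctx.kyc_verified:
        return False
    if action.requires_credit and not ctx.credit_eligible:
        return False
    return ctx.merchant_category not in action.blocked_merchant_categories


_executor = ThreadPoolExecutor(max_workers=8)

Ranker = Callable[[UserContext, Sequence[Action]], str]


def decide(ctx: UserContext, actions: Sequence[Action], rank: Ranker,
           budget_s: float = MODEL_BUDGET_SECONDS) -> str:
    """Return an action name, or NO_INTERVENTION on timeout, error, or no candidates."""
    eligible = [a for a in actions if is_eligible(a, ctx)]
    if not eligible:
        return NO_INTERVENTION
    future = _executor.submit(rank, ctx, eligible)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        future.cancel()  # best effort; an already-running call is not interrupted
        return NO_INTERVENTION
    except Exception:
        return NO_INTERVENTION  # fail open: a model error must not block checkout


if __name__ == "__main__":
    ctx = UserContext("US", kyc_verified=True, credit_eligible=False,
                      merchant_category="grocery")
    actions = [
        Action("cashback_offer", frozenset({"US"})),
        Action("installment_offer", frozenset({"US"}), requires_credit=True),
    ]

    def slow_ranker(c: UserContext, candidates: Sequence[Action]) -> str:
        time.sleep(0.2)  # simulate a model that overruns its budget
        return candidates[0].name

    print(decide(ctx, actions, lambda c, cands: cands[0].name, budget_s=1.0))
    print(decide(ctx, actions, slow_ranker))
```

Filtering before ranking means the model never sees an action the user is not allowed to receive, so a ranking bug cannot produce a compliance violation. The timeout only bounds how long the caller waits; a production system would also need to bound the work itself, for example with request deadlines propagated to the model server.
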