## Product Context
Pulse is a consumer subscription app with a growth team that scores users in real time for interventions such as upgrade prompts, referral offers, win-back notifications, and paywall variants. The core challenge is designing a feature store that serves low-latency, fresh, and training-consistent features to multiple online growth models.
## Scale
| Signal | Value |
|---|---|
| DAU | 45M |
| MAU | 180M |
| Peak scoring QPS | 220K predictions/sec |
| Event ingest rate | 9M events/min |
| User entities | 180M |
| Item / offer entities | 25K active offers / experiments |
| Feature freshness target | < 60s for behavioral counters |
| End-to-end scoring latency budget (p99) | 80ms |
The growth platform currently runs models for conversion propensity, churn risk, notification send-time optimization, and offer ranking. Product and marketing teams want one shared feature platform instead of bespoke pipelines per model.
## Task
- Define the functional and non-functional requirements for a real-time feature store supporting growth ML use cases.
- Design the end-to-end architecture, including event ingestion, feature computation, online/offline storage, training data generation, and online serving.
- Explain how the feature store integrates with a multi-stage decision system for growth actions (eligibility/filtering → scoring/ranking → policy constraints).
- Propose model and feature patterns that work well for this setup, including handling sparse users, cold start, and delayed labels.
- Describe evaluation, monitoring, and rollback strategies, with explicit treatment of feature drift, training-serving skew, and data quality failures.
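The multi-stage decision flow above (eligibility/filtering → scoring/ranking → policy constraints) can be sketched as a minimal pipeline. Everything here is illustrative: `FeatureStore`, the feature names, the toy linear score, and the frequency-cap policy are assumptions, not a real API. The sketch also shows one graceful-degradation tactic, falling back to neutral default features when the online lookup misses or times out.

```python
# Hypothetical sketch: eligibility -> scoring -> policy, with graceful
# degradation when the online feature store misses or is delayed.
DEFAULT_FEATURES = {"sessions_7d": 0.0, "days_since_signup": 0.0}

class FeatureStore:
    """Toy in-memory stand-in for an online feature store."""
    def __init__(self, table):
        self.table = table

    def get(self, user_id):
        return self.table.get(user_id)  # None simulates a miss/timeout

def fetch_features(store, user_id):
    # Degrade gracefully: fall back to neutral defaults rather than failing.
    feats = store.get(user_id)
    return feats if feats is not None else dict(DEFAULT_FEATURES)

def eligible(offer, feats):
    # Stage 1: cheap filters, e.g. no win-back offers for recently active users.
    return not (offer["kind"] == "winback" and feats["sessions_7d"] > 3)

def score(offer, feats):
    # Stage 2: model score; a toy linear model stands in for a real one.
    return offer["base_score"] + 0.1 * feats["sessions_7d"]

def apply_policy(ranked, max_offers=1):
    # Stage 3: business constraints, e.g. a per-user frequency cap.
    return ranked[:max_offers]

def decide(store, user_id, offers):
    feats = fetch_features(store, user_id)
    candidates = [o for o in offers if eligible(o, feats)]
    ranked = sorted(candidates, key=lambda o: score(o, feats), reverse=True)
    return apply_policy(ranked)

store = FeatureStore({"u1": {"sessions_7d": 5.0, "days_since_signup": 12.0}})
offers = [
    {"id": "upgrade", "kind": "upsell", "base_score": 0.5},
    {"id": "comeback", "kind": "winback", "base_score": 0.9},
]
print([o["id"] for o in decide(store, "u1", offers)])  # ['upgrade']
```

Note that for the unknown user the defaults make the win-back offer eligible again, so degraded serving changes decisions in a controlled, predictable way rather than erroring out.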
## Constraints
- User-level features may include PII-derived attributes, so the system must support deletion requests and access controls.
- Some features must update in near real time; others can be batch-computed hourly or daily to control cost.
- The online path must degrade gracefully if the feature store or stream processor is delayed.
- Teams need point-in-time correct offline features for training and backfills.
- Cost target: keep feature serving + scoring under $0.0015 per prediction at peak load.
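The point-in-time correctness constraint can be made concrete with a small as-of join: for each label event, the training row must use the latest feature value observed at or before the label timestamp, never a later one, to avoid label leakage. The data shapes and function names below are assumptions for illustration.

```python
# Hypothetical sketch of a point-in-time (as-of) feature join for training data.
import bisect

# Per-user feature history as (timestamp, value), sorted by timestamp.
feature_log = {
    "u1": [(100, 1.0), (200, 2.0), (300, 3.0)],
}

def point_in_time_lookup(history, ts):
    """Latest feature value with timestamp <= ts, or None if none exists yet."""
    times = [t for t, _ in history]
    i = bisect.bisect_right(times, ts)
    return history[i - 1][1] if i > 0 else None

# Label events: (user_id, label_timestamp, label)
labels = [("u1", 250, 1), ("u1", 50, 0)]

training_rows = [
    (uid, ts, point_in_time_lookup(feature_log[uid], ts), y)
    for uid, ts, y in labels
]
print(training_rows)  # at ts=250 the visible value is 2.0; at ts=50, none yet
```

A production backfill would do the same join at warehouse scale (e.g. a windowed as-of join over event logs), but the invariant is identical: the offline row sees exactly what the online path would have seen at serving time.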