Product Context
Attentive AI Pro helps marketers generate and personalize SMS and email campaigns. Design the ML system that recommends the next best message variant, offer, or audience treatment for each consumer interaction, while supporting very high write rates from engagement events and campaign updates.
Scale
| Signal | Value |
|---|---|
| Brands onboarded | 8,000 |
| Consumer profiles | 250M |
| Peak inbound engagement events | 1.2M writes/sec during major retail moments |
| Peak recommendation QPS | 180K requests/sec |
| Active message / offer catalog | 40M active variants |
| New/updated campaign entities | 25M/day |
| End-to-end p99 latency budget | 120 ms |
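
As a rough sanity check on the figures above, the Python sketch below derives a few ratios that tend to drive sizing discussions. The constant names and the `derived_scale` helper are illustrative, not part of the prompt, and the per-brand average assumes variants are spread evenly across brands, which real traffic is unlikely to do.

```python
# Figures copied from the Scale table; names are illustrative.
PEAK_EVENT_WRITES_PER_SEC = 1_200_000
PEAK_RECO_QPS = 180_000
BRANDS = 8_000
ACTIVE_VARIANTS = 40_000_000
CAMPAIGN_UPDATES_PER_DAY = 25_000_000
SECONDS_PER_DAY = 86_400


def derived_scale() -> dict:
    """Return back-of-envelope ratios derived from the stated scale."""
    return {
        # Peak engagement writes per peak recommendation request.
        "write_to_read_ratio": PEAK_EVENT_WRITES_PER_SEC / PEAK_RECO_QPS,
        # Daily campaign updates averaged over the day (peaks will be higher).
        "avg_campaign_updates_per_sec": CAMPAIGN_UPDATES_PER_DAY / SECONDS_PER_DAY,
        # Assumes an even spread of variants across brands.
        "avg_variants_per_brand": ACTIVE_VARIANTS / BRANDS,
    }


if __name__ == "__main__":
    for name, value in derived_scale().items():
        print(f"{name}: {value:,.2f}")
```

At peak, writes outnumber recommendation reads by roughly 6.7 to 1, campaign updates average about 289 per second, and a brand holds about 5,000 active variants on average.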
Task
- Clarify product requirements and success metrics for recommending content in Attentive AI Pro.
- Design the end-to-end ML architecture, including retrieval, ranking, and any re-ranking or policy layer.
- Explain how you would shard and replicate the high-write online data plane: user features, campaign state, counters, embeddings, and feedback logs.
- Define the offline and online pipelines, including feature computation, training cadence, and how to avoid training-serving skew.
- Propose evaluation, monitoring, and rollback strategies for model, data, and infrastructure failures.
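
One common way to approach the counter part of the sharding question is to split a hot counter across several sub-keys, so that writes for a single campaign land on different partitions and reads sum the sub-keys. The sketch below is a minimal in-memory illustration of that idea, not a prescribed answer. The `ShardedCounter` class, the key layout, and the default of 16 sub-keys are all assumptions, and a plain dictionary stands in for a real key-value store.

```python
import random
from collections import defaultdict


class ShardedCounter:
    """Spread one logical counter over several sub-keys (hypothetical layout)."""

    def __init__(self, num_shards: int = 16, seed=None):
        self.num_shards = num_shards
        self._store = defaultdict(int)   # stand-in for a distributed KV store
        self._rng = random.Random(seed)

    def _sub_key(self, brand_id: str, campaign_id: str, shard: int) -> str:
        # Prefixing with brand_id keeps every key scoped to a single brand.
        return f"{brand_id}:{campaign_id}#{shard}"

    def increment(self, brand_id: str, campaign_id: str, amount: int = 1) -> None:
        # Pick a random sub-key so concurrent writers do not pile onto one key.
        shard = self._rng.randrange(self.num_shards)
        self._store[self._sub_key(brand_id, campaign_id, shard)] += amount

    def read(self, brand_id: str, campaign_id: str) -> int:
        # Reads fan out over all sub-keys and sum the partial counts.
        return sum(
            self._store.get(self._sub_key(brand_id, campaign_id, s), 0)
            for s in range(self.num_shards)
        )
```

The trade-off in this sketch is that writes become cheaper to spread while each read touches `num_shards` keys, so the shard count would need tuning against the read path's latency budget.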
Constraints
- User interaction features must be fresh within 1-2 minutes for triggered messaging use cases.
- Some campaign-level features are updated extremely frequently during sends, creating hot partitions.
- The system must isolate brands for privacy/compliance while still sharing learnings where allowed.
- Cost matters: most ranking traffic should run on CPU; GPU use should be limited to offline training or narrow online stages.
- If online feature reads degrade, the system must still return safe recommendations rather than block message delivery.
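
The last constraint can be illustrated with a bounded feature read that falls back to a precomputed safe default. The sketch below is one possible shape under stated assumptions: the `recommend` function, its parameters, and the 30 ms timeout are hypothetical, chosen only to show a read that cannot block delivery.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared pool for feature reads; the size is an arbitrary illustrative choice.
_EXECUTOR = ThreadPoolExecutor(max_workers=4)


def recommend(user_id, fetch_features, rank, safe_default, timeout_s=0.03):
    """Return (recommendation, source), where source is 'personalized' or 'fallback'.

    fetch_features, rank, and safe_default are caller-supplied placeholders.
    """
    future = _EXECUTOR.submit(fetch_features, user_id)
    try:
        # Wait at most timeout_s for the online feature read.
        features = future.result(timeout=timeout_s)
    except Exception:
        # Timeout or read error: serve the safe default instead of blocking.
        future.cancel()
        return safe_default, "fallback"
    return rank(features), "personalized"
```

Returning the source alongside the recommendation lets callers count fallbacks, which is one signal a monitoring plan could alert on.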