You are building a real-time messaging product where messages, notifications, and suggested conversation surfaces compete for each user's limited attention. The system must decide what to deliver immediately, what to defer, and how to prioritize messages for each user across active chats and devices.
How would you design a messaging system for real-time communication at scale?