

"Tell me about a time you found a scaling bottleneck in a production system that was already affecting users or business operations. How did you decide what to fix first, communicate trade-offs, and drive the system to a more stable state when the root cause or long-term path wasn't fully clear?"