You own a backend service that handles user-triggered actions and writes data to internal systems before returning a response. Traffic has grown, tail latency is rising, and some requests now fan out to slower downstream dependencies. The team is debating whether to keep the flow synchronous or move parts of it to an asynchronous pipeline, but you also need to preserve security controls, auditability, and safe failure handling.
How would you decide which parts of the workflow should remain synchronous versus asynchronous when building a scalable service, and how would you design the system so that security, authorization, and operational safety still hold under retries, queue backlogs, and partial failures?
Trade-offs between synchronous request/response and asynchronous processing
Queue-based architecture and worker isolation
Idempotency and replay-safe design
Security controls across auth, authz, logging, and failure handling