
You are the engineering manager (EM) for a product engineering team that has had three production incidents in the last quarter. The incidents were resolved quickly, but the same root causes are starting to reappear in adjacent services, and leadership is worried the team is treating postmortems as paperwork rather than as a chance to learn. You also have a release train coming up, so any new process has to improve reliability without slowing delivery to a crawl.
How do you make sure your team learns from incidents and doesn't repeat the same mistakes?