You are responsible for a critical university service that authenticates users and serves protected records through an internal API. The service is now unavailable, and you have just been paged because requests from multiple clients are timing out. You also see signs that a recent configuration change may have affected both connectivity and access control.
How would you approach restoring service without widening the blast radius or bypassing security controls? What would you check first, how would you decide whether to fail closed or degrade gracefully, and how would you verify that the system is safe to bring back up?