You are responsible for the design of a multi-tenant inference platform that exposes APIs, schedules jobs onto accelerator-backed clusters, stores model artifacts, and provides an admin control plane for operators. The platform handles customer prompts, model inputs and outputs, API credentials, and internal service identities. A recent internal review found that teams describe security goals at a high level, but designs do not consistently define trust boundaries, abuse cases, or how controls would be verified in production.
What security considerations would you build into your design for this platform? Walk through how you would structure the architecture, prioritize threats, choose concrete controls, and prove that the controls and detections actually work when something goes wrong.