You are adding an LLM assistant to a customer-facing support workflow for a financial services product. The assistant drafts responses to inbound customer questions about account status, required documents, payment timing, and policy explanations, and a human agent can either send the draft as-is or edit it before sending. You handle roughly 8,000 conversations per day, and leadership wants faster first-response times without increasing compliance risk. Because customers may act on the assistant's answers, incorrect or overconfident responses are considered high severity.
How would you design this customer-facing LLM workflow so that it is useful but safe, and how would you evaluate and mitigate the major risks before launch and after rollout?