You are building a support assistant for an internal operations team. The assistant answers questions about policies, workflows, and product procedures using only internal documentation. It will serve a few hundred support agents and handle roughly 8,000 questions per day across a corpus of policy docs, SOPs, and knowledge-base articles. The existing keyword search is slow and often returns outdated pages, so the team wants a grounded assistant that can answer directly and cite sources. Trust matters more than coverage: unsupported answers are worse than refusals.
How would you design the prompt and surrounding retrieval flow so the assistant answers only from internal documents, refuses when evidence is missing, and remains reliable under latency, cost, and safety constraints? Explain how you would evaluate the system before launch and monitor it in production.