Your RAG system is retrieving the right documents, but the model still produces confident answers that are not supported by those documents. You need a plan to reduce hallucinations without making the system uselessly conservative.
What would you change across prompting, generation, verification, and evaluation to make the answers more faithful to the retrieved context?
RAG failure analysis after retrieval is already working
Hallucination reduction through prompting and verification
LLM evaluation design for faithfulness and abstention
Prompt injection handling in retrieved documents