You're building a system where a language model needs to answer questions grounded in historical records rather than relying only on its pretraining data. The collection includes OCR'd census pages, immigration manifests, city directories, and family tree notes, so both retrieval quality and grounded generation matter.
How would you design a retrieval-augmented generation (RAG) workflow for historical records search?