Context
Sparksoft wants to launch an internal knowledge assistant in Sparksoft Workspace that answers employee questions using company documentation such as policies, runbooks, architecture docs, onboarding guides, and support playbooks. The current intranet search is keyword-based and often returns documents, not answers.
Constraints
- p95 end-to-end latency: < 2.5 seconds
- Cost ceiling: < $18K/month at 40K queries/day
- Hallucination ceiling: < 2% on a labeled golden set
- Prompt injection success rate: ~0% on adversarial eval set
- Every factual answer must include grounded citations to retrieved sources
- The system must respect document-level access controls and avoid exposing restricted content
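To make the cost ceiling concrete, the monthly figure can be turned into an average per-query budget. The sketch below uses only the numbers stated in the constraints; the 30-day month and the function name `per_query_budget` are assumptions made for illustration.

```python
# Figures from the constraints above. The 30-day month is an assumption.
MONTHLY_COST_CEILING_USD = 18_000
QUERIES_PER_DAY = 40_000
DAYS_PER_MONTH = 30  # assumption, not stated in the brief


def per_query_budget(monthly_ceiling_usd: float,
                     queries_per_day: int,
                     days_per_month: int = DAYS_PER_MONTH) -> float:
    """Average spend allowed per query to stay under the monthly ceiling."""
    if queries_per_day <= 0 or days_per_month <= 0:
        raise ValueError("query volume and month length must be positive")
    return monthly_ceiling_usd / (queries_per_day * days_per_month)


if __name__ == "__main__":
    budget = per_query_budget(MONTHLY_COST_CEILING_USD, QUERIES_PER_DAY)
    print(f"Average budget per query: ${budget:.4f}")
```

Under these assumptions the whole pipeline (embedding, retrieval, reranking, generation, and any verification pass) has roughly 1.5 cents per query on average to work with.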
Available Resources
- ~350K internal Sparksoft documents across Confluence exports, PDFs, markdown, and ticket resolutions
- Metadata per document: doc_id, title, owner_team, last_updated, acl_tags, doc_type
- Sparksoft-approved LLM access via OpenAI or Anthropic APIs
- Sparksoft Search Platform with BM25 and vector retrieval support
- 20 subject-matter experts available to label an evaluation set
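The metadata fields listed above could be modeled as a small record, with `acl_tags` driving the document-level access check. This is only a sketch: the field types, the any-match rule (a user may read a document if they hold at least one of its tags), and the fail-closed handling of untagged documents are assumptions, not Sparksoft's actual ACL semantics.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocMetadata:
    # Field names come from the brief; the types are assumptions.
    doc_id: str
    title: str
    owner_team: str
    last_updated: date
    acl_tags: frozenset[str]
    doc_type: str


def is_visible(doc: DocMetadata, user_tags: frozenset[str]) -> bool:
    """ASSUMPTION: any-match semantics, and a document with no tags is hidden."""
    return bool(doc.acl_tags & user_tags)


def filter_authorized(docs: list[DocMetadata],
                      user_tags: frozenset[str]) -> list[DocMetadata]:
    """Drop documents the user may not read before they reach the prompt."""
    return [doc for doc in docs if is_visible(doc, user_tags)]


if __name__ == "__main__":
    # Hypothetical documents and tags, for illustration only.
    docs = [
        DocMetadata("d1", "VPN runbook", "infra", date(2024, 1, 5),
                    frozenset({"all-staff"}), "runbook"),
        DocMetadata("d2", "Comp bands", "hr", date(2024, 2, 1),
                    frozenset({"hr-only"}), "policy"),
    ]
    visible = filter_authorized(docs, frozenset({"all-staff"}))
    print([doc.doc_id for doc in visible])
```

Applying the filter before generation (and ideally inside the retrieval query itself) keeps restricted text out of the model's context rather than relying on the model to withhold it.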
Task
Design a production-ready retrieval-augmented generation system for this assistant. Your answer should cover architecture, prompting, safety, and evaluation.
- Define the evaluation plan first: specify offline and online metrics for answer quality, citation faithfulness, refusal behavior, retrieval quality, and prompt injection resistance.
- Design the RAG architecture: ingestion, chunking, embeddings, indexing, retrieval, reranking, generation, and authorization filtering.
- Show how you would reduce hallucinations: prompt design, citation enforcement, refusal behavior, answer verification, and fallback behavior when evidence is weak or missing.
- Explain how you would defend against prompt injection in both user queries and retrieved documents, including detection and mitigation.
- Estimate cost and latency for your design, and describe the tradeoffs you would make if Sparksoft needed to cut cost by 40% without materially increasing hallucinations.
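For the retrieval-quality part of the evaluation plan, two commonly used offline metrics are recall@k and reciprocal rank, computed per query against the document IDs the subject-matter experts labeled as relevant. The sketch below is one possible starting point; the function names and the sample IDs are hypothetical, and the brief does not prescribe these particular metrics.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the labeled relevant documents found in the top k results."""
    if not relevant:
        raise ValueError("need at least one labeled relevant document")
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    # Hypothetical ranked results and labels for a single golden-set query.
    retrieved = ["d7", "d2", "d9", "d4"]
    relevant = {"d2", "d4"}
    print(recall_at_k(retrieved, relevant, k=3))
    print(reciprocal_rank(retrieved, relevant))
```

Averaging these values over the golden set gives mean recall@k and MRR, which can be tracked separately from answer-level metrics such as hallucination rate and citation faithfulness.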