Context
Hamilton & Reed, a mid-sized law firm, wants an internal assistant that explains a contract-review AI workflow to skeptical practice leaders. The feature should turn a technical workflow description into a concise executive briefing that is accurate, non-hyped, and explicit about human oversight.
Constraints
- p95 latency: 1,500 ms for a single explanation request
- Cost ceiling: $8 per 1,000 requests
- Hallucination ceiling: <2% on a 150-prompt golden set
- Must not imply the system gives legal advice or makes autonomous filing decisions
- Must resist prompt injection if the workflow notes contain adversarial text such as "ignore prior instructions"
- Output must be under 250 words and understandable to a non-technical law firm executive
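The numeric constraints above can be turned into concrete pass/fail thresholds. The following is a minimal sketch, not a prescribed implementation: the function names are hypothetical, the nearest-rank percentile is one assumed convention among several, and "under 250 words" is read as strictly fewer than 250 whitespace-separated words.

```python
import math
from fractions import Fraction

# Thresholds copied from the constraints list.
P95_LATENCY_MS = 1500
COST_CEILING_PER_1000_USD = 8.00
HALLUCINATION_CEILING = Fraction(2, 100)  # "<2%", kept exact to avoid float rounding
GOLDEN_SET_SIZE = 150
MAX_WORDS = 250


def per_request_budget_usd(cost_per_1000=COST_CEILING_PER_1000_USD):
    """Cost ceiling expressed per single request."""
    return cost_per_1000 / 1000


def max_allowed_hallucinations(n=GOLDEN_SET_SIZE, ceiling=HALLUCINATION_CEILING):
    """Largest failure count k such that k / n is strictly below the ceiling."""
    return math.ceil(n * ceiling) - 1


def p95_nearest_rank(latencies_ms):
    """95th percentile using the nearest-rank convention (an assumption)."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def within_word_limit(text, limit=MAX_WORDS):
    """True when the text has strictly fewer than `limit` words."""
    return len(text.split()) < limit
```

Under these assumptions, the cost ceiling works out to $0.008 per request, and the golden set tolerates at most 2 hallucinated outputs, because 3 of 150 is exactly 2% and therefore not below the ceiling.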
Available Data / Models
- 2,000 internal workflow documents describing intake, OCR, clause extraction, retrieval, human review, audit logging, and escalation paths
- 150 labeled examples of strong vs. weak executive explanations
- Approved model access: GPT-4.1-mini or Claude Sonnet-class models
- Optional retrieval layer over workflow docs and policy memos
- Internal policy text defining prohibited claims, required disclaimers, and approved terminology
Deliverables
- Design the prompt-based solution that converts a technical AI workflow into an executive-friendly explanation while preserving accuracy and skepticism-aware framing.
- Define an evaluation plan first: offline metrics and online metrics for trust, clarity, and hallucination risk.
- Propose the serving architecture, including whether you would use direct prompting or lightweight RAG over workflow and policy documents.
- Show how you would enforce structured output, refusal behavior, and guardrails against overclaiming, prompt injection, and legal-risky wording.
- Estimate cost and latency, and explain the tradeoffs between a cheaper/faster model and a more reliable one.
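One possible shape for the structured-output and guardrail deliverable is a post-generation validator plus a pre-generation check on the workflow notes. The sketch below is illustrative only: the field names, the disclaimer text, the regex patterns, and the `<workflow_notes>` delimiter are all hypothetical placeholders standing in for the firm's internal policy text, and pattern matching is a heuristic layer, not a complete defense against prompt injection.

```python
import json
import re

# Hypothetical schema and policy values; real ones would come from internal policy text.
REQUIRED_FIELDS = {"summary", "human_oversight", "limitations", "disclaimer"}
REQUIRED_DISCLAIMER = "This briefing is informational and is not legal advice."
PROHIBITED_PATTERNS = [
    r"\bprovides? legal advice\b",
    r"\bautonomously (files|decides)\b",
    r"\bguarantee[sd]?\b",
]
INJECTION_PATTERNS = [r"ignore (all )?(prior|previous) instructions"]
MAX_WORDS = 250


def flag_injection(notes):
    """Heuristic check for adversarial text inside untrusted workflow notes."""
    return any(re.search(p, notes, re.IGNORECASE) for p in INJECTION_PATTERNS)


def wrap_untrusted(notes):
    """Delimit untrusted notes so the prompt can treat them as data, not instructions."""
    cleaned = notes.replace("<workflow_notes>", "").replace("</workflow_notes>", "")
    return "<workflow_notes>\n" + cleaned + "\n</workflow_notes>"


def validate_briefing(raw):
    """Return a list of problems found in the model output; empty means it passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["output is not a JSON object"]

    problems = []
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        problems.append("missing fields: " + ", ".join(sorted(missing)))
    if data.get("disclaimer") != REQUIRED_DISCLAIMER:
        problems.append("disclaimer does not match approved text")

    # Word limit is applied to all fields together (a conservative assumption).
    all_text = " ".join(str(data.get(f, "")) for f in sorted(REQUIRED_FIELDS))
    if len(all_text.split()) >= MAX_WORDS:
        problems.append("briefing is not under the word limit")

    # The disclaimer is excluded here because it legitimately mentions legal advice.
    body = " ".join(str(data.get(f, "")) for f in sorted(REQUIRED_FIELDS - {"disclaimer"}))
    for pattern in PROHIBITED_PATTERNS:
        if re.search(pattern, body, re.IGNORECASE):
            problems.append("prohibited wording matched: " + pattern)
    return problems
```

A non-empty result from `validate_briefing` could trigger a retry or a refusal message rather than showing the draft to an executive; which of those is appropriate is a design decision left open here.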