Context
FinEdge is building a sales-assist copilot for account executives. One feature drafts short customer-facing explanations of technical concepts, including the difference between generative AI and traditional machine learning, tailored to non-technical buyers.
Constraints
- p95 latency: at most 1,200 ms per response
- Cost ceiling: $6K/month at 100K requests/month ($0.06 per request on average)
- Hallucination ceiling: <2% on a 200-prompt golden set (at most 3 flagged responses out of 200)
- Tone must be business-friendly and accurate, and must not overclaim capabilities
- Must refuse or hedge when asked for unsupported ROI, legal, or compliance claims
- Output must be structured so the downstream UI can render: audience, answer, bullets, risks, cta
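The structured-output constraint can be made concrete with a small validator. The sketch below is an illustration under stated assumptions, not a specified contract: the five field names come from the constraint above, while the JSON encoding, the field types (strings for audience, answer, and cta; lists of strings for bullets and risks), and the function name `parse_response` are hypothetical choices.

```python
import json

# Field names come from the output constraint; the types are assumptions.
EXPECTED_FIELDS = {
    "audience": str,
    "answer": str,
    "bullets": list,
    "risks": list,
    "cta": str,
}


def parse_response(raw: str) -> dict:
    """Parse a model reply and check it against the assumed schema.

    Raises ValueError on malformed JSON, missing or extra fields,
    or wrong field types, so the caller can retry or fall back.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")

    missing = EXPECTED_FIELDS.keys() - data.keys()
    extra = data.keys() - EXPECTED_FIELDS.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")

    for name, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(data[name], expected_type):
            raise ValueError(f"{name} must be {expected_type.__name__}")

    # bullets and risks are assumed to be lists of plain strings.
    for name in ("bullets", "risks"):
        if not all(isinstance(item, str) for item in data[name]):
            raise ValueError(f"{name} must contain only strings")
    return data
```

Rejecting extra fields as well as missing ones is a deliberate but optional choice here: it keeps unexpected model output from reaching the UI silently.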
Available Resources
- 1,500 historical sales-engineering responses labeled as strong / weak
- Product-approved messaging guide with definitions, approved claims, and banned phrases
- A small taxonomy of customer personas: CIO, Head of Data, Operations Lead, SMB Owner
- Access to a GPT-4-class or Claude-class model via API
- 200 evaluation prompts covering simple asks, adversarial asks, and requests containing false assumptions
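One way to turn the hallucination ceiling into a pass/fail check over the evaluation prompts is sketched below. It assumes the 200 evaluation prompts serve as the golden set and that each response has already been labeled as hallucinated or not by a reviewer or grader; the function name and the label format are hypothetical.

```python
def passes_hallucination_ceiling(labels: list[bool], ceiling: float = 0.02) -> bool:
    """Return True when the hallucination rate is strictly below the ceiling.

    labels[i] is True if response i was judged to contain a hallucination.
    The default ceiling of 0.02 mirrors the <2% constraint.
    """
    if not labels:
        raise ValueError("no labeled responses")
    rate = sum(labels) / len(labels)
    return rate < ceiling
```

With 200 labeled responses, three hallucinations (1.5%) pass and four (2.0%) fail, because the constraint is a strict inequality.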
Task
- Design a prompt-based solution that explains the difference between generative AI and traditional machine learning to a customer, while adapting tone and depth by persona.
- Define an evaluation plan before architecture: how you will measure factual accuracy, clarity, refusal quality, hallucination rate, and consistency with approved messaging.
- Propose the runtime architecture, including prompt construction, structured output validation, fallback behavior, and monitoring.
- Estimate cost and latency at target volume, and describe optimizations if the first design misses either budget.
- Identify likely failure modes, such as hallucinated business claims, prompt injection through user input, and invalid structured output, and propose a mitigation for each.
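For the cost estimate, a starting point is the per-request budget implied by the constraints, compared against a token-based cost for a candidate design. The sketch below is only a rough calculator: the budget and volume come from the constraints, but the token counts and per-1K-token prices passed in are placeholders to be replaced with measured values and the actual provider's pricing, and all function names are hypothetical.

```python
MONTHLY_BUDGET_USD = 6_000   # from the cost ceiling
MONTHLY_REQUESTS = 100_000   # from the target volume


def per_request_budget() -> float:
    """Average spend allowed per request under the monthly ceiling."""
    return MONTHLY_BUDGET_USD / MONTHLY_REQUESTS


def estimate_request_cost(
    input_tokens: int,
    output_tokens: int,
    usd_per_1k_input: float,
    usd_per_1k_output: float,
) -> float:
    """Token-based cost of one model call; prices are caller-supplied."""
    return (
        input_tokens / 1000 * usd_per_1k_input
        + output_tokens / 1000 * usd_per_1k_output
    )


def within_budget(cost_per_request: float, calls_per_request: float = 1.0) -> bool:
    """Check a design against the budget.

    calls_per_request above 1.0 accounts for retries or fallback calls
    (for example, re-asking after invalid structured output).
    """
    return cost_per_request * calls_per_request <= per_request_budget()


if __name__ == "__main__":
    # Placeholder numbers only, not real prices or measured token counts.
    cost = estimate_request_cost(1500, 300, usd_per_1k_input=0.01, usd_per_1k_output=0.03)
    print(f"budget per request: ${per_request_budget():.3f}")
    print(f"estimated cost:     ${cost:.3f}")
    print("within budget:", within_budget(cost, calls_per_request=1.2))
```

If the estimate exceeds the budget, the same calculator shows which lever matters most: shortening the prompt, capping output length, or reducing the retry rate.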