Discover Growth Opportunities from Intuit Signals

Context

Intuit wants a GenAI-powered analyst assistant that scans user feedback and product signals across TurboTax, Credit Karma, and QuickBooks to uncover new growth opportunities, such as unmet jobs-to-be-done, onboarding friction, or cross-sell moments. The output will be used by Product Growth Analysts, so recommendations must be evidence-backed and auditable.

Constraints

The daily batch job for leadership review must process 2M text records/day and finish within 45 minutes.
Interactive analyst drill-downs must return in under 4 seconds at p95.
Cost ceiling: $25K/month, covering a projected 8K analyst queries/month plus the daily batch jobs.
Hallucination ceiling: fewer than 4% unsupported claims on a labeled golden set.
Every recommendation must cite supporting evidence, drawn from approved sources only.
The system must resist prompt injection from user-generated text and must not expose PII from customer feedback.
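
The batch and cost constraints above imply a minimum throughput and a per-record budget that any design has to fit inside. The sketch below works through that arithmetic. The 30-day month and the 80/20 split of budget between batch and interactive traffic are assumptions made for illustration; neither is stated in the problem.

```python
# Back-of-envelope sizing from the stated constraints.
RECORDS_PER_DAY = 2_000_000        # from the constraints
BATCH_WINDOW_MIN = 45              # from the constraints
MONTHLY_BUDGET_USD = 25_000        # from the constraints
ANALYST_QUERIES_PER_MONTH = 8_000  # from the constraints
DAYS_PER_MONTH = 30                # assumption
BATCH_BUDGET_SHARE = 0.8           # assumption: 80% of spend goes to batch


def required_throughput(records: int, window_min: float) -> float:
    """Records per second needed to finish the batch inside the window."""
    return records / (window_min * 60)


def budget_per_record(budget_usd: float, batch_share: float,
                      records_per_day: int, days: int) -> float:
    """USD available per batch record under the assumed budget split."""
    return budget_usd * batch_share / (records_per_day * days)


def budget_per_query(budget_usd: float, batch_share: float,
                     queries_per_month: int) -> float:
    """USD available per interactive analyst query."""
    return budget_usd * (1 - batch_share) / queries_per_month


if __name__ == "__main__":
    rps = required_throughput(RECORDS_PER_DAY, BATCH_WINDOW_MIN)
    per_rec = budget_per_record(MONTHLY_BUDGET_USD, BATCH_BUDGET_SHARE,
                                RECORDS_PER_DAY, DAYS_PER_MONTH)
    per_q = budget_per_query(MONTHLY_BUDGET_USD, BATCH_BUDGET_SHARE,
                             ANALYST_QUERIES_PER_MONTH)
    print(f"throughput needed: {rps:.0f} records/s")
    print(f"batch budget: ${per_rec:.6f} per record")
    print(f"interactive budget: ${per_q:.3f} per query")
```

Under these assumptions the batch path needs roughly 741 records/s and has about $0.0003 per record, which suggests that sending every record through a large LLM is unlikely to fit; cheaper steps such as embedding, clustering, or sampling before any LLM call are the usual way to close that gap.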

Available Data / Models

12 months of anonymized support chats, app reviews, NPS verbatims, community posts, and call transcripts from TurboTax, Credit Karma, and QuickBooks
Structured product telemetry: funnel steps, drop-off events, plan type, tenure, acquisition channel, and feature adoption
Existing warehouse tables with user segment metadata and experiment history
Approved LLMs from OpenAI or Anthropic, plus an internal vector store and BM25 search
A small team of Growth Analysts available to label a 300-example golden set
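
A 300-example golden set limits how precisely the 4% hallucination ceiling can be verified. The sketch below computes a Wilson score interval for an observed unsupported-claim rate. It assumes one labeled claim per example and independent labels, which the problem does not specify; the function name is illustrative.

```python
import math


def wilson_interval(failures: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical outcome: 6 unsupported claims among 300 labeled claims (2%).
    low, high = wilson_interval(6, 300)
    print(f"observed 2.0%, 95% interval: {low:.1%} to {high:.1%}")
```

With 6 unsupported claims out of 300, the upper bound of the interval sits slightly above 4%, so a single pass over a set this size cannot confidently demonstrate compliance unless the observed rate is well under 2% or each example contributes several labeled claims.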

Deliverables

Design an eval-first LLM system that identifies and ranks new growth opportunities, including how retrieval, prompting, and structured output work.
Write the core system prompt that forces grounded recommendations with citations, confidence, and refusal behavior when evidence is weak.
Define offline and online evaluation, including how you measure opportunity quality, hallucination, prompt-injection robustness, and analyst usefulness.
Estimate cost and latency for both daily batch processing and interactive analyst queries, and explain key tradeoffs.
List major failure modes and mitigations, especially around unsupported insights, stale evidence, segment bias, and PII leakage.
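
One way to make the citation, confidence, and refusal requirements checkable is to validate each recommendation as a structured record before it reaches an analyst. The sketch below is illustrative only: the field names, the source labels, and the thresholds (two citations, confidence of at least 0.5) are assumptions, not part of the problem statement.

```python
from dataclasses import dataclass, field

# Assumed labels for the approved sources listed under Available Data / Models.
APPROVED_SOURCES = {
    "support_chats", "app_reviews", "nps_verbatims",
    "community_posts", "call_transcripts", "product_telemetry",
}


@dataclass
class Citation:
    source: str      # must be one of APPROVED_SOURCES
    record_id: str   # identifier of the cited evidence record


@dataclass
class Recommendation:
    opportunity: str
    confidence: float                      # model-reported, in [0, 1]
    citations: list[Citation] = field(default_factory=list)


def validate(rec: Recommendation, min_citations: int = 2,
             min_confidence: float = 0.5) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if len(rec.citations) < min_citations:
        problems.append("insufficient evidence: too few citations")
    for c in rec.citations:
        if c.source not in APPROVED_SOURCES:
            problems.append(f"unapproved source: {c.source}")
    if not 0.0 <= rec.confidence <= 1.0:
        problems.append("confidence out of range")
    elif rec.confidence < min_confidence:
        problems.append("confidence below threshold: refuse or flag")
    return problems
```

A record that fails validation would be dropped or returned to the analyst as an explicit refusal rather than shown as a recommendation. A check like this only confirms that citations exist and come from approved sources; whether the cited text actually supports the claim still needs a separate grounding evaluation.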

