Define AI Agent Retention Metrics

Business Context

You’re the analytics lead for NovaDesk, a B2B SaaS customer support platform serving 6,000 companies and ~1.8M end-customer conversations/day. NovaDesk recently launched an AI support agent that can autonomously resolve tickets (refunds, password resets, delivery status) and escalate to humans when needed. The AI agent is priced as an add-on: $0.65 per successful resolution or $49/seat/month for “agent-assisted” workflows.

After a strong launch, leadership is worried about retention: many customers try the agent for a week and then revert to human-only workflows. The Head of Product asks you to define “AI agent retention” in a way that is (a) meaningful for revenue, (b) resistant to gaming (e.g., forcing the agent to touch every ticket), and (c) actionable for product and ML teams.

Metric Scenario

In the last 8 weeks:

Trial starts increased from 220 → 410 orgs/week.
But Week-4 logo retention for the agent add-on appears to have dropped from 52% → 41%.
At the same time, the ML team shipped a new policy that increases “agent autonomy” (fewer human escalations), and Support Ops changed default routing rules so the agent sees more low-complexity tickets.

Stakeholders disagree on what “retained” means:

Sales wants “retained” to mean the customer is still paying.
Product wants “retained” to mean the agent is used regularly.
ML wants “retained” to mean the agent is performing well enough that customers trust it.

You have one week to propose a retention measurement framework to be used in the next QBR, and it must work for both self-serve SMB and enterprise customers with custom workflows.

Available Data

Source	What it captures	Grain
`agent_conversations`	Each conversation handled by AI/human, timestamps, outcome	conversation_id
`agent_actions`	Tool calls (refund API, order lookup), policy version, latency	action_id
`org_billing`	Plan, add-on status, invoices, credits, churn dates	org_id, invoice
`routing_rules_audit`	Routing configuration changes over time	org_id, change_event
`human_handoff`	Escalations, reasons, time-to-human	handoff_event
`csat_surveys`	End-user CSAT and comments	conversation_id

Your Task (what you must produce)

Define “AI agent retention” with at least two complementary retention metrics:
- one that reflects commercial retention (revenue/logo)
- one that reflects product retention (repeat usage / habit)
- optionally a third that reflects trust/quality retention
Specify:
- the unit of retention (org, admin user, agent instance, end-customer)
- the return event (what counts as “using the agent again”)
- the time window (D1/D7/W4/90-day) and why
Provide a calculation approach including cohorting rules and edge cases:
- orgs with seasonal volume
- orgs that route only a subset of ticket types to the agent
- “accidental usage” (agent touched a ticket but immediately handed off)
Decompose retention into actionable drivers (product, routing, model quality, operations), and list the top hypotheses for the observed W4 drop.
Propose guardrails to ensure retention improvements don’t come from harmful behavior (e.g., higher deflection but worse CSAT, or cost blowups from tool calls).
Recommend 3–5 actions you’d take if your decomposition shows retention is primarily driven by (a) poor outcomes, (b) latency/reliability, or (c) misconfigured routing.

Business Context

Metric Scenario

In the last 8 weeks:

Trial starts increased from 220 → 410 orgs/week.
But Week-4 logo retention for the agent add-on appears to have dropped from 52% → 41%.
At the same time, the ML team shipped a new policy that increases “agent autonomy” (fewer human escalations), and Support Ops changed default routing rules so the agent sees more low-complexity tickets.

Stakeholders disagree on what “retained” means:

Sales wants “retained” to mean the customer is still paying.
Product wants “retained” to mean the agent is used regularly.
ML wants “retained” to mean the agent is performing well enough that customers trust it.

You have one week to propose a retention measurement framework to be used in the next QBR, and it must work for both self-serve SMB and enterprise customers with custom workflows.

Available Data

Source	What it captures	Grain
`agent_conversations`	Each conversation handled by AI/human, timestamps, outcome	conversation_id
`agent_actions`	Tool calls (refund API, order lookup), policy version, latency	action_id
`org_billing`	Plan, add-on status, invoices, credits, churn dates	org_id, invoice
`routing_rules_audit`	Routing configuration changes over time	org_id, change_event
`human_handoff`	Escalations, reasons, time-to-human	handoff_event
`csat_surveys`	End-user CSAT and comments	conversation_id

Your Task (what you must produce)

Define “AI agent retention” with at least two complementary retention metrics:
- one that reflects commercial retention (revenue/logo)
- one that reflects product retention (repeat usage / habit)
- optionally a third that reflects trust/quality retention
Specify:
- the unit of retention (org, admin user, agent instance, end-customer)
- the return event (what counts as “using the agent again”)
- the time window (D1/D7/W4/90-day) and why
Provide a calculation approach including cohorting rules and edge cases:
- orgs with seasonal volume
- orgs that route only a subset of ticket types to the agent
- “accidental usage” (agent touched a ticket but immediately handed off)
Decompose retention into actionable drivers (product, routing, model quality, operations), and list the top hypotheses for the observed W4 drop.
Propose guardrails to ensure retention improvements don’t come from harmful behavior (e.g., higher deflection but worse CSAT, or cost blowups from tool calls).
Recommend 3–5 actions you’d take if your decomposition shows retention is primarily driven by (a) poor outcomes, (b) latency/reliability, or (c) misconfigured routing.

Business Context

Metric Scenario

In the last 8 weeks:

Trial starts increased from 220 → 410 orgs/week.
But Week-4 logo retention for the agent add-on appears to have dropped from 52% → 41%.
At the same time, the ML team shipped a new policy that increases “agent autonomy” (fewer human escalations), and Support Ops changed default routing rules so the agent sees more low-complexity tickets.

Stakeholders disagree on what “retained” means:

Sales wants “retained” to mean the customer is still paying.
Product wants “retained” to mean the agent is used regularly.
ML wants “retained” to mean the agent is performing well enough that customers trust it.

You have one week to propose a retention measurement framework to be used in the next QBR, and it must work for both self-serve SMB and enterprise customers with custom workflows.

Available Data

Source	What it captures	Grain
`agent_conversations`	Each conversation handled by AI/human, timestamps, outcome	conversation_id
`agent_actions`	Tool calls (refund API, order lookup), policy version, latency	action_id
`org_billing`	Plan, add-on status, invoices, credits, churn dates	org_id, invoice
`routing_rules_audit`	Routing configuration changes over time	org_id, change_event
`human_handoff`	Escalations, reasons, time-to-human	handoff_event
`csat_surveys`	End-user CSAT and comments	conversation_id

Your Task (what you must produce)

Define “AI agent retention” with at least two complementary retention metrics:
- one that reflects commercial retention (revenue/logo)
- one that reflects product retention (repeat usage / habit)
- optionally a third that reflects trust/quality retention
Specify:
- the unit of retention (org, admin user, agent instance, end-customer)
- the return event (what counts as “using the agent again”)
- the time window (D1/D7/W4/90-day) and why
Provide a calculation approach including cohorting rules and edge cases:
- orgs with seasonal volume
- orgs that route only a subset of ticket types to the agent
- “accidental usage” (agent touched a ticket but immediately handed off)
Decompose retention into actionable drivers (product, routing, model quality, operations), and list the top hypotheses for the observed W4 drop.
Propose guardrails to ensure retention improvements don’t come from harmful behavior (e.g., higher deflection but worse CSAT, or cost blowups from tool calls).
Recommend 3–5 actions you’d take if your decomposition shows retention is primarily driven by (a) poor outcomes, (b) latency/reliability, or (c) misconfigured routing.

Business Context

Metric Scenario

In the last 8 weeks:

Trial starts increased from 220 → 410 orgs/week.
But Week-4 logo retention for the agent add-on appears to have dropped from 52% → 41%.
At the same time, the ML team shipped a new policy that increases “agent autonomy” (fewer human escalations), and Support Ops changed default routing rules so the agent sees more low-complexity tickets.

Stakeholders disagree on what “retained” means:

Sales wants “retained” to mean the customer is still paying.
Product wants “retained” to mean the agent is used regularly.
ML wants “retained” to mean the agent is performing well enough that customers trust it.

You have one week to propose a retention measurement framework to be used in the next QBR, and it must work for both self-serve SMB and enterprise customers with custom workflows.

Available Data

Source	What it captures	Grain
`agent_conversations`	Each conversation handled by AI/human, timestamps, outcome	conversation_id
`agent_actions`	Tool calls (refund API, order lookup), policy version, latency	action_id
`org_billing`	Plan, add-on status, invoices, credits, churn dates	org_id, invoice
`routing_rules_audit`	Routing configuration changes over time	org_id, change_event
`human_handoff`	Escalations, reasons, time-to-human	handoff_event
`csat_surveys`	End-user CSAT and comments	conversation_id

Your Task (what you must produce)

Define “AI agent retention” with at least two complementary retention metrics:
- one that reflects commercial retention (revenue/logo)
- one that reflects product retention (repeat usage / habit)
- optionally a third that reflects trust/quality retention
Specify:
- the unit of retention (org, admin user, agent instance, end-customer)
- the return event (what counts as “using the agent again”)
- the time window (D1/D7/W4/90-day) and why
Provide a calculation approach including cohorting rules and edge cases:
- orgs with seasonal volume
- orgs that route only a subset of ticket types to the agent
- “accidental usage” (agent touched a ticket but immediately handed off)
Decompose retention into actionable drivers (product, routing, model quality, operations), and list the top hypotheses for the observed W4 drop.
Propose guardrails to ensure retention improvements don’t come from harmful behavior (e.g., higher deflection but worse CSAT, or cost blowups from tool calls).
Recommend 3–5 actions you’d take if your decomposition shows retention is primarily driven by (a) poor outcomes, (b) latency/reliability, or (c) misconfigured routing.

Interview Guides

Business Context

Metric Scenario

Available Data

Your Task (what you must produce)

Define AI Agent Retention Metrics

Business Context

Metric Scenario

Available Data

Your Task (what you must produce)

Your Answer

Define AI Agent Retention Metrics

Business Context

Metric Scenario

Available Data

Your Task (what you must produce)

Define AI Agent Retention Metrics

Business Context

Metric Scenario

Available Data

Your Task (what you must produce)

Your Answer