Dataford
Interview Guides
Upgrade
All questions/Pipelines/AI Agent Usage Credits Data Model

AI Agent Usage Credits Data Model

Hard
Pipelines
Asked at 1 company1Data ModelingETL
Also asked at
R

Problem

Context

You’re interviewing with FinPilot, a fintech SaaS platform that provides automated bookkeeping and cash-flow forecasting for ~120K businesses (SMBs + mid-market). FinPilot has launched a new AI Agent feature that can: (1) reconcile transactions, (2) draft invoices, (3) answer “why did my cash drop?” questions, and (4) generate monthly close checklists. The agent is monetized via credits: customers buy credit packs and each agent run consumes credits based on model, tokens, tools used, and runtime.

Today, product analytics is stitched together from application logs and a daily billing export. This has caused multiple incidents: customers disputing invoices because retries double-counted credits, the support team lacking a consistent view of “what happened in this run,” and engineering being blind to rising tool-call error rates that correlate with a specific model rollout. Leadership wants a single source of truth for agent usage that supports: billing accuracy, near-real-time monitoring, and self-serve analytics.

Scale & Freshness Requirements

  • Traffic: ~2.5M agent runs/day average, 10M/day peak during month-end close.
  • Event volume: Each run emits 10–200 events (start, tool calls, token usage, retries, errors, finish). Peak ~50K events/sec.
  • Latency:
    • Operational dashboards (errors, latency, credit burn): P95 < 2 minutes end-to-end.
    • Finance/billing correctness: daily close with T+1 finalization, but must support late-arriving events up to 7 days.
  • Storage: retain raw events 90 days; curated fact tables 2+ years.
  • Correctness: credits must be exactly-once billed per run, even with retries and partial failures.

Data Characteristics & Quality Issues

Events come from multiple producers:

  1. Agent Orchestrator service (Kubernetes): emits lifecycle events (run_started, run_completed, run_failed).
  2. LLM Gateway: emits token usage (prompt_tokens, completion_tokens), model name/version, and request IDs.
  3. Tool execution service: emits tool_call_started/tool_call_finished, tool name, latency, and error codes.
  4. Billing service: emits credit ledger mutations (credit_granted, credit_reserved, credit_charged, credit_refunded).

Common issues you must design for:

  • Retries: same run_id may have multiple attempts; tool calls can be retried with the same idempotency key.
  • Out-of-order delivery: tool events can arrive before run_started due to buffering.
  • Schema evolution: new tools and new models add fields frequently.
  • PII/regulated data: prompts may contain customer financial details; raw prompts must not land in analytics without redaction.

Your Task

Design a complete pipeline + data model to track AI Agent usage, credit consumption, and error rates. Your design must support both near-real-time monitoring and billing-grade accuracy.

Functional Requirements

  1. Canonical event model for agent usage that supports lifecycle, attempts, tool calls, token usage, and errors.
  2. Credit accounting model that ties credit charges/refunds to a run and prevents double billing.
  3. Error-rate analytics by tool, model, customer tier, and agent workflow, with drill-down to run/attempt.
  4. Late-arriving and out-of-order events: correct aggregates when events arrive up to 7 days late.
  5. Idempotent ingestion: duplicates must not inflate usage or credits.
  6. Support backfills for historical reprocessing (e.g., model pricing change, bug fix in token accounting).

Non-Functional Requirements

  1. Auditability: finance can trace an invoice line item back to the raw events and ledger entries.
  2. Data privacy: do not store raw prompts; store hashes/metadata only; enforce least privilege.
  3. High availability: pipeline should tolerate a Kafka broker outage and a streaming job restart without data loss.
  4. Cost control: keep incremental platform spend under $40K/month.

Constraints

  • Cloud: AWS. Warehouse: Snowflake (already used by Finance).
  • Existing tools: Kafka (MSK), Airflow 2.x, dbt, S3.
  • Team: 5 data engineers; strong SQL/dbt skills, moderate Spark experience.
  • Compliance: SOC2; prompts and tool payloads may contain sensitive financial info.

Deliverables (What the interviewer expects)

  • A proposed warehouse schema (facts/dimensions) with keys, grain, and example columns.
  • How you’ll compute:
    • credits consumed per run/day/customer
    • error rates (by tool/model/workflow)
    • token usage and cost attribution
  • How you’ll handle deduplication, retries, late events, and backfills.
  • Monitoring, alerting, and failure recovery strategies.

Problem

Context

You’re interviewing with FinPilot, a fintech SaaS platform that provides automated bookkeeping and cash-flow forecasting for ~120K businesses (SMBs + mid-market). FinPilot has launched a new AI Agent feature that can: (1) reconcile transactions, (2) draft invoices, (3) answer “why did my cash drop?” questions, and (4) generate monthly close checklists. The agent is monetized via credits: customers buy credit packs and each agent run consumes credits based on model, tokens, tools used, and runtime.

Today, product analytics is stitched together from application logs and a daily billing export. This has caused multiple incidents: customers disputing invoices because retries double-counted credits, the support team lacking a consistent view of “what happened in this run,” and engineering being blind to rising tool-call error rates that correlate with a specific model rollout. Leadership wants a single source of truth for agent usage that supports: billing accuracy, near-real-time monitoring, and self-serve analytics.

Scale & Freshness Requirements

  • Traffic: ~2.5M agent runs/day average, 10M/day peak during month-end close.
  • Event volume: Each run emits 10–200 events (start, tool calls, token usage, retries, errors, finish). Peak ~50K events/sec.
  • Latency:
    • Operational dashboards (errors, latency, credit burn): P95 < 2 minutes end-to-end.
    • Finance/billing correctness: daily close with T+1 finalization, but must support late-arriving events up to 7 days.
  • Storage: retain raw events 90 days; curated fact tables 2+ years.
  • Correctness: credits must be exactly-once billed per run, even with retries and partial failures.

Data Characteristics & Quality Issues

Events come from multiple producers:

  1. Agent Orchestrator service (Kubernetes): emits lifecycle events (run_started, run_completed, run_failed).
  2. LLM Gateway: emits token usage (prompt_tokens, completion_tokens), model name/version, and request IDs.
  3. Tool execution service: emits tool_call_started/tool_call_finished, tool name, latency, and error codes.
  4. Billing service: emits credit ledger mutations (credit_granted, credit_reserved, credit_charged, credit_refunded).

Common issues you must design for:

  • Retries: same run_id may have multiple attempts; tool calls can be retried with the same idempotency key.
  • Out-of-order delivery: tool events can arrive before run_started due to buffering.
  • Schema evolution: new tools and new models add fields frequently.
  • PII/regulated data: prompts may contain customer financial details; raw prompts must not land in analytics without redaction.

Your Task

Design a complete pipeline + data model to track AI Agent usage, credit consumption, and error rates. Your design must support both near-real-time monitoring and billing-grade accuracy.

Functional Requirements

  1. Canonical event model for agent usage that supports lifecycle, attempts, tool calls, token usage, and errors.
  2. Credit accounting model that ties credit charges/refunds to a run and prevents double billing.
  3. Error-rate analytics by tool, model, customer tier, and agent workflow, with drill-down to run/attempt.
  4. Late-arriving and out-of-order events: correct aggregates when events arrive up to 7 days late.
  5. Idempotent ingestion: duplicates must not inflate usage or credits.
  6. Support backfills for historical reprocessing (e.g., model pricing change, bug fix in token accounting).

Non-Functional Requirements

  1. Auditability: finance can trace an invoice line item back to the raw events and ledger entries.
  2. Data privacy: do not store raw prompts; store hashes/metadata only; enforce least privilege.
  3. High availability: pipeline should tolerate a Kafka broker outage and a streaming job restart without data loss.
  4. Cost control: keep incremental platform spend under $40K/month.

Constraints

  • Cloud: AWS. Warehouse: Snowflake (already used by Finance).
  • Existing tools: Kafka (MSK), Airflow 2.x, dbt, S3.
  • Team: 5 data engineers; strong SQL/dbt skills, moderate Spark experience.
  • Compliance: SOC2; prompts and tool payloads may contain sensitive financial info.

Deliverables (What the interviewer expects)

  • A proposed warehouse schema (facts/dimensions) with keys, grain, and example columns.
  • How you’ll compute:
    • credits consumed per run/day/customer
    • error rates (by tool/model/workflow)
    • token usage and cost attribution
  • How you’ll handle deduplication, retries, late events, and backfills.
  • Monitoring, alerting, and failure recovery strategies.
Your answer
Try one AI text evaluation on us
Get structured feedback, scored against a 4-axis rubric. Premium unlocks unlimited.
0 wordstarget ~200
Up next
NokiaQuantify Impact of AI Support AgentMediumDefine AI Agent Retention MetricsMediumDatabricksDesign a multi-agent enterprise analytics assistant on Databricks with governance and cost controlsHard
Next question