Implement a Minimal MLflow Agent Evaluation Runner and Explain Why Databricks

Easy

Coding

Asked at 1 company

Also asked at

Databricks

Problem

Write a small evaluation runner in Python that executes an agent over a dataset of prompts, captures outputs and latency, computes simple task metrics (for example exact-match or rubric-based pass/fail), and logs per-example plus aggregate results to MLflow. The runner should be modular so that the same interface could later support additional metrics such as faithfulness, groundedness, or LLM-as-Judge.

After the coding portion, briefly explain in comments or a docstring why Databricks is a strong platform for this workflow, referencing Mosaic AI, unified governance via Unity Catalog, model serving, and agentic system evaluation.

Expected solution outline: define an evaluator loop, a metric function registry, an MLflow logging structure, robust exception handling, and concise reasoning about the Databricks platform advantage for GenAI productionization.
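One possible shape for an answer is sketched below. It is a minimal illustration under stated assumptions, not a reference solution: the dataset is assumed to be a list of dicts with `prompt` and `expected` keys, the agent is assumed to be any callable that takes a prompt string, and the names `METRICS`, `register_metric`, `evaluate`, `contains_expected`, and `log_to_mlflow` are hypothetical. The only MLflow calls used are `mlflow.start_run`, `mlflow.log_metrics`, and `mlflow.log_dict`; the import is deferred so the evaluator loop can be exercised without MLflow installed.

```python
"""Minimal agent evaluation runner (illustrative sketch).

Why Databricks for this workflow (brief, hedged): Databricks provides managed
MLflow, so runs logged here can sit next to the data and models they describe;
Unity Catalog offers one governance layer for datasets and registered models;
Mosaic AI covers model serving and agent evaluation, so offline metrics like
these could later be complemented by judged metrics on the same platform.
"""
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

# Metric registry: name -> function(output, expected) -> score in [0, 1].
# Later metrics (faithfulness, LLM-as-Judge) would register the same way.
METRICS: Dict[str, Callable[[str, str], float]] = {}


def register_metric(name: str):
    """Decorator that adds a metric function to the registry."""
    def decorator(fn):
        METRICS[name] = fn
        return fn
    return decorator


@register_metric("exact_match")
def exact_match(output: str, expected: str) -> float:
    return float(output.strip().lower() == expected.strip().lower())


@register_metric("contains_expected")
def contains_expected(output: str, expected: str) -> float:
    # A very simple rubric-style pass/fail: does the answer mention the target?
    return float(expected.strip().lower() in output.lower())


def evaluate(
    agent: Callable[[str], Any],
    dataset: List[Dict[str, str]],
    metric_names: Optional[List[str]] = None,
) -> Tuple[List[Dict[str, Any]], Dict[str, float]]:
    """Run the agent on every example; return (per-example rows, aggregates)."""
    names = metric_names or list(METRICS)
    rows: List[Dict[str, Any]] = []
    for i, example in enumerate(dataset):
        row: Dict[str, Any] = {"index": i, "prompt": example["prompt"],
                               "output": None, "error": None}
        start = time.perf_counter()
        try:
            row["output"] = str(agent(example["prompt"]))
        except Exception as exc:  # one failing example must not abort the run
            row["error"] = f"{type(exc).__name__}: {exc}"
        row["latency_s"] = time.perf_counter() - start
        for name in names:
            # Failed examples score 0.0 on every metric.
            row[name] = (0.0 if row["error"]
                         else METRICS[name](row["output"], example["expected"]))
        rows.append(row)

    n = max(len(rows), 1)  # avoid division by zero on an empty dataset
    aggregate = {f"mean_{m}": sum(r[m] for r in rows) / n for m in names}
    aggregate["mean_latency_s"] = sum(r["latency_s"] for r in rows) / n
    aggregate["error_rate"] = sum(r["error"] is not None for r in rows) / n
    return rows, aggregate


def log_to_mlflow(rows, aggregate, run_name: str = "agent-eval") -> None:
    """Log aggregates as metrics and per-example rows as a JSON artifact."""
    import mlflow  # deferred import: evaluate() itself has no MLflow dependency

    with mlflow.start_run(run_name=run_name):
        mlflow.log_metrics(aggregate)
        mlflow.log_dict({"rows": rows}, "per_example_results.json")
```

A typical call would be `rows, aggregate = evaluate(my_agent, dataset)` followed by `log_to_mlflow(rows, aggregate)`. Keeping scoring separate from logging means the loop can be unit-tested with a stub agent, and a new metric only needs a function decorated with `register_metric`.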
