In a Mosaic AI evaluation workflow, you receive a list of examples where each example contains question, retrieved_passages, and model_answer. Write code that computes two heuristic metrics for each example: faithfulness and groundedness. Faithfulness should estimate whether the answer's claims are supported by the retrieved passages; groundedness should estimate how much of the answer is directly attributable to the retrieved context rather than being unsupported addition. You may assume simple tokenization and sentence splitting, and the scoring logic should run in batch over many examples.

Expected solution outline (a sketch follows the list):
- define sentence- or claim-level matching against the retrieved passages
- normalize text before comparison
- compute overlap/coverage-based scores
- discuss limitations versus LLM-based evaluators
- explain how these metrics would be logged into MLflow Agent Evaluation for downstream analysis
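One way to realize the outline is the minimal sketch below. The function names (normalize, split_sentences, sentence_support, score_example, score_batch) and the 0.6 support threshold are illustrative choices for this exercise, not part of any Mosaic AI API: sentences stand in for claims, and token overlap with the best-matching passage stands in for support.

```python
# A minimal heuristic sketch; names and thresholds are illustrative choices.
import re
from typing import Dict, List


def normalize(text: str) -> List[str]:
    """Lowercase and split into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitting on ., !, ? -- adequate for a heuristic."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_support(sentence: str, passages: List[str]) -> float:
    """Fraction of a sentence's tokens covered by the best-matching passage."""
    tokens = set(normalize(sentence))
    if not tokens:
        return 0.0
    best = 0.0
    for passage in passages:
        passage_tokens = set(normalize(passage))
        best = max(best, len(tokens & passage_tokens) / len(tokens))
    return best


def score_example(example: Dict) -> Dict[str, float]:
    """Compute both heuristics for one example.

    faithfulness: share of answer sentences whose token overlap with some
    passage clears a support threshold (claim-level support).
    groundedness: mean token coverage of answer sentences by the retrieved
    context (how much of the answer is attributable to it).
    """
    passages = example["retrieved_passages"]
    sentences = split_sentences(example["model_answer"])
    if not sentences:
        return {"faithfulness": 0.0, "groundedness": 0.0}
    supports = [sentence_support(s, passages) for s in sentences]
    threshold = 0.6  # illustrative cutoff for calling a sentence "supported"
    return {
        "faithfulness": sum(s >= threshold for s in supports) / len(supports),
        "groundedness": sum(supports) / len(supports),
    }


def score_batch(examples: List[Dict]) -> List[Dict[str, float]]:
    """Run the per-example scorer over a whole evaluation set."""
    return [score_example(ex) for ex in examples]


if __name__ == "__main__":
    demo = [{
        "question": "What is the capital of France?",
        "retrieved_passages": ["Paris is the capital and largest city of France."],
        "model_answer": "The capital of France is Paris. It has 12 million residents.",
    }]
    # The second answer sentence is unsupported, so both scores come out 0.5.
    print(score_batch(demo))
```

On limitations: token overlap treats paraphrase, synonymy, and entailed claims as unsupported and can reward verbatim copying, whereas LLM-based judges assess semantic support directly at higher cost and latency; the heuristic version is best viewed as a cheap batch pre-filter or sanity check alongside an LLM evaluator.

For logging, the sketch below uses the generic MLflow tracking APIs (mlflow.log_metric and mlflow.log_table), which exist in MLflow 2.x. How Agent Evaluation itself ingests custom per-example scores (for example, via mlflow.evaluate on Databricks) depends on your platform version, so treat the run name, artifact file name, and table schema here as assumptions for illustration.

```python
# Hedged logging sketch; run name, artifact file, and schema are illustrative.
import mlflow
import pandas as pd


def log_scores(examples, scores):
    if not scores:
        return
    with mlflow.start_run(run_name="heuristic-rag-eval"):
        # Aggregate metrics enable run-level comparison across models.
        mlflow.log_metric("mean_faithfulness",
                          sum(s["faithfulness"] for s in scores) / len(scores))
        mlflow.log_metric("mean_groundedness",
                          sum(s["groundedness"] for s in scores) / len(scores))
        # A per-example table supports downstream slicing in the MLflow UI.
        table = pd.DataFrame({
            "question": [ex["question"] for ex in examples],
            "model_answer": [ex["model_answer"] for ex in examples],
            "faithfulness": [s["faithfulness"] for s in scores],
            "groundedness": [s["groundedness"] for s in scores],
        })
        mlflow.log_table(table, artifact_file="per_example_scores.json")
```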