
In a Databricks AI pipeline, mlflow.evaluate and LLM-as-Judge outputs may attach free-form issue labels such as faithfulness, groundedness, or prompt injection to each response. Write idiomatic Python to normalize these labels into a clean, deterministic list suitable for downstream use in Mosaic AI and Model Serving logs.
Implement a function that takes a list of raw label strings and returns a sorted list of unique normalized labels.
labels, a list of strings.
Example 1
Input: labels = [" Faithfulness ", "groundedness", "Prompt Injection", "prompt_injection"]
Output: ["faithfulness", "groundedness", "prompt-injection"]
Explanation: Prompt Injection and prompt_injection normalize to the same canonical label.
Example 2
Input: labels = ["DBRX", "dbrx!!", "", " ", "model-serving"]
Output: ["dbrx", "model-serving"]
Explanation: Empty values are ignored, punctuation is removed, and duplicates are collapsed.
1 <= len(labels) <= 10^4
0 <= len(labels[i]) <= 200
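
One possible solution is sketched below. The problem does not spell out the normalization rules, so they are inferred from the two examples and should be read as assumptions: trim and lowercase each label, drop every character other than ASCII letters, digits, whitespace, underscores, and hyphens, collapse runs of whitespace, underscores, and hyphens into a single hyphen, and discard labels that end up empty. The names normalize_label and normalize_labels are illustrative and are not part of MLflow or any Databricks API.

```python
import re
from typing import Iterable, List

# Assumption: only ASCII letters, digits, and separators carry meaning.
# Everything else (for example "!!") is treated as punctuation and dropped.
_PUNCTUATION = re.compile(r"[^a-z0-9\s_\-]")

# Assumption: whitespace, underscores, and hyphens are interchangeable
# separators, so any run of them collapses to a single hyphen.
_SEPARATORS = re.compile(r"[\s_\-]+")


def normalize_label(raw: str) -> str:
    """Return the canonical form of one label, or "" if nothing remains."""
    label = raw.strip().lower()
    label = _PUNCTUATION.sub("", label)
    label = _SEPARATORS.sub("-", label)
    # Remove hyphens left dangling at either end, e.g. from "_label_".
    return label.strip("-")


def normalize_labels(labels: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated canonical labels."""
    # The set removes duplicates; sorted() makes the order deterministic.
    return sorted({norm for raw in labels if (norm := normalize_label(raw))})
```

Under these assumptions, normalize_labels reproduces the outputs of Example 1 and Example 2. Each label is processed in time linear in its length, and the final sort dominates at O(n log n) for n labels, which is comfortable for the stated constraints.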