At BuildForge, a test is considered flaky if it has both passed and failed across multiple CI runs. Given a list of test execution records, return the names of all flaky tests in lexicographical order.
Implement a function that scans the records and detects which tests are inconsistent over time.
records: a list of pairs [test_name, status]
test_name is a non-empty string
status is either "pass" or "fail"
Example 1
Input: records = [["login", "pass"], ["checkout", "fail"], ["login", "fail"], ["search", "pass"]]
Output: ["login"]
login appears once as pass and once as fail, so it is flaky.
Example 2
Input: records = [["a", "pass"], ["b", "fail"], ["a", "pass"], ["b", "fail"]]
Output: []
Neither test changes outcome, so no test is flaky.
1 <= len(records) <= 10^5
1 <= len(test_name) <= 100
status is either "pass" or "fail"
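One possible approach, sketched below in Python, is to track which test names have been seen passing and which have been seen failing, then return the sorted intersection of the two sets. The function name find_flaky_tests is an assumption made for illustration; the problem statement does not prescribe a signature. The sketch also assumes every status is exactly "pass" or "fail", as the constraints state.

```python
from typing import List


def find_flaky_tests(records: List[List[str]]) -> List[str]:
    """Return names of tests seen with both outcomes, sorted lexicographically.

    Hypothetical helper: the name and signature are illustrative only.
    """
    passed = set()  # names with at least one "pass" record
    failed = set()  # names with at least one "fail" record

    for test_name, status in records:
        if status == "pass":
            passed.add(test_name)
        else:
            # Per the stated constraints, any other status is "fail".
            failed.add(test_name)

    # A test is flaky when it appears in both sets.
    return sorted(passed & failed)


if __name__ == "__main__":
    example_1 = [["login", "pass"], ["checkout", "fail"],
                 ["login", "fail"], ["search", "pass"]]
    example_2 = [["a", "pass"], ["b", "fail"], ["a", "pass"], ["b", "fail"]]
    print(find_flaky_tests(example_1))  # ['login']
    print(find_flaky_tests(example_2))  # []
```

This makes a single pass over the records and then sorts only the flaky names, so the cost is roughly O(n + k log k) for n records and k flaky tests, which fits comfortably within the 10^5 record limit.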