A global e-commerce marketplace is onboarding a new region with an unknown alphabet order. Their search index stores product titles in the region’s language, and the titles are already sorted according to the region’s lexicographic rules. To correctly tokenize and normalize text at scale (millions of daily searches), you need to infer a valid character ordering.
You are given words: list[str], a list of lowercase strings sorted lexicographically according to an unknown alien alphabet. Your task is to return a single deterministic alphabet order as a string.
"".'a' < 'b' < ...'z' comparison).For each adjacent pair of words (a, b), find the first position where they differ. If a[i] != b[i], then a[i] must come before b[i] in the alien alphabet (directed edge a[i] -> b[i]). Only the first mismatch matters.
Example 1
words = ["wrt", "wrf", "er", "ett", "rftt"]"wertf"t→f, w→e, r→t, e→r. The only valid order is w,e,r,t,f.Example 2
words = ["ab", "adc"]"abcd"b vs d, so b→d. Characters are {a,b,c,d}. Many orders satisfy b before d, and the lexicographically smallest valid one is "abcd".