Problem Narrative
You’re building a search deduplication service for a large e-commerce marketplace (10M+ daily searches). Users often misspell or reorder characters when searching for SKUs and promo codes (e.g., "SAVE10" vs "10EVAS"). To reduce redundant indexing and improve cache hit rates, you need a fast function that determines whether two strings are anagrams under strict normalization rules.
Task
Implement a function:
- Function: are_anagrams(s: str, t: str) -> bool
Return True if s and t are anagrams after applying the following rules:
- Case-insensitive: treat uppercase and lowercase letters as the same.
- Ignore non-alphanumeric characters: spaces, punctuation, and symbols should be skipped.
- Count-based: strings are anagrams if they contain the same characters with the same frequencies after normalization.
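The three rules above can be sketched as one normalization pass followed by a frequency comparison. This is a minimal illustration rather than a reference solution: the helper name `normalize` is an assumption introduced here, and it keeps only ASCII letters and digits, in line with the stated constraints.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase the text and keep only ASCII letters and digits.

    `normalize` is a hypothetical helper name, not part of the task statement.
    """
    return "".join(ch.lower() for ch in text if ch.isascii() and ch.isalnum())


# Count-based comparison: equal Counters mean the same characters
# with the same frequencies.
same = Counter(normalize("Dormitory")) == Counter(normalize("Dirty room!!"))
print(normalize("Dirty room!!"), same)  # dirtyroom True
```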
Input/Output
- Input: two strings, s and t
- Output: boolean indicating whether they are anagrams under the rules above
Examples
Example 1
- Input: s = "Dormitory", t = "Dirty room!!"
- Output: True
- Explanation:
  - Normalize: "dormitory" and "dirtyroom"
  - Both have identical character counts → anagrams.
Example 2
- Input: s = "promo-code-2026", t = "2026 code promo"
- Output: True
- Explanation:
  - Normalize: "promocode2026" and "2026codepromo"
  - Same multiset of characters → anagrams.
Constraints
- 0 <= len(s), len(t) <= 2 * 10^5
- Strings may contain any ASCII characters.
Notes / Clarifications
- Two empty strings are anagrams.
- If the normalized strings differ in length, you can immediately return False.
- Aim for linear time, O(len(s) + len(t)), to support high-QPS traffic in production services.
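One way these notes could fit together is sketched below, assuming ASCII-only input as stated in the constraints. It filters both strings, exits early on a length mismatch, and then uses a fixed 128-slot count table so the whole check is a constant number of linear passes. The count table is an illustrative choice, not a required approach; collections.Counter would work as well.

```python
def are_anagrams(s: str, t: str) -> bool:
    """Sketch of a linear-time anagram check under the stated rules."""
    # Keep only ASCII letters and digits, lowercased.
    a = [ch.lower() for ch in s if ch.isascii() and ch.isalnum()]
    b = [ch.lower() for ch in t if ch.isascii() and ch.isalnum()]

    # Early exit: different normalized lengths can never match.
    if len(a) != len(b):
        return False

    # 128 slots cover every ASCII code point; only letters and digits are used.
    counts = [0] * 128
    for ch in a:
        counts[ord(ch)] += 1
    for ch in b:
        counts[ord(ch)] -= 1
        if counts[ord(ch)] < 0:
            return False  # t has more of this character than s

    # Equal lengths and no negative count imply every count is zero.
    return True


print(are_anagrams("Dormitory", "Dirty room!!"))  # True
print(are_anagrams("", ""))                       # True
print(are_anagrams("SAVE10", "SAVE11"))           # False
```

Because the lengths are already known to be equal, a single negative count is enough to reject, so no final scan of the table is needed.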