You’re on-call for a fintech payments platform processing millions of card authorizations per day. A React checkout page occasionally crashes, and your error pipeline receives huge volumes of stack traces from different CDNs and builds (cache-busting query strings). To triage quickly, you need to deduplicate traces that are “the same” after normalization.
Each stack trace is a list of frame strings in this format:
<function>@<url>:<line>:<column>
Example frame:
render@https://cdn.app.com/static/js/main.9f3a.js?build=123:120:9
For each frame, convert it to:
<function>@<file_name>:<line>
Where:
file_name).?) before extracting the file name.A, A, A, B → A, B).Implement:
def group_stack_traces(traces: list[list[str]]) -> dict[str, int]:
Return a dictionary mapping a stringified tuple of canonical frames to the number of traces that normalize to that exact canonical sequence.
Note: The key must be the Python
str()of the tuple (e.g.,"('a@app.js:1', 'b@app.js:2')") to match the evaluation harness.
Input
traces = [["render@https://cdn.app.com/static/js/main.9f3a.js?build=123:120:9", "render@https://cdn.app.com/static/js/main.9f3a.js?build=123:120:9", "commitRoot@https://cdn.app.com/static/js/vendor.1a2b.js:88:1"], ["render@https://cdn.app.com/static/js/main.9f3a.js?build=999:120:3", "commitRoot@https://cdn.app.com/static/js/vendor.1a2b.js?x=y:88:99"]]
Output
{"('render@main.9f3a.js:120', 'commitRoot@vendor.1a2b.js:88')": 2}
Explanation
Both traces canonicalize to the same 2-frame sequence; the repeated render frame in the first trace is collapsed.
Input
traces = [["A@https://a.com/app.js:10:1", "B@https://a.com/app.js:11:1"], ["A@https://a.com/app.js:10:9", "B@https://a.com/app.js:12:1"]]
Output
{"('A@app.js:10', 'B@app.js:11')": 1, "('A@app.js:10', 'B@app.js:12')": 1}
Explanation Columns are ignored, but line numbers are kept, so the second frame differs (11 vs 12).
1 <= len(traces) <= 2 * 10^41 <= len(trace) <= 200sum(len(trace) for trace in traces) <= 2 * 10^6'@' and at least two ':' after the URL