A fintech security team monitors tens of millions of authentication events per day across mobile and web. Analysts keep repeating the same triage steps: normalize messy logs, deduplicate near-identical events, and quickly identify the most suspicious source IPs. You’re building a small “custom security tool” function that automates this repetitive workflow.
You are given a list of log lines. Each line is a single string formatted as:
"<timestamp> <ip> <action> <status>"
timestamp is an integer (seconds since epoch)
ip is an IPv4 string
action is a lowercase token (e.g., login, reset, transfer)
status is either OK or FAIL

Multiple lines may be duplicated. Lines may also be semantically identical but differ in whitespace or in the casing of status (e.g., fail, FAIL, extra spaces). Your tool must normalize and aggregate.
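The normalization step could be sketched as follows. This is a minimal sketch, not the required implementation: the helper name `parse_line` and the rough IPv4 shape check are assumptions, and malformed lines are silently dropped, which matches the behavior shown in Example 2 below.

```python
import re

# Rough IPv4 shape check (an assumption; the problem only says "an IPv4 string").
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def parse_line(line: str):
    """Normalize one raw log line into (timestamp, ip, action, status), or None if malformed."""
    parts = line.split()  # split() collapses leading/trailing/duplicated whitespace
    if len(parts) != 4:
        return None
    ts, ip, action, status = parts
    if not ts.isdigit() or not IPV4.match(ip):
        return None
    return int(ts), ip, action, status.upper()  # uppercase so fail/Fail/FAIL compare equal
```

For instance, `parse_line("1700000001 10.0.0.1 login fail")` yields `(1700000001, "10.0.0.1", "login", "FAIL")`, while the malformed `" 170 x.y.z login FAIL "` yields `None`.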
For each IP, compute a suspicion score:
score = 3 * (# of FAIL events) + 1 * (# of distinct actions that had at least one FAIL)
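In code the formula is a one-liner; the helper name `suspicion_score` is hypothetical, used only for illustration:

```python
def suspicion_score(fail_count: int, distinct_failed_actions: int) -> int:
    # 3 points per FAIL event, plus 1 per distinct action that failed at least once
    return 3 * fail_count + distinct_failed_actions
```

In Example 1 below, 10.0.0.2 has 2 FAIL events across 2 distinct actions, so `suspicion_score(2, 2)` gives 8.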
Return the top k IPs by descending score. Break ties by:
higher FAIL count, then …

Input: logs: list[str], k: int
Output: list[str] of length min(k, number_of_distinct_ips)

Status is compared case-insensitively (fail, FAIL, Fail all mean FAIL).

Example 1
logs = [
    "1700000000 10.0.0.1 login FAIL",
    "1700000001 10.0.0.1 login fail",
    "1700000002 10.0.0.2 login FAIL",
    "1700000003 10.0.0.2 reset OK",
    "1700000004 10.0.0.2 transfer FAIL"
], k = 2

Output: ['10.0.0.2', '10.0.0.1']

Explanation:
10.0.0.1: FAIL=2, distinct failed actions={login} => score=3*2+1=7
10.0.0.2: FAIL=2, distinct failed actions={login, transfer} => score=3*2+2=8

Example 2
logs = [" 170 x.y.z login FAIL ", "1700000000 1.1.1.1 login OK", "1700000001 1.1.1.1 reset FAIL"], k = 5

Output: ['1.1.1.1']

Explanation:
The first line is malformed (x.y.z is not a valid IPv4 address) and is dropped during normalization.
1.1.1.1: FAIL=1, distinct failed actions={reset} => score=3*1+1=4

Constraints:
1 <= logs.length <= 2 * 10^5
… <= 200
0 <= k <= 10^5
action contains no spaces.
If k = 0, return an empty list.
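Putting the pieces together, one possible end-to-end sketch follows. The function name `top_suspicious_ips`, the choice to deduplicate only identical normalized events, and the final lexicographic-IP tie-break are all assumptions (the source truncates the last tie-break rule).

```python
from collections import defaultdict

def top_suspicious_ips(logs: list[str], k: int) -> list[str]:
    if k == 0:
        return []
    seen = set()                       # dedupe identical events after normalization (assumption)
    fails = defaultdict(int)           # ip -> number of FAIL events
    failed_actions = defaultdict(set)  # ip -> actions that failed at least once
    ips = set()
    for line in logs:
        parts = line.split()           # normalizes stray whitespace
        if len(parts) != 4 or not parts[0].isdigit():
            continue                   # skip malformed lines, as in Example 2
        ts, ip, action, status = parts
        octets = ip.split(".")
        if len(octets) != 4 or not all(o.isdigit() for o in octets):
            continue                   # rough IPv4 check (assumption)
        event = (int(ts), ip, action, status.upper())
        if event in seen:
            continue
        seen.add(event)
        ips.add(ip)
        if event[3] == "FAIL":
            fails[ip] += 1
            failed_actions[ip].add(action)
    score = {ip: 3 * fails[ip] + len(failed_actions[ip]) for ip in ips}
    ranked = sorted(ips, key=lambda ip: (-score[ip], -fails[ip], ip))
    return ranked[:k]
```

On Example 1 this returns ['10.0.0.2', '10.0.0.1'] (scores 8 and 7); on Example 2 it returns ['1.1.1.1'], since the x.y.z line is discarded and k is capped at the number of distinct IPs.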