At Meta, security systems ingest large server logs from services such as Proxygen and edge infrastructure. Write a function that parses log lines and returns the source IP addresses that appear to be scanning for open ports.
An IP is considered suspicious if, within any sliding time window of window_seconds, it attempts connections to at least threshold distinct destination ports.
Implement:
logs: a list of strings, where each string has the format:"<timestamp> <source_ip> <destination_port>"window_seconds: integer window size in secondsthreshold: integer minimum number of distinct destination ports in the windowReturn a list of suspicious source IPs in lexicographic order.
Each timestamp is a non-negative integer. Multiple log lines may share the same timestamp. If a line is malformed, ignore it.
Example 1
Input:
logs = [
"1 10.0.0.1 22",
"2 10.0.0.1 80",
"3 10.0.0.1 443",
"4 10.0.0.2 22"
], window_seconds = 3, threshold = 3
Output:
["10.0.0.1"]
10.0.0.1 touches 3 distinct ports within timestamps 1..3.
Example 2
Input:
logs = ["1 1.1.1.1 80", "10 1.1.1.1 443", "20 1.1.1.1 8080"], window_seconds = 5, threshold = 2
Output:
[]
No 5-second window contains 2 distinct ports.
1 <= len(logs) <= 2 * 10^51 <= window_seconds <= 10^61 <= threshold <= 655351 <= destination_port <= 65535