Detect Configuration Drift Events

Problem Narrative

You’re on the SRE team at a fintech payments processor where 5,000 Linux servers handle card authorizations. A single unintended config change (e.g., TLS ciphers, firewall rules, kernel params) can cause outages or compliance violations. Your monitoring agent streams configuration snapshots to a central service, and you must continuously detect and report configuration drift.

Formal Problem Statement

You are given a stream of snapshot events. Each event is a tuple:

timestamp (int): seconds since epoch
server_id (int)
config (dict[str, str]): key/value configuration pairs

You are also given:

baseline (dict[str, str]): the desired configuration
k (int): the number of most drifted servers to report after processing all events

A server is considered drifted at time t if its latest snapshot at or before t differs from the baseline.

Define the drift score of a server as the number of keys where the latest config differs from baseline, counting:

Keys present in baseline but missing in server config (treated as drift).
Keys present in server config but not in baseline (treated as drift).
Keys present in both but with different values (treated as drift).
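The three cases above can be counted directly. The following is a minimal sketch, not a reference solution; the function name drift_score is an assumption introduced here for illustration.

```python
def drift_score(config: dict[str, str], baseline: dict[str, str]) -> int:
    """Count the keys on which a server config and the baseline disagree.

    Hypothetical helper: the name and signature are illustrative only.
    """
    score = 0
    for key, expected in baseline.items():
        # A missing key and a mismatched value each count as one drifted key.
        if key not in config or config[key] != expected:
            score += 1
    # Keys that exist only in the server config also count as drift.
    score += sum(1 for key in config if key not in baseline)
    return score
```

For instance, with baseline {"a":"1","b":"2"}, the config {"a":"0"} scores 2: one mismatched value plus one missing key.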

Task

Implement a function that processes the events in chronological order (events may be unsorted; you must handle this) and returns the k server IDs with the highest drift score after all events are applied.

Tie-breaking: If multiple servers have the same drift score, return the smaller server_id first.

Output

Return a list of server IDs of length min(k, number_of_seen_servers), sorted by:

Drift score descending
server_id ascending
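One way to express this two-level ordering is a composite sort key. The sketch below assumes the drift scores have already been computed into a dict mapping server_id to score; both that mapping and the function name top_k_by_drift are hypothetical.

```python
import heapq


def top_k_by_drift(scores: dict[int, int], k: int) -> list[int]:
    """Return up to k server IDs: highest score first, smaller ID on ties.

    Hypothetical helper; `scores` maps server_id -> drift score.
    """
    # Negating the score turns "smallest key" into "highest drift", and the
    # second tuple element breaks ties by ascending server_id.
    return heapq.nsmallest(k, scores, key=lambda sid: (-scores[sid], sid))
```

heapq.nsmallest returns fewer than k items when fewer servers exist, which matches the min(k, number_of_seen_servers) length requirement.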

Examples

Example 1

Input:

baseline = {"a":"1","b":"2"}
events = [(10, 1, {"a":"1","b":"9"}), (11, 2, {"a":"0"})]
k = 2

Output: [2, 1]

Explanation:

Server 1 differs on b only → score 1.
Server 2 differs on a (0 vs 1) and is missing b → score 2.

Example 2

Input:

baseline = {"x":"on"}
events = [(5, 7, {"x":"on","extra":"1"}), (3, 7, {"x":"off"})]
k = 1

Output: [7]

Explanation: Events are applied by timestamp: at t=3 the config is {x:off}, then at t=5 it becomes {x:on, extra:1}. The only remaining drift is the extra key → score 1.
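Because each snapshot replaces the previous one, unsorted input can be handled without sorting: keep only the snapshot with the largest timestamp per server. The sketch below assumes that, when two events for the same server share a timestamp, the one appearing later in the input wins; the problem statement does not specify this case. The name latest_snapshots is hypothetical.

```python
def latest_snapshots(
    events: list[tuple[int, int, dict[str, str]]],
) -> dict[int, dict[str, str]]:
    """Map each server_id to the config from its newest snapshot."""
    latest: dict[int, tuple[int, dict[str, str]]] = {}
    for timestamp, server_id, config in events:
        current = latest.get(server_id)
        # ">=" lets a later-listed event win a timestamp tie (an assumption).
        if current is None or timestamp >= current[0]:
            latest[server_id] = (timestamp, config)
    return {server_id: config for server_id, (_, config) in latest.items()}
```

Applied to the events of Example 2, this keeps only the t=5 snapshot for server 7, regardless of the order in which the two events arrive.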

Constraints

1 <= len(events) <= 2 * 10^5
1 <= server_id <= 5000
1 <= len(baseline) <= 10^4
Each config has up to 10^4 keys
Total number of key/value pairs across all events <= 2 * 10^5
1 <= k <= 5000

Notes / Clarifications

Only the latest snapshot per server matters in the final answer.
You should not compare every server to baseline on every event; design for scale.
Values are case-sensitive strings; keys are unique within a config.
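Taken together, these notes suggest a two-pass approach: first reduce the stream to one snapshot per server, then score each server exactly once. The sketch below is one possible design, not the intended solution. It relies on the identity score = len(baseline) + extras - matches, which allows scoring by iterating over the server's own keys only, so total work stays proportional to the number of key/value pairs in the input. The name most_drifted and the tie rule for equal timestamps (later-listed event wins) are assumptions.

```python
import heapq


def most_drifted(
    events: list[tuple[int, int, dict[str, str]]],
    baseline: dict[str, str],
    k: int,
) -> list[int]:
    """Return up to k server IDs ranked by final drift score (hypothetical API)."""
    # Pass 1: keep only the newest snapshot per server; no sorting needed.
    latest: dict[int, tuple[int, dict[str, str]]] = {}
    for timestamp, server_id, config in events:
        seen = latest.get(server_id)
        # Assumption: on equal timestamps, the later-listed event wins.
        if seen is None or timestamp >= seen[0]:
            latest[server_id] = (timestamp, config)

    # Pass 2: score each server once, touching only its own keys.
    scores: dict[int, int] = {}
    for server_id, (_, config) in latest.items():
        matches = extras = 0
        for key, value in config.items():
            if key not in baseline:
                extras += 1          # key exists only on the server
            elif baseline[key] == value:
                matches += 1         # key agrees with the baseline
        # Every baseline key that is not an exact match is drifted
        # (missing or different), plus the server-only keys.
        scores[server_id] = len(baseline) + extras - matches

    # Highest score first; ties broken by smaller server_id.
    return heapq.nsmallest(k, scores, key=lambda sid: (-scores[sid], sid))
```

On Example 1 this returns [2, 1], and on Example 2 it returns [7].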

