You’re on the fraud and reliability team at a high-volume fintech processor that ingests tens of millions of card transactions per day. Due to retries, network partitions, and at-least-once delivery, the same transaction can be logged multiple times. Duplicates inflate revenue reporting, trigger false fraud alerts, and create audit issues.
You are given a list of transaction log entries. Each entry is a dictionary with:
- tx_id (string): globally unique transaction identifier
- timestamp (int): Unix epoch seconds when the log was written
- amount_cents (int): transaction amount in cents

A duplicate is defined as an entry with the same tx_id as another entry. When duplicates exist for the same tx_id, you must keep exactly one entry using the following rules:
1. Keep the entry with the smallest timestamp.
2. If timestamps tie, keep the entry with the smallest amount_cents.

Return the deduplicated logs as a list of entries sorted by timestamp ascending, and for ties by tx_id ascending.
Implement dedupe_transaction_logs(logs).
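A minimal sketch of one way to apply these rules, assuming each entry is a plain dict with exactly the three fields above: track the best surviving entry per tx_id in a single pass, then sort the survivors.

from typing import Dict, List

def dedupe_transaction_logs(logs: List[Dict]) -> List[Dict]:
    # Best surviving entry per tx_id.
    best: Dict[str, Dict] = {}
    for entry in logs:
        kept = best.get(entry["tx_id"])
        # Keep the entry with the smallest timestamp; on a timestamp tie,
        # keep the smallest amount_cents.
        if kept is None or (entry["timestamp"], entry["amount_cents"]) < (
            kept["timestamp"], kept["amount_cents"]
        ):
            best[entry["tx_id"]] = entry
    # Sort by timestamp ascending, ties broken by tx_id ascending.
    return sorted(best.values(), key=lambda e: (e["timestamp"], e["tx_id"]))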
Example 1
logs = [ {"tx_id":"A", "timestamp":100, "amount_cents":500}, {"tx_id":"B", "timestamp":101, "amount_cents":700}, {"tx_id":"A", "timestamp": 99, "amount_cents":500} ][ {"tx_id":"A", "timestamp": 99, "amount_cents":500}, {"tx_id":"B", "timestamp":101, "amount_cents":700} ]tx_id="A" appears twice; keep the earliest timestamp (99). Then sort by timestamp.Example 2
logs = [ {"tx_id":"X", "timestamp":200, "amount_cents":1000}, {"tx_id":"X", "timestamp":200, "amount_cents": 999}, {"tx_id":"Y", "timestamp":199, "amount_cents": 500} ][ {"tx_id":"Y", "timestamp":199, "amount_cents":500}, {"tx_id":"X", "timestamp":200, "amount_cents": 999} ]X, timestamps tie, so keep smaller amount (999).1 <= logs.length <= 2 * 10^51 <= len(tx_id) <= 640 <= timestamp <= 2 * 10^9-10^9 <= amount_cents <= 10^9