Detect Rare Payment Fraud Events

Business Context

PayLink processes roughly 12 million card transactions per day for mid-market e-commerce merchants. The fraud operations team needs a model that identifies extremely rare fraudulent transactions in near real time, where the positive class represents only 0.1% of all labeled examples.

Dataset

You are given a historical transaction dataset for supervised binary classification.

Feature Group	Count	Examples
Transaction attributes	14	amount, currency, merchant_category, payment_method, device_type
User behavior	11	transactions_1h, avg_amount_7d, failed_attempts_24h, account_age_days
Risk signals	9	ip_country_mismatch, velocity_score, email_domain_risk, prior_chargebacks
Temporal/context	8	hour_of_day, day_of_week, holiday_flag, merchant_region

Size: 8.4M transactions, 42 engineered and raw features
Target: is_fraud (1 = confirmed fraud, 0 = legitimate)
Class balance: 0.1% positive, 99.9% negative
Missing data: 6% missing in device fingerprint fields, 18% missing in historical behavior features for new users
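
To make the imbalance concrete, the sketch below works out how many fraud examples the stated size and class balance imply, and shows one possible way to keep missingness as a signal rather than silently imputing it. The helper name add_missing_indicators and the "_missing" suffix are assumptions for illustration, not part of the dataset.

```python
# Figures taken from the dataset description above.
N_TRANSACTIONS = 8_400_000
POSITIVE_RATE = 0.001  # 0.1% fraud

n_pos = round(N_TRANSACTIONS * POSITIVE_RATE)  # confirmed fraud rows
n_neg = N_TRANSACTIONS - n_pos                 # legitimate rows
neg_to_pos = n_neg / n_pos                     # a common starting point for class weighting


def add_missing_indicators(row, fields):
    """Return a copy of `row` with a 0/1 '<field>_missing' flag per field.

    Hypothetical helper: new users lack behavior history, so the fact that
    a value is missing may itself be informative.
    """
    out = dict(row)
    for field in fields:
        out[field + "_missing"] = 1 if row.get(field) is None else 0
    return out


if __name__ == "__main__":
    print(n_pos, n_neg, neg_to_pos)
    print(add_missing_indicators({"amount": 42.0, "avg_amount_7d": None}, ["avg_amount_7d"]))
```

With only about 8,400 positives, any validation split should be checked to confirm it still holds enough fraud cases for stable precision and recall estimates.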

Success Criteria

A good solution should:

achieve recall ≥ 75% on fraudulent transactions,
maintain precision ≥ 10% at the operating threshold,
improve analyst efficiency with lift > 20x in the top 0.5% scored transactions.
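
The three criteria can be computed directly from labels and model scores. The following is a minimal sketch in plain Python; the function names are illustrative assumptions, and a production evaluation would more likely use a vectorized library.

```python
def precision_recall_at_threshold(y_true, scores, threshold):
    """Precision and recall when every score >= threshold is flagged."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lift_at_top_fraction(y_true, scores, fraction):
    """Fraud rate in the top `fraction` of scores divided by the overall fraud rate."""
    n = len(y_true)
    k = max(1, int(n * fraction))
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(y_true) / n
    return top_rate / base_rate if base_rate else float("nan")


if __name__ == "__main__":
    labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
    preds = [0.9, 0.8, 0.1, 0.2, 0.3, 0.1, 0.05, 0.02, 0.01, 0.0]
    print(precision_recall_at_threshold(labels, preds, 0.5))
    print(lift_at_top_fraction(labels, preds, 0.2))
```

As a sanity check on the targets: at a 0.1% base rate, lift > 20x in the top 0.5% means more than 2% of those transactions are fraud, which corresponds to capturing more than 10% of all fraud in that slice. The maximum achievable lift there is 200x, reached only if every fraud case lands in the top 0.5%.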

Constraints

Online inference latency must stay under 50 ms per transaction.
The fraud team needs feature-level explanations for flagged transactions.
False positives are costly because they block legitimate payments.
Training can run daily; scoring must support real-time serving.
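
One way to satisfy the explanation requirement cheaply is to start from a linear baseline, where each feature's contribution to the log-odds is simply its weight times its value. The sketch below assumes a logistic-regression-style model with already-scaled inputs; the weights, bias, and function name are hypothetical and chosen only for illustration.

```python
import math


def explain_linear_score(weights, bias, features, top_n=3):
    """Return (fraud probability, top contributing features) for a linear model.

    Contributions are in log-odds units: weight * feature value.
    Features absent from `features` are treated as 0.0 (an assumption).
    """
    contributions = {
        name: weight * features.get(name, 0.0) for name, weight in weights.items()
    }
    logit = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    top = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]
    return probability, top


if __name__ == "__main__":
    # Hypothetical weights, not fitted to any real data.
    w = {"velocity_score": 2.0, "ip_country_mismatch": 1.0, "account_age_days": -0.01}
    x = {"velocity_score": 1.0, "ip_country_mismatch": 1.0, "account_age_days": 100.0}
    print(explain_linear_score(w, bias=-3.0, features=x))
```

A dictionary lookup and a handful of multiplications fit comfortably within a 50 ms budget. For a tree-based production candidate, a comparable per-feature attribution method would be needed, and its latency should be measured rather than assumed.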

Deliverables

Propose a modeling approach for extreme class imbalance (0.1% positive rate).
Describe preprocessing, feature engineering, and leakage prevention.
Train and evaluate a baseline and a stronger production candidate.
Choose decision thresholds based on business tradeoffs, not accuracy.
Explain how you would monitor precision, recall, drift, and calibration after deployment.
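
For the threshold deliverable, one simple policy is to pick the threshold that maximizes recall subject to the precision floor from the success criteria. The sketch below is one possible implementation, not a prescribed method; choose_threshold is a hypothetical name, and it assumes scores come from a held-out set that is later in time than the training data.

```python
def choose_threshold(y_true, scores, min_precision=0.10):
    """Return (threshold, precision, recall) with the highest recall such that
    precision >= min_precision, or None if no threshold qualifies.

    Transactions with score >= threshold are flagged.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(y_true)
    best = None
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume every example tied at this score so the cut is well defined.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (current, precision, recall)
    return best


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0, 0]
    preds = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
    print(choose_threshold(labels, preds, min_precision=0.6))
```

If the best recall returned falls below the 75% target, the model cannot meet both criteria at once, and the tradeoff should be raised with the fraud team instead of being resolved silently by lowering the precision floor.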
