Detect Card Fraud with Imbalanced Data

Business Context

PayWave processes roughly 8 million card transactions per day. Fraud is rare but costly, and the risk team needs a model that identifies suspicious transactions for manual review without overwhelming analysts with false positives.

Dataset

You are given a historical transaction dataset for binary classification.

Feature Group	Count	Examples
Transaction attributes	12	amount, merchant_category, entry_mode, device_type
Customer behavior	9	avg_txn_amount_7d, txn_count_24h, chargeback_count_90d
Merchant signals	6	merchant_risk_score, country_match, prior_fraud_rate
Temporal/location	5	hour_of_day, day_of_week, geo_distance_from_home
Derived features	8	amount_vs_customer_avg, rapid_repeat_flag, cross_border_flag

Size: 1.2M transactions, 40 features
Target: is_fraud — 1 if the transaction was confirmed fraudulent within 30 days, else 0
Class balance: 0.9% positive, 99.1% negative
Missing data: ~12% missing in merchant risk fields for new merchants; ~4% missing in device attributes
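
To make the class balance and missing-data figures concrete, here is a minimal sketch of two common preparatory steps: inverse-frequency class weights and explicit missingness flags. The helper names (`balanced_class_weights`, `add_missing_flag`) are hypothetical, and treating missingness as a signal (for example, a new merchant with no risk score) is an assumption about this dataset, not something the problem statement confirms.

```python
def balanced_class_weights(n_pos, n_neg):
    """Weights inversely proportional to class frequency (the 'balanced' heuristic)."""
    total = n_pos + n_neg
    return {0: total / (2 * n_neg), 1: total / (2 * n_pos)}


def add_missing_flag(row, field):
    """Return a copy of row with an explicit <field>_missing indicator (1 or 0)."""
    out = dict(row)
    out[field + "_missing"] = 1 if row.get(field) is None else 0
    return out


# Counts implied by the stated size and class balance: 0.9% of 1.2M rows.
n_total = 1_200_000
n_pos = round(n_total * 0.009)  # fraud rows
n_neg = n_total - n_pos         # legitimate rows
weights = balanced_class_weights(n_pos, n_neg)

# Hypothetical row for a new merchant with no risk score yet.
example = add_missing_flag({"amount": 42.0, "merchant_risk_score": None},
                           "merchant_risk_score")
```

With these numbers the negative-to-positive ratio is roughly 110 to 1, which is also the kind of value some gradient-boosting libraries accept as a positive-class weight. Whether weighting, resampling, or neither works best here is something to validate empirically.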

Success Criteria

A good solution should improve minority-class detection while keeping review volume manageable. Target at least 70% recall on fraud cases with precision >= 15% on the flagged set, and clearly justify threshold selection. Accuracy alone is not acceptable: with 0.9% positives, a model that flags nothing would still score 99.1% accuracy while catching no fraud.
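
One way to turn the recall and precision targets into a threshold is to sweep candidate thresholds from highest to lowest on a validation set and stop at the first one that satisfies both constraints. The sketch below assumes labels are 0/1 and scores are model probabilities. `pick_threshold` is a hypothetical helper, and picking the highest feasible threshold (that is, the smallest review queue) is one possible policy, not the only defensible one.

```python
def pick_threshold(y_true, scores, min_recall=0.70, min_precision=0.15):
    """Return (threshold, precision, recall) for the highest threshold that
    meets both constraints, or None if no threshold does.

    A transaction is flagged when its score is >= threshold.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        return None
    # Walk from the highest score down; each step flags one more transaction.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = flagged = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        flagged += 1
        # Evaluate only once per group of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = tp / flagged
        recall = tp / total_pos
        if recall >= min_recall and precision >= min_precision:
            return score, precision, recall
    return None


# Toy validation data, purely illustrative.
y_val = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
s_val = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
choice = pick_threshold(y_val, s_val)
```

The same targets also imply a rough review volume. If production fraud prevalence matched the dataset's 0.9% (an assumption), 8 million daily transactions would contain about 72,000 frauds; 70% recall catches about 50,400 of them, and at exactly 15% precision that means about 336,000 flagged transactions per day, or roughly 4.2% of volume. Comparing that figure with analyst capacity is part of justifying the threshold.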

Constraints

Batch scoring every 5 minutes; per-transaction inference should remain under 50 ms
Model output must support analyst review with explainable top drivers
False negatives are expensive, but false positives increase manual review cost
Training pipeline should be reproducible and avoid leakage from future information
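
The leakage constraint interacts with the 30-day label window: at the moment a model is trained, recent transactions may not have received their final fraud label yet. A minimal sketch of a time-ordered split that leaves a label-maturity gap is shown below. The `ts` field, the `temporal_split` helper, and the choice to drop the gap rows rather than use them some other way are all assumptions for illustration.

```python
from datetime import datetime, timedelta

LABEL_DELAY = timedelta(days=30)  # taken from the target definition


def temporal_split(rows, cutoff, label_delay=LABEL_DELAY):
    """Split rows (dicts with a 'ts' datetime) around a cutoff.

    Train: transactions whose label window closed before the cutoff.
    Test:  transactions at or after the cutoff.
    Rows inside the gap are dropped because their labels were not final
    at the cutoff.
    """
    train_end = cutoff - label_delay
    train = [r for r in rows if r["ts"] < train_end]
    test = [r for r in rows if r["ts"] >= cutoff]
    return train, test


# Hypothetical rows, purely illustrative.
rows = [
    {"id": "a", "ts": datetime(2023, 1, 1)},
    {"id": "b", "ts": datetime(2023, 2, 15)},  # falls in the maturity gap
    {"id": "c", "ts": datetime(2023, 3, 10)},
]
train_rows, test_rows = temporal_split(rows, cutoff=datetime(2023, 3, 1))
```

The same discipline applies to features: rolling aggregates such as avg_txn_amount_7d or prior_fraud_rate should be computed only from data available before each transaction's timestamp.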

Deliverables

Build a baseline and an improved classifier for this imbalanced dataset
Explain how you handle class imbalance in training and evaluation
Design preprocessing for missing values, categorical variables, and skewed numeric features
Choose decision threshold(s) based on business tradeoffs, not default 0.5
Report evaluation on a held-out test set using appropriate metrics
Summarize what you would deploy and how you would monitor it in production
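
For the monitoring deliverable, one commonly used drift check is the population stability index (PSI), which compares the binned distribution of scores (or of a feature) in production against a reference window. The sketch below is one possible formulation under stated assumptions: the bin edges, the `eps` floor for empty bins, and the helper names are illustrative choices, and the alerting threshold is left to the team.

```python
import math
from bisect import bisect_right


def bin_proportions(values, edges):
    """Share of values in each bin defined by sorted interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    n = len(values)
    return [c / n for c in counts]


def psi(expected_props, actual_props, eps=1e-6):
    """Population stability index between two binned distributions."""
    total = 0.0
    for e, a in zip(expected_props, actual_props):
        e = max(e, eps)  # floor to avoid log(0) on empty bins
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


# Toy score samples, purely illustrative.
reference = bin_proportions([0.1, 0.2, 0.6, 0.9], edges=[0.5])
drift_none = psi([0.5, 0.5], [0.5, 0.5])
drift_some = psi([0.5, 0.5], [0.6, 0.4])
```

Score drift is only a proxy. Because confirmed labels arrive up to 30 days late, delayed recall and precision on matured labels, along with daily flagged volume, would normally be tracked alongside it.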
