Business Context
Microsoft Store processes millions of payment attempts each month across cards, wallets, gift balances, and digital subscriptions. Fraud losses are material, but excessive false positives also block legitimate customers and create support costs, so the fraud model must trade recall against precision carefully on a highly imbalanced dataset.
Dataset
You are given a historical transaction dataset used for post-authorization fraud labeling.
| Feature Group | Count | Examples |
|---|---|---|
| Transaction attributes | 14 | amount, currency, payment_method, merchant_category, is_digital_good |
| Customer behavior | 11 | account_age_days, prior_chargebacks_90d, failed_logins_7d, avg_order_value_30d |
| Device and network | 9 | device_id_hash, browser_family, IP_country, ASN_risk_score |
| Velocity features | 8 | txns_last_10m, cards_per_device_24h, amount_sum_1h, distinct_accounts_per_ip_24h |
| Risk signals | 6 | AVS_result, CVV_result, 3DS_used, email_domain_risk, geodistance_km |
- Size: 4.8M transactions over 9 months, 48 modeled features
- Target: Binary fraud label confirmed within 45 days of transaction settlement
- Class balance: 0.42% fraud, 99.58% non-fraud
- Missing data: 18% missing in AVS/CVV-related fields, 7% missing in device attributes, and sparse values for new users
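Two dataset properties above shape the modeling work directly: the 0.42% fraud rate and the structured missingness in AVS/CVV and device fields. A minimal sketch of handling both, using a synthetic stand-in for the transaction table (the `AVS_result` column name comes from the brief; `is_fraud` and the distributions are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the transaction table; real data would be loaded instead.
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "amount": rng.lognormal(3, 1, n),
    "AVS_result": rng.choice(["Y", "N", None], n, p=[0.62, 0.20, 0.18]),
    "is_fraud": (rng.random(n) < 0.0042).astype(int),
})

# Missingness in AVS/CVV fields is often informative; encode it explicitly
# rather than imputing it away.
df["AVS_missing"] = df["AVS_result"].isna().astype(int)

# Imbalance ratio, commonly passed to boosted-tree learners as a
# positive-class weight (e.g. scale_pos_weight in XGBoost/LightGBM).
pos = int(df["is_fraud"].sum())
scale_pos_weight = (len(df) - pos) / max(pos, 1)
print(f"fraud rate: {pos / len(df):.4%}, pos-class weight: {scale_pos_weight:.0f}")
```

At the stated 0.42% base rate this weight lands near 237, which is why plain accuracy is useless here and ranking metrics dominate the success criteria below.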
Success Criteria
A strong solution should:
- achieve recall >= 75% on fraudulent transactions,
- maintain precision >= 18% at the operating threshold,
- deliver PR-AUC >= 0.30, and
- produce a ranked fraud score usable in Microsoft Azure batch scoring and near-real-time review queues.
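The criteria above combine a threshold-dependent operating point (recall, precision) with a threshold-free ranking metric (PR-AUC). A sketch of how all three would be evaluated with scikit-learn, on synthetic scores (the score distributions and the 0.55 threshold are illustrative assumptions, not values from the brief):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_score, recall_score

# Synthetic labels at roughly the brief's 0.42% fraud rate, with model scores
# shifted upward for the fraud class so the metrics are non-degenerate.
rng = np.random.default_rng(42)
y_true = (rng.random(50_000) < 0.0042).astype(int)
y_score = np.clip(rng.beta(2, 8, 50_000) + y_true * rng.beta(5, 3, 50_000) * 0.6, 0, 1)

threshold = 0.55  # the real operating threshold would be tuned on validation data
y_pred = (y_score >= threshold).astype(int)

print("recall:   ", recall_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, zero_division=0))
# average_precision_score is the standard PR-AUC estimate.
print("PR-AUC:   ", average_precision_score(y_true, y_score))
```

Note that PR-AUC is evaluated on the raw scores, not the thresholded predictions, which is what makes the same model usable for both batch scoring and ranked review queues.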
Constraints
- Inference latency must stay under 50 ms p95 per transaction in Azure Machine Learning online endpoints.
- The fraud operations team needs feature-level explanations for manual review.
- Labels arrive with delay, so validation must avoid temporal leakage.
- The review queue can only inspect the top 1.5% of scored transactions.
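Two of these constraints translate directly into evaluation code: delayed labels require an out-of-time split, and the 1.5% review capacity makes precision-at-top-k the metric the operations team actually experiences. A sketch of both, where `temporal_split` and `precision_at_rate` are illustrative helpers rather than anything named in the brief:

```python
import numpy as np
import pandas as pd

def temporal_split(df, time_col, train_frac=0.8):
    """Out-of-time split: train on earlier transactions, validate on later ones,
    so delayed fraud labels never leak future information into training."""
    cutoff = df[time_col].quantile(train_frac)
    return df[df[time_col] <= cutoff], df[df[time_col] > cutoff]

def precision_at_rate(y_true, y_score, review_rate=0.015):
    """Precision among the top-scored fraction the review queue can inspect."""
    k = max(1, int(len(y_score) * review_rate))
    top_idx = np.argsort(y_score)[::-1][:k]
    return float(np.asarray(y_true)[top_idx].mean())
```

In practice a gap period (here, 45 days) would also be left between the train and validation windows so that every training label has had time to mature.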
Deliverables
- Propose a modeling strategy for severe class imbalance in fraud detection.
- Explain how you would split data, engineer features, and avoid leakage.
- Train and evaluate a production-ready classifier with threshold tuning.
- Show how you would measure business impact using ranking and classification metrics.
- Describe deployment and monitoring considerations in Azure Machine Learning.
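For the threshold-tuning deliverable, one common approach is to pick the highest-precision operating point that still clears the recall floor from the success criteria. A sketch using scikit-learn's precision-recall curve; `tune_threshold` is an illustrative helper, and the recall floor of 0.75 mirrors the target stated above:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def tune_threshold(y_true, y_score, min_recall=0.75):
    """Return the threshold that maximizes precision subject to a recall floor."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # precision/recall have one more entry than thresholds; drop the final
    # (recall=0) point so all three arrays align.
    ok = recall[:-1] >= min_recall
    if not ok.any():
        raise ValueError("no threshold reaches the recall target")
    best = np.argmax(np.where(ok, precision[:-1], -1.0))
    return thresholds[best]
```

The tuned threshold would be fixed on the out-of-time validation window and then monitored in production, since score distributions drift as fraud patterns change.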