Detect Payment Fraud from Messy History

Business Context

PayLink processes roughly 8 million card-not-present transactions per month for small merchants. The risk team wants a fraud detection model that can score transactions in near real time despite noisy labels, inconsistent schemas across historical sources, and substantial missing data.

Dataset

Feature Group	Count	Examples
Transaction attributes	14	amount, currency, merchant_category, payment_method, hour_of_day
Customer history	11	account_age_days, prior_chargebacks_90d, avg_ticket_30d, txn_count_7d
Device / network	9	device_type, ip_country, vpn_flag, browser_family, velocity_by_ip_1h
Merchant signals	8	merchant_risk_score, dispute_rate_30d, country, MCC
Data quality flags	6	missing_email_flag, inconsistent_country_flag, stale_profile_flag

Size: 2.4M historical transactions over 18 months, 48 candidate features
Target: Binary. Transaction confirmed as fraudulent within 45 days (1) vs. legitimate (0)
Class balance: 1.3% positive, 98.7% negative
Missing data: 20% missing in device fields, 12% missing in customer history for new users, and inconsistent categorical values from legacy pipelines
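
To make the imbalance concrete, the arithmetic behind these figures can be sketched as below. The function name and dictionary keys are illustrative assumptions. The baseline uses the standard result that a random ranker's PR-AUC is approximately the positive rate, and the 0.35 target comes from the success criteria.

```python
def class_balance_summary(n_rows=2_400_000, positive_rate=0.013, target_pr_auc=0.35):
    """Back-of-the-envelope numbers implied by the dataset description."""
    n_pos = round(n_rows * positive_rate)   # confirmed fraud rows
    n_neg = n_rows - n_pos                  # legitimate rows
    baseline_pr_auc = positive_rate         # expected PR-AUC of a random ranker
    return {
        "positives": n_pos,
        "negatives": n_neg,
        "negatives_per_positive": n_neg / n_pos,
        "baseline_pr_auc": baseline_pr_auc,
        "target_over_baseline": target_pr_auc / baseline_pr_auc,
    }


summary = class_balance_summary()
print(summary)
```

With roughly 31,000 positives against about 2.37 million negatives, a PR-AUC of 0.35 is about 27 times the random baseline, which is why accuracy alone is not a useful metric here.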

Success Criteria

A good solution should achieve strong ranking quality and support an operational review queue. Targets: PR-AUC above 0.35, recall above 75% at an operating point where precision is above 20%, and lift above 6x in the top 1% of scored transactions.
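
The three targets can be computed from a ranked list of scores. The following is a minimal standard-library sketch, not a reference implementation: the function names are assumptions, score ties are broken by sort order rather than handled exactly, and PR-AUC is approximated by average precision.

```python
from typing import List, Sequence


def _labels_by_score(y_true: Sequence[int], scores: Sequence[float]) -> List[int]:
    """Return labels ordered from highest score to lowest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [y_true[i] for i in order]


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@rank taken at each positive (a PR-AUC estimate)."""
    labels = _labels_by_score(y_true, scores)
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    hits, ap = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank
    return ap / total_pos


def recall_at_precision(y_true, scores, min_precision: float) -> float:
    """Highest recall over all cutoffs whose precision is >= min_precision."""
    labels = _labels_by_score(y_true, scores)
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    best, hits = 0.0, 0
    for rank, label in enumerate(labels, start=1):
        hits += label
        if hits / rank >= min_precision:
            best = max(best, hits / total_pos)
    return best


def lift_at_top_fraction(y_true, scores, fraction: float) -> float:
    """Fraud rate in the top-scored fraction divided by the overall fraud rate."""
    labels = _labels_by_score(y_true, scores)
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        return 0.0
    k = max(1, int(len(labels) * fraction))
    return (sum(labels[:k]) / k) / base_rate


# Toy example: 2 frauds among 10 transactions, ranked 1st and 3rd.
y = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print(average_precision(y, s), recall_at_precision(y, s, 0.5), lift_at_top_fraction(y, s, 0.1))
```

On the real data, a 6x lift in the top 1% corresponds to a fraud rate of about 7.8% among those transactions (6 × 1.3%).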

Constraints

Inference latency must stay under 50 ms per transaction.
The model must be explainable enough for analysts to review flagged payments.
Historical data is messy: duplicate rows, delayed fraud labels, and schema drift across months.
Retraining budget supports a weekly batch retrain, not continuous online learning.
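
One way to handle the duplicate rows and delayed labels before a weekly retrain is sketched below. The field names `txn_id` and `txn_time` are hypothetical, and the sketch assumes duplicates are exact copies sharing a transaction ID. Rows younger than the 45-day label window are held out of training because their labels may still change, and the train/test split is by time so that evaluation mimics scoring future transactions.

```python
from datetime import datetime, timedelta


def mature_deduped_rows(rows, as_of, label_window_days=45):
    """Drop duplicate txn_ids and rows whose fraud label is not yet mature."""
    cutoff = as_of - timedelta(days=label_window_days)
    seen, kept = set(), []
    for row in sorted(rows, key=lambda r: r["txn_time"]):
        if row["txn_id"] in seen:
            continue  # duplicate of a row already processed
        seen.add(row["txn_id"])
        if row["txn_time"] <= cutoff:  # label has had the full window to arrive
            kept.append(row)
    return kept


def time_split(rows, split_time):
    """Train on transactions before split_time, evaluate on those at or after it."""
    train = [r for r in rows if r["txn_time"] < split_time]
    test = [r for r in rows if r["txn_time"] >= split_time]
    return train, test


rows = [
    {"txn_id": "a", "txn_time": datetime(2024, 1, 10), "is_fraud": 0},
    {"txn_id": "a", "txn_time": datetime(2024, 1, 10), "is_fraud": 0},  # duplicate
    {"txn_id": "b", "txn_time": datetime(2024, 3, 1), "is_fraud": 1},
    {"txn_id": "c", "txn_time": datetime(2024, 6, 20), "is_fraud": 0},  # too recent
]
clean = mature_deduped_rows(rows, as_of=datetime(2024, 6, 30))
train, test = time_split(clean, split_time=datetime(2024, 2, 1))
print([r["txn_id"] for r in train], [r["txn_id"] for r in test])
```

If duplicates can carry conflicting labels, a different rule (for example, keeping the most recently updated label) would be needed. That choice depends on how the legacy pipelines wrote the rows, which the problem statement does not specify.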

Deliverables

Propose a full fraud modeling approach, including data cleaning and leakage prevention.
Build a binary classification pipeline for messy, imbalanced tabular data.
Define feature engineering steps that are realistic for real-time scoring.
Choose evaluation metrics and thresholding strategy for analyst review capacity.
Explain how you would validate, deploy, and monitor the model in production.
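
For the thresholding deliverable, one simple approach is to pick the score cutoff that keeps the expected number of flagged transactions within what analysts can review. The sketch below is only an illustration: the function name is an assumption, the capacity and volume figures in the example are placeholders rather than values from the problem statement, and it assumes validation scores are distributed like production scores.

```python
def threshold_for_capacity(validation_scores, daily_volume, daily_capacity):
    """Return the score cutoff whose flag rate matches review capacity.

    Transactions with score >= the returned threshold are flagged.
    Tied scores at the cutoff can push the flagged count slightly over capacity.
    """
    if not validation_scores or daily_volume <= 0:
        raise ValueError("need validation scores and a positive daily volume")
    flag_fraction = min(1.0, daily_capacity / daily_volume)
    k = int(len(validation_scores) * flag_fraction)
    if k == 0:
        return float("inf")  # capacity too small to flag anything
    ranked = sorted(validation_scores, reverse=True)
    return ranked[k - 1]


# Placeholder numbers: 1,000 transactions per day, 50 reviews per day.
scores = [i / 100 for i in range(100)]
threshold = threshold_for_capacity(scores, daily_volume=1000, daily_capacity=50)
flagged = sum(score >= threshold for score in scores)
print(threshold, flagged)
```

The resulting cutoff should then be checked against the precision and recall targets. If capacity alone forces precision below 20% or recall below 75%, the trade-off needs to be raised with the risk team rather than hidden in the threshold.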
