Replace Fraud Rules with ML

Business Context

PayFlow, a mid-size digital payments company, currently blocks suspicious card transactions using hard-coded rules such as amount > 1000 and country != billing_country. These rules are easy to explain but miss new fraud patterns and require constant manual updates. You need to design an ML-based fraud classifier and explain how it differs from a rule-based system in behavior, maintenance, and performance.
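To make the starting point concrete, here is a minimal sketch of what such a hard-coded rule engine might look like. The two conditions come from the description above; the `Transaction` class, the pairing of those conditions with the names `rule_high_amount` and `rule_geo_mismatch`, and the choice to block when any rule fires are assumptions made for illustration, not PayFlow's actual logic.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical minimal record; real transactions carry many more fields.
    amount: float
    country: str
    billing_country: str


def rule_flags(txn: Transaction, amount_limit: float = 1000.0) -> dict:
    """Evaluate each static rule and return its boolean output."""
    return {
        "rule_high_amount": txn.amount > amount_limit,
        "rule_geo_mismatch": txn.country != txn.billing_country,
    }


def is_blocked(txn: Transaction) -> bool:
    """Assumption: a transaction is blocked if any single rule fires."""
    return any(rule_flags(txn).values())


txn = Transaction(amount=1500.0, country="US", billing_country="US")
print(rule_flags(txn))  # {'rule_high_amount': True, 'rule_geo_mismatch': False}
print(is_blocked(txn))  # True
```

Every threshold and condition here is fixed in code, which is why the rules are easy to explain but need a manual edit whenever fraud patterns change.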

Dataset

You are given historical transaction-level data labeled after chargeback resolution.

Feature Group	Count	Examples
Transaction features	8	amount, currency, merchant_category, payment_method
User behavior	7	txn_count_24h, avg_amount_30d, account_age_days, failed_attempts_7d
Device / network	6	device_type, ip_country, is_proxy, browser_family
Rule outputs	4	rule_high_amount, rule_geo_mismatch, rule_velocity, rule_new_device
Target	1	is_fraud

Size: 420K transactions over 9 months, 25 input features
Target: Binary label indicating whether the transaction was confirmed fraudulent
Class balance: 2.9% fraud, 97.1% legitimate
Missing data: 12% missing in device fingerprint fields, 4% missing in historical behavior for new users
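One common way to handle the missing values described above is to impute a fill value and add a companion indicator column, so a model can learn from the fact that a field was absent (for example, a new user with no 30-day history). The sketch below assumes rows are plain dictionaries with `None` for missing values; the function name and the fill values are hypothetical.

```python
def impute_with_indicator(row: dict, fields: list, fill_values: dict) -> dict:
    """Return a copy of row with missing fields filled and <field>_missing flags added."""
    out = dict(row)
    for field in fields:
        missing = out.get(field) is None
        # Keep the "was missing" signal as its own binary feature.
        out[f"{field}_missing"] = int(missing)
        if missing:
            out[field] = fill_values[field]
    return out


# Hypothetical new user: no behavioral history yet, device known.
row = {"avg_amount_30d": None, "device_type": "mobile"}
fills = {"avg_amount_30d": 0.0, "device_type": "unknown"}
print(impute_with_indicator(row, ["avg_amount_30d", "device_type"], fills))
```

Whether 0.0, a median, or a sentinel is the right fill value depends on the model family; tree-based models are usually less sensitive to that choice than linear ones.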

Success Criteria

A good solution should outperform the current rule engine on fraud recall while keeping analyst review volume manageable. Target at least 75% recall with precision above 20%, and provide a clear comparison between static rules and a trained model.
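It helps to translate these targets into review volume. With 420K transactions at a 2.9% fraud rate there are about 12,180 fraud cases; catching 75% of them means 9,135 true positives, and at exactly 20% precision that implies 45,675 flagged transactions, or roughly 10.9% of all traffic over the nine months. The helper below is a small sketch of that arithmetic; the function name is an assumption.

```python
def review_volume(n_txn: int, fraud_rate: float, recall: float, precision: float) -> dict:
    """Translate recall and precision targets into counts of flagged transactions."""
    n_fraud = n_txn * fraud_rate
    true_pos = recall * n_fraud    # fraud cases the model catches
    flagged = true_pos / precision # precision = true_pos / flagged
    return {
        "fraud_cases": round(n_fraud),
        "true_positives": round(true_pos),
        "flagged": round(flagged),
        "false_positives": round(flagged - true_pos),
        "flag_rate": flagged / n_txn,
    }


print(review_volume(420_000, 0.029, recall=0.75, precision=0.20))
```

Higher precision at the same recall shrinks the review queue proportionally, which is the lever for keeping analyst workload manageable.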

Constraints

Inference latency must stay under 50 ms per transaction
Risk team needs feature importance and reason codes for flagged transactions
The model must be retrained at least monthly because fraud patterns drift
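The reason-code requirement can be met in several ways. As one illustration, assuming a linear (logistic) model, each feature's contribution to the score is its weight times its value, and the largest positive contributions can be reported as reasons. The weights, bias, and function name below are made up for the sketch and are not fitted to the dataset.

```python
import math


def score_with_reasons(features: dict, weights: dict, bias: float, top_k: int = 3):
    """Return (fraud probability, top positive contributors) for a linear model."""
    contributions = {
        name: weights[name] * value
        for name, value in features.items()
        if name in weights
    }
    logit = bias + sum(contributions.values())
    prob = 1.0 / (1.0 + math.exp(-logit))
    # Only features that push the score toward fraud count as reasons.
    positive = [name for name, c in contributions.items() if c > 0]
    reasons = sorted(positive, key=lambda n: contributions[n], reverse=True)[:top_k]
    return prob, reasons


# Hypothetical weights, for illustration only.
weights = {"is_proxy": 2.0, "txn_count_24h": 0.1, "account_age_days": -0.01}
features = {"is_proxy": 1, "txn_count_24h": 5, "account_age_days": 300}
prob, reasons = score_with_reasons(features, weights, bias=-1.0)
print(reasons)  # ['is_proxy', 'txn_count_24h']
```

For tree ensembles the same idea is usually implemented with per-prediction attribution methods rather than raw weights; a linear scorer like this one is also cheap enough that the 50 ms latency budget is unlikely to be the limiting factor.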

Deliverables

Build a baseline hard-coded rule system and report its metrics.
Train an ML classification model on the labeled dataset.
Compare rule-based vs ML approaches in terms of adaptability, maintenance, and error patterns.
Propose preprocessing and feature engineering for mixed data and missing values.
Recommend a deployment threshold based on business tradeoffs between fraud loss and false positives.
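For the threshold deliverable, one simple approach is to sweep candidate thresholds over validation scores and pick the one with the lowest expected cost. The sketch below assumes a fixed cost per missed fraud (`cost_fn`) and per false positive (`cost_fp`); both values, the function name, and the toy scores are hypothetical and would need to come from the business.

```python
def pick_threshold(scores, labels, cost_fn, cost_fp, min_recall=0.0):
    """Return (threshold, total_cost) minimizing cost, flagging when score >= threshold."""
    n_pos = sum(labels)
    best = None
    for t in sorted(set(scores)):
        flagged = [s >= t for s in scores]
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
        fn = n_pos - tp
        recall = tp / n_pos if n_pos else 1.0
        if recall < min_recall:
            continue  # enforce a recall floor such as the 75% target
        cost = fn * cost_fn + fp * cost_fp
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Toy validation set: a missed fraud costs 100, a false positive costs 5.
scores = [0.9, 0.8, 0.4, 0.2, 0.1]
labels = [1, 0, 1, 0, 0]
print(pick_threshold(scores, labels, cost_fn=100, cost_fp=5))  # (0.4, 5)
```

Because missed fraud is usually far more expensive than an analyst review, the cost-minimizing threshold tends to sit well below 0.5 on imbalanced data like this.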
