PayWave processes roughly 8 million card transactions per day. Fraud losses are rising, and the risk team needs a binary classifier that identifies fraudulent transactions while keeping false positives low enough to avoid blocking legitimate customers.
You are given a historical transaction-level dataset for supervised fraud detection. The features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Transaction attributes | 12 | amount, merchant_category, card_present, channel, currency |
| Customer behavior | 9 | avg_txn_7d, txn_count_24h, device_count_30d, chargeback_rate_90d |
| Merchant risk | 6 | merchant_country, merchant_risk_score, prior_fraud_rate |
| Device / network | 5 | device_id_hash, ip_country, vpn_flag, velocity_score |
| Time features | 4 | hour_of_day, day_of_week, days_since_first_seen |

Target variable: is_fraud (1 = fraudulent transaction, 0 = legitimate)

A good solution should achieve strong minority-class detection without relying on accuracy. Target PR-AUC above 0.35, recall above 75% at precision above 20%, and provide a thresholding strategy the fraud operations team can tune based on review capacity.
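
The evaluation targets and the capacity-based thresholding idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed implementation: the function names (`pr_points`, `average_precision`, `pick_threshold`, `capacity_threshold`) are hypothetical, the ten-row `y_true` / `scores` sample is made up for demonstration, and `scores` is assumed to hold fraud scores from any fitted classifier, where a transaction is flagged when its score is at or above the threshold.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]  # (threshold, precision, recall)


def pr_points(y_true: List[int], scores: List[float]) -> List[Point]:
    """One (threshold, precision, recall) point per distinct score, highest first."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one fraudulent example")
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    points, tp, fp = [], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after every item tied at this score is counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((score, tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points: List[Point]) -> float:
    """Step-wise area under the precision-recall curve (a common PR-AUC estimate)."""
    ap, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def pick_threshold(points: List[Point], min_precision: float = 0.20,
                   min_recall: float = 0.75) -> Optional[Point]:
    """Best point meeting both targets: highest recall, then highest precision."""
    ok = [p for p in points if p[1] >= min_precision and p[2] >= min_recall]
    return max(ok, key=lambda p: (p[2], p[1])) if ok else None


def capacity_threshold(scores: List[float], max_alerts: int) -> Optional[float]:
    """Lowest threshold that flags at most max_alerts transactions, else None."""
    ranked = sorted(scores, reverse=True)
    best = None
    for i, s in enumerate(ranked):
        group_end = i + 1 == len(ranked) or ranked[i + 1] != s
        if group_end and i + 1 <= max_alerts:
            best = s
    return best


# Toy data (assumed, for illustration only): 3 fraud cases among 10 transactions.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

points = pr_points(y_true, scores)
print("PR-AUC (average precision):", round(average_precision(points), 4))
print("Point meeting targets:", pick_threshold(points))
print("Threshold for 4 reviews:", capacity_threshold(scores, max_alerts=4))
```

In practice the two selection rules pull in different directions: `pick_threshold` finds an operating point that satisfies the stated precision and recall targets, while `capacity_threshold` caps the alert volume at whatever the review team can handle. Comparing the two on a held-out, time-ordered validation set shows whether the targets are reachable within current review capacity.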