Business Context
PayFlow processes roughly 12 million card transactions per day across web and mobile checkout. Fraud losses are rising, and the risk team needs a model that flags suspicious transactions in near real time without overwhelming manual reviewers or blocking too many legitimate payments.
Dataset
You are given a historical transaction dataset built from the last 9 months of activity.
| Feature Group | Count | Examples |
|---|---|---|
| Transaction attributes | 12 | amount, merchant_category, payment_method, currency, hour_of_day |
| Customer behavior | 10 | transactions_24h, avg_amount_30d, chargebacks_90d, device_count_7d |
| Device / network | 8 | device_id_hash, ip_country, vpn_flag, browser_family |
| Merchant signals | 6 | merchant_risk_score, refund_rate_30d, dispute_rate_90d |
| Derived velocity features | 9 | amount_zscore_user, cards_per_device_24h, failed_attempts_1h |
- Rows: 4.8M transactions, 45 features
- Target: is_fraud (1 = confirmed fraud/chargeback, 0 = legitimate)
- Class balance: 0.42% fraud, 99.58% non-fraud
- Missing data: 18% missing in merchant risk features for new merchants, 6% missing in device attributes, sparse high-cardinality categoricals
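To make the class balance concrete, the short calculation below converts the stated percentages into row counts and scores a hypothetical baseline that labels every transaction as legitimate. It uses only the figures listed above; the baseline is an illustrative assumption, not something provided with the dataset.

```python
# Figures taken from the dataset description.
n_rows = 4_800_000
fraud_rate = 0.0042

n_fraud = round(n_rows * fraud_rate)
n_legit = n_rows - n_fraud

# Hypothetical baseline: predict "legitimate" for every transaction,
# so every legitimate row is a true negative and no fraud is caught.
true_negatives = n_legit
true_positives = 0

accuracy = (true_positives + true_negatives) / n_rows
recall = true_positives / n_fraud

print(f"fraud rows:         {n_fraud:,}")
print(f"legitimate rows:    {n_legit:,}")
print(f"all-legit accuracy: {accuracy:.2%}")
print(f"all-legit recall:   {recall:.0%}")
```

The baseline reaches 99.58% accuracy while catching no fraud at all, which is the usual reason accuracy is a poor yardstick at this level of imbalance.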
Success Criteria
A strong solution should:
- achieve recall >= 75% on fraud cases,
- maintain precision >= 12% at the operating threshold,
- deliver PR-AUC >= 0.30 on the held-out test set,
- score each transaction in under 50 ms p95 for online inference.
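The recall and precision floors above both depend on a single operating threshold, so they can be checked together in one pass over ranked scores. The following is a minimal sketch, assuming the model emits a score where higher means more suspicious; the function name `pick_threshold` and the toy data are hypothetical and only illustrate the mechanics.

```python
def pick_threshold(scores, labels, min_recall=0.75, min_precision=0.12):
    """Return (threshold, precision, recall) for the highest threshold that
    meets both floors when flagging score >= threshold, or None if none does."""
    total_pos = sum(labels)
    if total_pos == 0:
        return None

    # Walk from the most suspicious score downward, accumulating counts.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Evaluate only once per distinct score, after all ties are counted.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if recall >= min_recall and precision >= min_precision:
            return score, precision, recall
    return None


# Toy validation data (hypothetical): 4 fraud cases among 10 transactions.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

result = pick_threshold(scores, labels)
print(result)
```

Returning the highest qualifying threshold keeps the alert volume as low as possible while still meeting both floors. A `None` result means the floors cannot be met together on that data, which is itself useful to report.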
Constraints
- False positives directly impact checkout conversion and customer trust.
- The fraud team can manually review at most 8,000 alerts/day.
- The solution must be explainable enough to support analyst review and model governance.
- Training can run offline daily; inference must support real-time API scoring.
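It is worth checking how the review capacity interacts with the recall and precision targets. The sketch below does the arithmetic under one loudly stated assumption: that the live fraud rate equals the 0.42% observed in the historical dataset. That may not hold if the 4.8M rows were sampled from the roughly 12 million daily transactions.

```python
daily_transactions = 12_000_000  # from the business context
fraud_rate = 0.0042              # ASSUMPTION: live rate matches the dataset
min_recall = 0.75                # success criterion
min_precision = 0.12             # success criterion
review_capacity = 8_000          # manual alerts per day

daily_fraud = daily_transactions * fraud_rate           # about 50,400
caught_fraud = daily_fraud * min_recall                 # about 37,800
alerts_at_min_precision = caught_fraud / min_precision  # about 315,000

# How far alert volume would exceed the manual review queue.
overload_factor = alerts_at_min_precision / review_capacity

# Upper bound on recall if only reviewed alerts counted and every
# reviewed alert were fraud (precision = 100%).
max_recall_via_review = review_capacity / daily_fraud

print(f"expected fraud per day:        {daily_fraud:,.0f}")
print(f"alerts at recall/precision:    {alerts_at_min_precision:,.0f}")
print(f"multiple of review capacity:   {overload_factor:.1f}x")
print(f"max recall from review alone:  {max_recall_via_review:.1%}")
```

Under that assumption, meeting the recall floor at the minimum allowed precision would generate about 315,000 alerts per day, roughly 39 times the review capacity, and 8,000 perfectly targeted reviews would cover only about 16% of daily fraud. One reading is that manual review can handle only a slice of the flagged transactions; if the dataset was downsampled, the live fraud rate and these totals would differ.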
Deliverables
- Propose a modeling approach for severe class imbalance.
- Build a training pipeline with preprocessing, feature handling, and threshold selection.
- Justify the evaluation strategy and why accuracy is not appropriate.
- Show how you would tune for recall/precision tradeoffs under review-capacity constraints.
- Describe how the model would be deployed, monitored, and retrained in production.
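For the review-capacity deliverable, one possible starting point is to derive the threshold from the alert budget rather than from a target metric. The sketch below is an assumption-laden illustration, not a prescribed solution: `capacity_threshold` is a hypothetical helper, scores are assumed to be "higher means more suspicious", and the input is assumed to be one representative day of model scores.

```python
def capacity_threshold(day_scores, max_alerts):
    """Lowest threshold such that flagging score >= threshold raises at most
    max_alerts alerts. Returns None if no threshold can satisfy the budget."""
    if max_alerts <= 0 or not day_scores:
        return None

    ranked = sorted(day_scores, reverse=True)
    if len(ranked) <= max_alerts:
        # Everything fits in the budget, so the lowest score is the cutoff.
        return ranked[-1]

    cutoff = ranked[max_alerts - 1]
    if ranked[max_alerts] == cutoff:
        # Ties at the cutoff would overflow the budget, so step up to the
        # next strictly higher score.
        higher = [s for s in ranked[:max_alerts] if s > cutoff]
        return min(higher) if higher else None
    return cutoff


# Toy example (hypothetical scores) with a budget of 3 alerts.
day_scores = [0.91, 0.15, 0.78, 0.78, 0.40, 0.05, 0.66]
threshold = capacity_threshold(day_scores, max_alerts=3)
n_alerts = sum(s >= threshold for s in day_scores)
print(threshold, n_alerts)
```

The precision and recall obtained at that capacity-driven threshold can then be compared against the stated success criteria to show where the two sets of requirements agree and where they pull apart.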