Business Context
PayLink processes roughly 8 million card-not-present transactions per month for small merchants. The risk team wants a fraud detection model that can score transactions in near real time despite noisy labels, inconsistent schemas across historical sources, and substantial missing data.
Dataset
| Feature Group | Count | Examples |
|---|---|---|
| Transaction attributes | 14 | amount, currency, merchant_category, payment_method, hour_of_day |
| Customer history | 11 | account_age_days, prior_chargebacks_90d, avg_ticket_30d, txn_count_7d |
| Device / network | 9 | device_type, ip_country, vpn_flag, browser_family, velocity_by_ip_1h |
| Merchant signals | 8 | merchant_risk_score, dispute_rate_30d, country, MCC |
| Data quality flags | 6 | missing_email_flag, inconsistent_country_flag, stale_profile_flag |
- Size: 2.4M historical transactions over 18 months, 48 candidate features
- Target: Binary. 1 = transaction confirmed fraudulent within 45 days, 0 = legitimate
- Class balance: 1.3% positive, 98.7% negative
- Missing data: 20% missing in device fields, 12% missing in customer history for new users, and inconsistent categorical values from legacy pipelines
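One way to handle the missing device fields, the sparse history of new users, and the inconsistent legacy categoricals is to canonicalize values and keep missingness as an explicit signal. The sketch below is illustrative only: the alias table, the missing-value tokens, and the `*_missing_flag` column names are assumptions, since the brief does not list the actual legacy values.

```python
from typing import Any, Dict, Optional

# Hypothetical alias table; the real legacy spellings are not given in the brief.
DEVICE_TYPE_ALIASES = {
    "mobile": "mobile", "mob": "mobile", "phone": "mobile",
    "desktop": "desktop", "pc": "desktop",
    "tablet": "tablet",
}
# Assumed set of strings that legacy pipelines used to mean "no value".
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "unknown"}


def normalize_category(raw: Optional[str], aliases: Dict[str, str]) -> Optional[str]:
    """Map a raw categorical value to a canonical label, or None if missing."""
    if raw is None:
        return None
    key = str(raw).strip().lower()
    if key in MISSING_TOKENS:
        return None
    # Unmapped values go to an explicit bucket instead of being dropped.
    return aliases.get(key, "other")


def clean_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with a canonical device_type and explicit missingness flags."""
    cleaned = dict(record)
    device = normalize_category(record.get("device_type"), DEVICE_TYPE_ALIASES)
    cleaned["device_type"] = device if device is not None else "missing"
    cleaned["device_type_missing_flag"] = int(device is None)
    # New users may have no history: impute 0 but record that the value was absent.
    history = record.get("txn_count_7d")
    cleaned["txn_count_7d_missing_flag"] = int(history is None)
    cleaned["txn_count_7d"] = 0 if history is None else history
    return cleaned
```

For example, `clean_record({"device_type": " Mob ", "txn_count_7d": None})` yields `device_type == "mobile"` with `txn_count_7d_missing_flag == 1`. Keeping the flag lets a tree-based model separate "no activity" from "no data", which matters when missingness itself correlates with risk.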
Success Criteria
A good solution should rank fraudulent transactions well ahead of legitimate ones and support an operational review queue. Targets: PR-AUC above 0.35, recall above 75% at precision above 20%, and lift above 6x in the top 1% of scored transactions.
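The three criteria can be computed from a ranked list of validation scores. The following is a minimal dependency-free sketch, assuming average precision as the PR-AUC estimator (the brief does not say which estimator is intended) and ignoring tied scores; the function names are illustrative.

```python
from typing import List, Sequence


def _ranked_labels(y_true: Sequence[int], scores: Sequence[float]) -> List[int]:
    """Labels sorted by descending score (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [y_true[i] for i in order]


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@rank taken at each positive; one common PR-AUC estimate."""
    ranked = _ranked_labels(y_true, scores)
    total_pos = sum(ranked)
    hits, ap = 0, 0.0
    for rank, label in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank
    return ap / total_pos


def recall_at_precision(y_true: Sequence[int], scores: Sequence[float],
                        min_precision: float) -> float:
    """Highest recall over all cutoffs whose precision is at least min_precision."""
    ranked = _ranked_labels(y_true, scores)
    total_pos = sum(ranked)
    hits, best = 0, 0.0
    for rank, label in enumerate(ranked, start=1):
        hits += label
        if hits / rank >= min_precision:
            best = max(best, hits / total_pos)
    return best


def lift_at_top_fraction(y_true: Sequence[int], scores: Sequence[float],
                         fraction: float) -> float:
    """Precision in the top `fraction` of scores divided by overall prevalence."""
    ranked = _ranked_labels(y_true, scores)
    k = max(1, int(len(ranked) * fraction))
    precision_top = sum(ranked[:k]) / k
    prevalence = sum(ranked) / len(ranked)
    return precision_top / prevalence


# Toy data: 2 positives among 10 rows, scores already in descending order.
y = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
ap = average_precision(y, s)              # (1/1 + 2/3) / 2
rec = recall_at_precision(y, s, 0.6)      # both positives found at precision 2/3
lift = lift_at_top_fraction(y, s, 0.2)    # 0.5 / 0.2
```

As a reference point, with 1.3% positives a random ranker has an expected PR-AUC of about 0.013, and a 6x lift in the top 1% corresponds to a precision of roughly 6 × 0.013 ≈ 7.8% in that slice.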
Constraints
- Inference latency must stay under 50 ms per transaction.
- The model must be explainable enough for analysts to review flagged payments.
- Historical data is messy: duplicate rows, delayed fraud labels, and schema drift across months.
- Retraining budget supports a weekly batch retrain, not continuous online learning.
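Duplicate rows and delayed labels both affect how a weekly training set should be assembled. The sketch below shows one possible approach: drop duplicates by transaction id, exclude rows whose 45-day label window has not yet closed, then split by time rather than at random. The field names `txn_id` and `txn_time`, and the choice to keep the last duplicate, are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
LABEL_MATURITY = timedelta(days=45)  # label window from the target definition


def dedupe(rows: Iterable[Row], key: str = "txn_id") -> List[Row]:
    """Keep the last occurrence of each id (assumed to be the most recent record)."""
    latest: Dict[Any, Row] = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


def temporal_split(rows: Iterable[Row], train_end: datetime,
                   as_of: datetime) -> Tuple[List[Row], List[Row]]:
    """Split by time, using only rows whose labels have matured by `as_of`."""
    train: List[Row] = []
    valid: List[Row] = []
    for row in rows:
        ts = row["txn_time"]
        if ts + LABEL_MATURITY > as_of:
            continue  # label may still flip to fraud; exclude from both sets
        if ts < train_end:
            train.append(row)
        else:
            valid.append(row)
    return train, valid
```

Splitting by time keeps validation transactions strictly later than training ones, which mirrors how a weekly retrained model would actually be used and avoids leaking future behavior into training features.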
Deliverables
- Propose a full fraud modeling approach, including data cleaning and leakage prevention.
- Build a binary classification pipeline for messy, imbalanced tabular data.
- Define feature engineering steps that are realistic for real-time scoring.
- Choose evaluation metrics and thresholding strategy for analyst review capacity.
- Explain how you would validate, deploy, and monitor the model in production.
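For the thresholding deliverable, one simple starting point is to derive the score cutoff from how many transactions analysts can review per day. The sketch below assumes a validation score sample that is representative of live traffic; the capacity figure is a placeholder, since the brief does not state analyst headcount or throughput.

```python
from typing import Sequence


def threshold_for_capacity(scores: Sequence[float], daily_volume: int,
                           daily_capacity: int) -> float:
    """Score cutoff that flags about daily_capacity out of daily_volume transactions.

    A transaction is flagged when its score is >= the returned threshold.
    Tied scores at the cutoff can push the flagged count slightly above capacity.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if daily_volume <= 0 or daily_capacity <= 0:
        raise ValueError("daily_volume and daily_capacity must be positive")
    # Number of sample rows corresponding to the reviewable share (integer math).
    k = len(scores) * daily_capacity // daily_volume
    k = max(1, min(len(scores), k))
    return sorted(scores, reverse=True)[k - 1]


# Illustrative sample: 100 evenly spaced scores, capacity for 5% of traffic.
sample_scores = [i / 100 for i in range(100)]
cutoff = threshold_for_capacity(sample_scores, daily_volume=1000, daily_capacity=50)
flagged = sum(score >= cutoff for score in sample_scores)
```

At roughly 8 million transactions per month, daily volume is on the order of 267,000 (assuming a 30-day month), so the top 1% alone is about 2,670 transactions per day. A capacity-based cutoff like this should then be checked against the precision and recall targets on matured validation data.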