You’re on the Risk Modeling team at LendFlow, a fintech lender offering instant point-of-sale loans for e-commerce checkouts. The model’s prediction is used to (a) approve/decline applications and (b) set credit limits. LendFlow processes ~2.5M applications/month across the US and EU, and a 10 bps degradation in default rate translates to ~$4–6M/year in charge-offs. Regulators and internal audit require that decisions are explainable and that the model is monitored for drift and bias.
Your current baseline is a tuned logistic regression. Leadership wants you to evaluate whether a Deep Neural Network (DNN) can outperform XGBoost on this tabular dataset, and if so, whether the operational and compliance trade-offs are worth it.
You have 12 months of historical applications with outcomes observed over a 90-day window.
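
Because outcomes take 90 days to observe, the most recent applications in the 12-month history do not yet have a final label. One way to handle this, sketched below under stated assumptions, is to keep only applications whose full outcome window has elapsed and then hold out the most recent mature applications for out-of-time evaluation. The record schema (`application_date`), the `snapshot_date` argument, and the 60-day holdout length are all hypothetical choices for illustration, not part of the problem statement.

```python
from datetime import date, timedelta

OUTCOME_WINDOW_DAYS = 90  # outcome window given in the problem statement


def has_mature_label(application_date: date, snapshot_date: date) -> bool:
    """True if the full 90-day outcome window has elapsed by snapshot_date."""
    return application_date + timedelta(days=OUTCOME_WINDOW_DAYS) <= snapshot_date


def out_of_time_split(applications, snapshot_date, holdout_days=60):
    """Split mature applications into (train, holdout) by application date.

    `applications` is an iterable of dicts with an "application_date" key
    (hypothetical schema). The holdout is the most recent `holdout_days`
    of mature applications; 60 is an arbitrary illustrative value.
    """
    mature = [a for a in applications
              if has_mature_label(a["application_date"], snapshot_date)]
    if not mature:
        return [], []
    latest = max(a["application_date"] for a in mature)
    cutoff = latest - timedelta(days=holdout_days)
    train = [a for a in mature if a["application_date"] <= cutoff]
    holdout = [a for a in mature if a["application_date"] > cutoff]
    return train, holdout
```

Splitting by time rather than at random is a design choice that tends to give a more honest estimate when distributions shift, at the cost of a smaller training set.
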
| Feature Group | Count | Examples | Notes |
|---|---|---|---|
| Applicant demographics | 8 | age_bucket, region, employment_status | Some features restricted in EU; must support feature gating |
| Credit bureau aggregates | 22 | tradelines_count, utilization_pct, delinquencies_12m | Strong predictors; missing for ~6% (thin-file) |
| Transaction & bank-link | 18 | avg_balance_30d, inflow_std_90d, nsf_events_90d | Missing for ~35% (user didn’t link bank) |
| Merchant & product | 10 | merchant_category, cart_amount, sku_risk_score | High-cardinality categorical |
| Device & fraud signals | 12 | device_age_days, ip_risk_score, velocity_1h | Noisy; distribution shifts during promos |
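
The notes in the table imply two preprocessing concerns: some features must be gated by jurisdiction, and missingness in the bureau and bank-link groups is structural (thin-file, no linked bank) rather than random. A minimal sketch of both follows. The gating policy is an assumption: the source does not say which features are restricted in the EU, so `age_bucket` is only a placeholder. The flag names and the choice of one representative column per group are likewise hypothetical.

```python
import math

# Hypothetical gating policy; the actual restricted list is not in the source.
RESTRICTED_FEATURES = {
    "EU": {"age_bucket"},
    "US": set(),
}

# Missingness flag -> representative column of the affected feature group.
MISSINGNESS_FLAGS = {
    "bureau_missing": "utilization_pct",     # ~6% thin-file applicants
    "bank_link_missing": "avg_balance_30d",  # ~35% did not link a bank
}


def is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prepare_row(row: dict, jurisdiction: str) -> dict:
    """Return a copy of `row` with gated features masked and flags added.

    Gated features are set to None instead of dropped, so the schema stays
    identical across jurisdictions. An unknown jurisdiction raises KeyError
    rather than silently passing restricted features through.
    """
    gated = RESTRICTED_FEATURES[jurisdiction]
    out = {k: (None if k in gated else v) for k, v in row.items()}
    for flag, column in MISSINGNESS_FLAGS.items():
        out[flag] = int(is_missing(row.get(column)))
    return out
```

Explicit flags let any model family, including a DNN that cannot consume missing values natively, distinguish "no bank link" from a genuinely low balance.
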
Target: default_90d = 1 if the loan is charged off or reaches 90+ days past due (DPD) within 90 days, else 0.