Business Context
LendWise uses an ML model to flag potentially fraudulent loan applications before approval. Fraud is rare, so the risk team wants to understand how Bayes' theorem and conditional probability translate a model alert into an actual probability of fraud.
Problem Statement
A binary classifier labels applications as Flagged or Not Flagged. You need to compute the posterior probability that an application is truly fraudulent given that the model flagged it, and explain how this would be used in a practical ML decision pipeline.
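For reference, the governing identity is Bayes' theorem over the Fraud / Not Fraud partition, with the overall flag rate supplied by the law of total probability:

```latex
P(\mathrm{Fraud} \mid \mathrm{Flagged})
  = \frac{P(\mathrm{Flagged} \mid \mathrm{Fraud})\, P(\mathrm{Fraud})}{P(\mathrm{Flagged})},
\qquad
P(\mathrm{Flagged})
  = P(\mathrm{Flagged} \mid \mathrm{Fraud})\, P(\mathrm{Fraud})
  + P(\mathrm{Flagged} \mid \mathrm{Not\ Fraud})\, P(\mathrm{Not\ Fraud})
```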
Given Data
| Metric | Value |
|---|---|
| Daily applications | 100,000 |
| Base fraud rate | 0.8% |
| Model sensitivity: P(Flagged∣Fraud) | 92% |
| Model false positive rate: P(Flagged∣Not Fraud) | 4.5% |
| Manual review cost per flagged application | $3.20 |
| Expected loss if fraud is approved | $1,800 |
| Escalation threshold (minimum posterior fraud probability) | 20% |
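A minimal sketch of how these inputs combine (variable names are illustrative, not part of any LendWise system): the flag rate comes from the law of total probability, and the posterior from Bayes' theorem.

```python
# Given values from the table above.
p_fraud = 0.008              # base fraud rate, P(Fraud)
sensitivity = 0.92           # P(Flagged | Fraud)
false_positive_rate = 0.045  # P(Flagged | Not Fraud)

# Law of total probability over the Fraud / Not Fraud partition.
p_flagged = sensitivity * p_fraud + false_positive_rate * (1 - p_fraud)

# Bayes' theorem: posterior probability of fraud given a flag.
p_fraud_given_flagged = sensitivity * p_fraud / p_flagged

print(f"P(Flagged) = {p_flagged:.4f}")                      # 0.0520
print(f"P(Fraud | Flagged) = {p_fraud_given_flagged:.4f}")  # 0.1415
```

Under the stated values, only about 14% of flagged applications are actually fraudulent despite 92% sensitivity, which is the class-imbalance effect the requirements below examine.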
Requirements
- State Bayes' theorem in this setting.
- Compute the probability an application is flagged: P(Flagged).
- Compute the posterior probability of fraud given a flag: P(Fraud∣Flagged).
- Calculate the expected number of flagged applications and true fraud cases among them per day (see the sketch after this list).
- Decide whether a flagged application should automatically go to manual review if the escalation threshold is 20% posterior fraud probability.
- Briefly explain how conditional probability should influence threshold selection in an ML system with class imbalance.
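The sketch below reuses the posterior from the earlier snippet to address the daily-volume and escalation items; the expected-cost comparison at the end is one reasonable way to stress-test the 20% threshold, not a prescribed LendWise policy, and all names are illustrative.

```python
# Given values.
daily_applications = 100_000
p_fraud = 0.008              # base fraud rate, P(Fraud)
sensitivity = 0.92           # P(Flagged | Fraud)
false_positive_rate = 0.045  # P(Flagged | Not Fraud)
review_cost = 3.20           # manual review cost per flagged application ($)
loss_if_approved = 1_800.00  # expected loss if a fraudulent application is approved ($)
escalation_threshold = 0.20  # required posterior fraud probability for escalation

# Flag rate and posterior, as in the earlier sketch.
p_flagged = sensitivity * p_fraud + false_positive_rate * (1 - p_fraud)
p_fraud_given_flagged = sensitivity * p_fraud / p_flagged

# Expected daily volumes: roughly 5,200 flags, about 736 of which are true fraud.
expected_flags = daily_applications * p_flagged
expected_fraud_among_flags = daily_applications * sensitivity * p_fraud

# Threshold rule: the posterior (~14.2%) falls short of 20%, so a flag
# alone would not trigger automatic escalation under the stated rule.
auto_escalate = p_fraud_given_flagged >= escalation_threshold

# Expected-cost view: skipping review risks posterior * $1,800 (~$255)
# per flag, versus a $3.20 review cost, assuming review blocks all fraud.
expected_loss_without_review = p_fraud_given_flagged * loss_if_approved

print(f"Expected flags per day: {expected_flags:.0f}")
print(f"Expected true fraud among flags: {expected_fraud_among_flags:.0f}")
print(f"Auto-escalate under 20% rule: {auto_escalate}")
print(f"Expected loss if unreviewed: ${expected_loss_without_review:.2f} vs review cost ${review_cost:.2f}")
```

The gap between the 20% posterior bar and the break-even posterior implied by the costs (3.20 / 1,800, roughly 0.18%) is one concrete way to frame the final requirement: under heavy class imbalance, a threshold chosen on raw posterior alone can sit far from the cost-optimal operating point.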
Assumptions
- The sensitivity and false positive rate are stable and estimated from recent holdout data.
- Applications are independent of one another.
- Fraud prevalence remains at 0.8% during deployment.
- Manual review perfectly blocks fraud once escalated.