Explain Accuracy vs False Positives

Context

ShopShield runs a binary classification model that flags suspicious e-commerce orders for manual review before fulfillment. The customer success team says enterprise merchants keep focusing on the model's 96.2% accuracy, while complaining that too many legitimate orders are being flagged and delayed.

Current Performance

Metric	Current Threshold	Previous Threshold
Accuracy	96.2%	97.1%
Precision	41.7%	55.4%
Recall	78.1%	61.3%
F1 Score	54.3%	58.0%
False Positive Rate	3.4%	1.8%
Orders flagged for review	3,000 / 50,000	1,850 / 50,000

Confusion Matrix Snapshot

	Predicted Fraud	Predicted Legitimate
Actual Fraud	625	175
Actual Legitimate	2,375	46,825
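
As a starting point for the explanation, the snapshot counts can be turned back into rates. The sketch below is a minimal illustration in plain Python; the variable names are hypothetical and only the four counts come from the snapshot. It also computes the accuracy of a trivial "flag nothing" baseline, which shows why accuracy says little when fraud is rare. Note that recall (about 78.1%) and the flagged total (3,000) reproduce the Current Performance table, whereas accuracy, precision, and false positive rate computed from these counts do not match the tabulated figures. The sketch does not assume which source is authoritative, so that discrepancy is worth raising before quoting any number to a merchant.

```python
# Counts copied from the Confusion Matrix Snapshot (one day of orders).
tp, fn = 625, 175        # actual fraud: flagged / missed
fp, tn = 2_375, 46_825   # actual legitimate: flagged / cleared

total = tp + fn + fp + tn      # all orders scored that day
flagged = tp + fp              # orders sent to manual review

metrics = {
    # Share of all orders classified correctly; dominated by the
    # very large true-negative count.
    "accuracy": (tp + tn) / total,
    # Of the flagged orders, the share that is actually fraud.
    "precision": tp / flagged,
    # Of the actual fraud, the share that gets flagged.
    "recall": tp / (tp + fn),
    # Of the legitimate orders, the share that gets flagged.
    "false_positive_rate": fp / (fp + tn),
    # Accuracy of a "model" that never flags anything: it is right
    # on every legitimate order and wrong on every fraudulent one.
    "flag_nothing_accuracy": (fp + tn) / total,
}

for name, value in metrics.items():
    print(f"{name:>22}: {value:.3f}")
print(f"{'flagged orders':>22}: {flagged} of {total}")
```

The baseline line is the customer-friendly point: a system that flags no orders at all would still be correct on every legitimate order, so a high accuracy figure alone cannot tell a merchant how many good orders are delayed or how much fraud is caught.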

The Problem

A large merchant asks: "If your model is over 96% accurate, why are 2,375 good orders being flagged?" You need to explain this clearly, quantify the tradeoff, and recommend whether the current threshold is appropriate.

Requirements

Explain why high accuracy can coexist with a large number of false positives.
Interpret the confusion matrix in customer-friendly business terms.
Compare the current threshold with the previous threshold and describe the tradeoff.
Recommend the right primary metric for this use case and explain how you would communicate it.
Suggest concrete steps to reduce false positives without sharply increasing missed fraud.

Constraints

Fraud prevalence is low: 800 fraudulent orders out of 50,000 daily orders.
Each false positive delays shipment and costs the merchant about $4 in support and handling.
Each false negative costs about $120 in fraud loss.
Manual review capacity is capped at 3,200 orders per day.
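
The constraints above are enough to put a rough daily dollar figure on each threshold. The following is a sketch under stated assumptions, not a definitive cost model. The current-threshold counts come from the confusion matrix. The previous-threshold counts are not given in the source, so they are reconstructed here from the stated 61.3% recall and the 1,850 flagged orders, and should be treated as an estimate. The function and constant names are hypothetical, and only the two per-order costs listed in the constraints are counted.

```python
FP_COST = 4               # dollars per legitimate order flagged
FN_COST = 120             # dollars per fraudulent order missed
REVIEW_CAPACITY = 3_200   # manual reviews available per day
DAILY_FRAUD = 800         # fraudulent orders per day


def daily_cost(false_positives: int, false_negatives: int) -> int:
    """Daily cost in dollars from the two error types."""
    return false_positives * FP_COST + false_negatives * FN_COST


# Current threshold: counts taken directly from the confusion matrix.
current = {"fp": 2_375, "fn": 175, "flagged": 3_000}

# Previous threshold: ASSUMPTION. True positives are estimated as
# recall * daily fraud; the rest of the flagged orders are treated as
# false positives. These counts are not stated in the source.
prev_tp = round(0.613 * DAILY_FRAUD)
previous = {
    "fp": 1_850 - prev_tp,
    "fn": DAILY_FRAUD - prev_tp,
    "flagged": 1_850,
}

for name, counts in (("current", current), ("previous", previous)):
    cost = daily_cost(counts["fp"], counts["fn"])
    within_capacity = counts["flagged"] <= REVIEW_CAPACITY
    print(f"{name}: ${cost:,} per day, "
          f"review queue within capacity: {within_capacity}")

# One missed fraud costs as much as this many false positives.
break_even = FN_COST // FP_COST
print(f"1 false negative costs the same as {break_even} false positives")
```

Under these assumptions the current threshold costs $30,500 per day against roughly $42,640 for the previous one, and both stay within review capacity. The current threshold leaves only 200 reviews of headroom, so any further loosening needs to be checked against the 3,200 cap.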
