Triage Noisy Insurance Claims Risk

Business Context

Northstar Mutual wants a model to flag potentially high-risk auto insurance claims for manual review before payout. The data is ambiguous: several fields are incomplete at claim creation, some features are weak proxies for downstream outcomes, and label quality is imperfect because fraud investigations are not always conclusive.

Dataset

You are given a historical claims dataset collected over 24 months.

Feature Group	Count	Examples
Claim attributes	12	claim_amount, incident_type, injury_reported, police_report_filed
Policyholder profile	10	policy_tenure_months, prior_claim_count, vehicle_age, region
Behavioral / process	8	time_to_report_hours, document_count_48h, channel_submitted, adjuster_reassignments
Derived temporal	6	claim_created_month, weekend_flag, holiday_flag, days_since_policy_change
External / noisy	4	weather_severity_score, repair_shop_risk_band, geocode_risk_score, credit_band

Size: 148K claims, 40 candidate features
Target: Binary label, 1 if the claim was later confirmed as high-risk for manual escalation, 0 for standard processing
Class balance: 11.6% positive, 88.4% negative
Missing data: 18% missing in external/vendor fields, 9% missing in behavioral fields, and some categorical levels drift over time
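
The note that some categorical levels drift over time can be made measurable. The sketch below computes a population stability index (PSI) for one categorical feature between a reference window and a more recent window. The function name, the eps floor, and the example values for channel_submitted are assumptions for illustration and are not part of the dataset description.

```python
import math
from collections import Counter
from typing import Iterable


def categorical_psi(reference: Iterable[str], current: Iterable[str],
                    eps: float = 1e-4) -> float:
    """Population stability index between two samples of a categorical feature."""
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    n_ref = sum(ref_counts.values())
    n_cur = sum(cur_counts.values())
    if n_ref == 0 or n_cur == 0:
        raise ValueError("both windows must be non-empty")

    psi = 0.0
    for level in set(ref_counts) | set(cur_counts):
        # Floor each share at eps so a level unseen in one window
        # does not produce log(0) or a division by zero.
        p_ref = max(ref_counts[level] / n_ref, eps)
        p_cur = max(cur_counts[level] / n_cur, eps)
        psi += (p_cur - p_ref) * math.log(p_cur / p_ref)
    return psi


# Illustrative (made-up) values of channel_submitted in two windows.
early_window = ["web"] * 80 + ["phone"] * 20
late_window = ["web"] * 50 + ["phone"] * 30 + ["app"] * 20
drift_score = categorical_psi(early_window, late_window)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a large shift, but that cutoff is a heuristic and should be calibrated against how much month-to-month variation is normal for each feature.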

Success Criteria

A good solution should achieve strong ranking quality while remaining explainable enough for operations and compliance teams. Target PR-AUC >= 0.42, ROC-AUC >= 0.80, and precision >= 0.45 at 70% recall on a held-out time-based test set.
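
To make the third target concrete, "precision at 70% recall" can be read as the precision obtained at the highest score threshold that still captures at least 70% of the positive claims. The sketch below uses only the standard library and is one reasonable reading of the criteria, not a prescribed evaluation script; the function names and the toy labels and scores are assumptions. The same precision-recall points also give a step-wise average precision, which is a common way to estimate PR-AUC.

```python
from typing import Iterator, Sequence, Tuple


def pr_points(y_true: Sequence[int],
              scores: Sequence[float]) -> Iterator[Tuple[float, float, float]]:
    """Yield (precision, recall, threshold) for each distinct score, high to low."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("no positive labels")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = fp = i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every claim tied at this score so the threshold is well defined.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        yield tp / (tp + fp), tp / total_pos, threshold


def precision_at_recall(y_true: Sequence[int], scores: Sequence[float],
                        target_recall: float = 0.70) -> Tuple[float, float]:
    """Return (precision, threshold) at the highest threshold reaching target_recall."""
    for precision, recall, threshold in pr_points(y_true, scores):
        if recall >= target_recall:
            return precision, threshold
    raise ValueError("target_recall must be at most 1.0")


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall, _ in pr_points(y_true, scores):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Toy data for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
precision, threshold = precision_at_recall(labels, model_scores, 0.70)
pr_auc = average_precision(labels, model_scores)
```

The threshold returned here would be chosen on a validation window and then frozen before it is applied to the held-out time-based test set, so that the reported precision is not optimistically tuned on test data.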

Constraints

Predictions must be generated in under 50 ms per claim in an online API.
The final model must be explainable at both global and per-claim levels.
Avoid data leakage from post-claim investigation fields.
Retraining is allowed monthly; feature computation must be reproducible in batch and online settings.
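
One way to satisfy the leakage and reproducibility constraints together is to route both the batch training job and the online API through a single pure feature function that refuses post-claim fields. The sketch below assumes that approach. The names in POST_CLAIM_FIELDS are hypothetical placeholders, since the dataset description does not list the investigation fields, and the choice of three numeric fields is illustrative only.

```python
from typing import Any, Dict, Mapping

# Hypothetical names for fields that only exist after an investigation;
# replace with the real post-claim columns from the data dictionary.
POST_CLAIM_FIELDS = frozenset({"investigation_outcome", "siu_referral_flag"})

# A small illustrative subset of the 40 candidate features.
NUMERIC_FIELDS = ("claim_amount", "time_to_report_hours", "weather_severity_score")


def build_features(raw_claim: Mapping[str, Any]) -> Dict[str, float]:
    """Compute model inputs from a raw claim record, identically in batch and online."""
    leaked = POST_CLAIM_FIELDS & set(raw_claim)
    if leaked:
        raise ValueError(f"post-claim fields present at scoring time: {sorted(leaked)}")

    features: Dict[str, float] = {}
    for name in NUMERIC_FIELDS:
        value = raw_claim.get(name)
        is_missing = value is None
        # Keep missingness as its own signal instead of silently imputing it away.
        features[name] = 0.0 if is_missing else float(value)
        features[f"{name}_missing"] = 1.0 if is_missing else 0.0
    return features


example = build_features({"claim_amount": 1200, "time_to_report_hours": 5})
```

Because the function depends only on its input record and has no hidden state, the same code path can be unit tested once and reused by the monthly retraining job and the online scorer.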

Deliverables

Define a feature selection strategy for ambiguous, partially noisy data.
Build at least one baseline and one stronger production candidate model.
Design a validation plan that accounts for temporal drift and ambiguous labels.
Report metrics, threshold selection logic, and feature importance.
Explain how you would detect overfitting, leakage, and unstable features before deployment.
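
A validation plan that respects temporal drift usually trains on earlier months and validates on later ones. The sketch below generates expanding-window (rolling-origin) folds from a per-claim month index. It is a minimal illustration under stated assumptions: the function name, the default window sizes, and the one-month gap (intended to give ambiguous labels time to settle) are all choices made here, not requirements from the problem statement.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_origin_splits(
    months: Sequence[int],
    min_train_months: int = 12,
    val_months: int = 2,
    gap_months: int = 1,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) with every validation claim after every training claim.

    `months` holds one integer month index per claim (for example 0..23).
    """
    first, last = min(months), max(months)
    train_end = first + min_train_months - 1
    while True:
        val_start = train_end + gap_months + 1
        val_end = val_start + val_months - 1
        if val_end > last:
            break
        train_idx = [i for i, m in enumerate(months) if m <= train_end]
        val_idx = [i for i, m in enumerate(months) if val_start <= m <= val_end]
        yield train_idx, val_idx
        # Expand the training window by one validation block for the next fold.
        train_end += val_months


# Toy input: one claim per month over 24 months.
claim_months = list(range(24))
folds = list(rolling_origin_splits(claim_months))
```

In practice the final months would be withheld entirely as the held-out test set, so only the development months would be passed to this function. Comparing metrics and feature importance across folds is one way to spot unstable features and overfitting before deployment.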
