Northstar Mutual wants a model to flag potentially high-risk auto insurance claims for manual review before payout. The data is ambiguous: several fields are incomplete at claim creation, some features are weak proxies for downstream outcomes, and label quality is imperfect because fraud investigations are not always conclusive.
You are given a historical claims dataset collected over 24 months. It contains 40 features in five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Claim attributes | 12 | claim_amount, incident_type, injury_reported, police_report_filed |
| Policyholder profile | 10 | policy_tenure_months, prior_claim_count, vehicle_age, region |
| Behavioral / process | 8 | time_to_report_hours, document_count_48h, channel_submitted, adjuster_reassignments |
| Derived temporal | 6 | claim_created_month, weekend_flag, holiday_flag, days_since_policy_change |
| External / noisy | 4 | weather_severity_score, repair_shop_risk_band, geocode_risk_score, credit_band |

A good solution should rank claims well while remaining explainable enough for operations and compliance teams. Targets, all measured on a held-out time-based test set: PR-AUC >= 0.42, ROC-AUC >= 0.80, and precision >= 0.45 at 70% recall.
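
The evaluation targets above can be checked with a small amount of code. The following is a minimal sketch, not a prescribed implementation, and it makes several assumptions: PR-AUC is estimated as average precision, "precision at 70% recall" means the best precision among thresholds that reach at least 70% recall, the split uses a single cutoff date, and the field name `claim_created_at` and all function names are hypothetical. It uses only the standard library and expects at least one positive and one negative label.

```python
from datetime import date

# Thresholds taken from the brief.
TARGETS = {"pr_auc": 0.42, "roc_auc": 0.80, "precision_at_70_recall": 0.45}


def time_based_split(rows, date_key, cutoff):
    """Train on rows strictly before the cutoff, test on rows on or after it."""
    train = [r for r in rows if r[date_key] < cutoff]
    test = [r for r in rows if r[date_key] >= cutoff]
    return train, test


def _pr_points(y_true, scores):
    """(precision, recall) at each distinct score threshold, highest score first."""
    order = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    total_pos = sum(y_true)
    points, tp, fp = [], 0, 0
    for i, (score, label) in enumerate(order):
        tp += label
        fp += 1 - label
        # Emit a point only after all claims tied at this score are counted.
        if i + 1 == len(order) or order[i + 1][0] != score:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def pr_auc(y_true, scores):
    """Average precision: precision weighted by each increase in recall."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall in _pr_points(y_true, scores):
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5.

    Quadratic in the number of claims, which is fine for a sketch only.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_recall(y_true, scores, min_recall=0.70):
    """Best precision among thresholds whose recall is at least min_recall."""
    return max(p for p, r in _pr_points(y_true, scores) if r >= min_recall)


def evaluate(y_true, scores):
    """Return {metric: (value, meets_target)} for the three targets."""
    values = {
        "pr_auc": pr_auc(y_true, scores),
        "roc_auc": roc_auc(y_true, scores),
        "precision_at_70_recall": precision_at_recall(y_true, scores, 0.70),
    }
    return {name: (v, v >= TARGETS[name]) for name, v in values.items()}


if __name__ == "__main__":
    # Toy rows; labels are 0/1 and scores are model outputs on the test period.
    rows = [
        {"claim_created_at": date(2023, 3, 1), "label": 0, "score": 0.2},
        {"claim_created_at": date(2024, 5, 1), "label": 1, "score": 0.9},
        {"claim_created_at": date(2024, 6, 1), "label": 0, "score": 0.4},
    ]
    _, test = time_based_split(rows, "claim_created_at", date(2024, 1, 1))
    print(evaluate([r["label"] for r in test], [r["score"] for r in test]))
```

Splitting by date rather than at random keeps later claims out of training, which matches the time-based test set the targets refer to. In practice a library such as scikit-learn offers equivalent metric functions; the hand-written versions here only make the definitions explicit.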