ShieldSure, a mid-sized digital insurer with 420K active auto and home policies, wants to predict which customers will churn at renewal so the retention team can intervene before the policy lapses. The model will be used in a weekly batch workflow to prioritize outbound offers and agent follow-up.
The training data is built at the policy-renewal opportunity level. Each row represents a policy as of a snapshot taken 45 days before its renewal date, with features aggregated over the 12 months preceding that snapshot.
| Feature Group | Count | Examples |
|---|---|---|
| Policy & pricing | 12 | premium_amount, premium_change_pct, deductible, coverage_type |
| Customer profile | 9 | tenure_months, age_band, state, bundled_products |
| Claims history | 8 | claim_count_12m, total_claim_cost_12m, recent_claim_flag |
| Billing & payment | 7 | autopay_flag, late_payment_count, payment_method |
| Engagement | 6 | app_logins_90d, email_open_rate, agent_contacts_180d |
| Service interactions | 5 | complaint_count, call_center_contacts, NPS_bucket |
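The row definition (a snapshot 45 days before renewal, with features drawn from the prior 12 months) can be sketched as follows. This is a minimal illustration, not the production pipeline: the helper names `snapshot_date` and `claim_count_12m`, the 365-day approximation of "12 months", and the sample policy are all assumptions made for the example. The point it shows is that events on or after the snapshot date are excluded, so a feature only uses information that would be available when the policy is scored.

```python
from datetime import date, timedelta
from typing import Sequence

SNAPSHOT_LEAD_DAYS = 45  # from the brief: rows are taken 45 days before renewal
LOOKBACK_DAYS = 365      # assumption: "prior 12 months" approximated as 365 days


def snapshot_date(renewal_date: date) -> date:
    """Date on which the feature row for this renewal is frozen."""
    return renewal_date - timedelta(days=SNAPSHOT_LEAD_DAYS)


def claim_count_12m(claim_dates: Sequence[date], renewal_date: date) -> int:
    """Count claims inside the lookback window that ends at the snapshot date.

    Claims dated on or after the snapshot are ignored so the feature
    cannot leak information from the outreach window.
    """
    snap = snapshot_date(renewal_date)
    window_start = snap - timedelta(days=LOOKBACK_DAYS)
    return sum(window_start <= d < snap for d in claim_dates)


# Hypothetical policy renewing on 2024-06-30 with three claims on file.
renewal = date(2024, 6, 30)
claims = [date(2023, 3, 1), date(2023, 11, 20), date(2024, 6, 1)]

print(snapshot_date(renewal))            # 2024-05-16
print(claim_count_12m(claims, renewal))  # 1
```

Only the November 2023 claim is counted: the March 2023 claim falls before the lookback window, and the June 2024 claim occurs after the snapshot. Shorter-window features such as `app_logins_90d` would follow the same pattern with a smaller lookback.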
A good solution should identify high-risk policies early enough for retention outreach, rank policies well by churn risk, and remain interpretable for pricing and operations stakeholders. Target performance on the holdout period is ROC-AUC > 0.82 and recall > 70% at precision >= 35%.
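One way to check these targets on holdout scores is sketched below. The brief does not say how the operating threshold is chosen, so this example assumes the recall target is read as the best recall over all score thresholds that keep precision at or above 35%. The function names `roc_auc` and `best_recall_at_precision` and the toy labels and scores are hypothetical. The pairwise AUC here is quadratic in the number of rows and is meant for illustration only; a library implementation would be the practical choice at 420K policies.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance a random churner outscores a random non-churner.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_recall_at_precision(
    y_true: Sequence[int], scores: Sequence[float], min_precision: float
) -> float:
    """Highest recall among thresholds whose precision is >= min_precision."""
    total_pos = sum(y_true)
    ranked = sorted(zip(scores, y_true), reverse=True)
    best, tp = 0.0, 0
    for k, (score, y) in enumerate(ranked, start=1):
        tp += y
        # Evaluate only at threshold boundaries so tied scores stay together.
        if k < len(ranked) and ranked[k][0] == score:
            continue
        if tp / k >= min_precision:
            best = max(best, tp / total_pos)
    return best


# Toy holdout set (hypothetical): 1 = churned at renewal, 0 = renewed.
y = [1, 1, 0, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

auc = roc_auc(y, s)
recall = best_recall_at_precision(y, s, min_precision=0.35)
meets_targets = auc > 0.82 and recall > 0.70
print(round(auc, 4), recall, meets_targets)  # 0.9333 1.0 True
```

In practice the threshold would be fixed on a validation period and then applied unchanged to the holdout period, so the reported recall and precision reflect a decision rule the retention team could actually run each week.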