PulseBoard, a mid-market SaaS analytics company, wants a 30-day churn prediction model for 120K customer accounts. The retention team needs a model that is accurate enough to prioritize outreach but also explainable enough to justify interventions.
You are given an account-level monthly training set built from product usage, billing, support, and account metadata. The 42 features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Usage metrics | 14 | weekly_active_users, report_exports_30d, sessions_per_user |
| Billing | 6 | mrr, payment_failures_90d, plan_tier |
| Support | 5 | open_tickets_30d, avg_resolution_hours, csat_score |
| Account profile | 7 | company_size, industry, region, contract_type |
| Temporal / derived | 10 | days_since_last_login, usage_change_30d, ticket_rate_per_user |

A good solution should demonstrate whether a regularized logistic regression baseline is sufficient or whether a more complex model, such as gradient boosting, materially improves business outcomes. “Good enough” means reaching ROC-AUC >= 0.82 and PR-AUC >= 0.42, and delivering a clear recommendation grounded in both model quality and operational constraints.
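
One way to make the final recommendation reproducible is to encode the decision rule explicitly. The sketch below is a minimal illustration, not a prescribed method: the two metric thresholds come from the brief, while `MIN_PR_AUC_LIFT` (the smallest PR-AUC gain treated as "material"), the `ModelResult` class, and the `recommend` function are hypothetical names and values introduced here. It assumes that when both models clear the bar, the simpler and more explainable baseline wins unless the challenger's lift is material.

```python
from dataclasses import dataclass

# Thresholds taken from the problem statement.
MIN_ROC_AUC = 0.82
MIN_PR_AUC = 0.42

# Assumption (not from the brief): the smallest PR-AUC gain over the
# baseline that counts as a "material" improvement.
MIN_PR_AUC_LIFT = 0.03


@dataclass(frozen=True)
class ModelResult:
    """Held-out evaluation metrics for one candidate model."""
    name: str
    roc_auc: float
    pr_auc: float

    def meets_bar(self) -> bool:
        # A model is "good enough" only if it clears both thresholds.
        return self.roc_auc >= MIN_ROC_AUC and self.pr_auc >= MIN_PR_AUC


def recommend(baseline: ModelResult, challenger: ModelResult) -> str:
    """Return the name of the model to ship, or "neither" if both miss the bar."""
    base_ok = baseline.meets_bar()
    chal_ok = challenger.meets_bar()

    if not base_ok and not chal_ok:
        return "neither"
    if base_ok and not chal_ok:
        return baseline.name
    if chal_ok and not base_ok:
        return challenger.name

    # Both clear the bar: keep the explainable baseline unless the
    # challenger's PR-AUC lift is large enough to justify the complexity.
    lift = challenger.pr_auc - baseline.pr_auc
    return challenger.name if lift >= MIN_PR_AUC_LIFT else baseline.name


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are not real results.
    logreg = ModelResult("logistic_regression", roc_auc=0.83, pr_auc=0.44)
    gbm = ModelResult("gradient_boosting", roc_auc=0.85, pr_auc=0.45)
    print(recommend(logreg, gbm))
```

PR-AUC is used as the tie-breaker here because churn is typically a minority class, so precision on the flagged accounts is what drives outreach cost. In practice the lift threshold should be replaced by a business-level comparison, such as retained revenue per outreach slot.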