Business Context
TechSolutions, a SaaS provider with 40K active users and $80M ARR, faces a rising churn rate, currently 5% per month. At that rate roughly 2,000 users leave each month, which erodes recurring revenue directly. The Chief Financial Officer (CFO) wants a predictive model that identifies at-risk customers early enough for proactive retention efforts.
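To put the 5% figure in perspective, the short calculation below compounds it over a year and converts it to ARR. It is a back-of-the-envelope sketch: the even split of revenue across users is an assumption, since the brief gives no per-plan breakdown.

```python
# Figures taken from the business context.
active_users = 40_000
arr_usd = 80_000_000
monthly_churn = 0.05

# Assumption: revenue is spread evenly across users.
arr_per_user = arr_usd / active_users

# Users lost in one month at the current rate, and the ARR attached to them.
users_lost_per_month = active_users * monthly_churn
arr_lost_per_month = users_lost_per_month * arr_per_user

# Share of a starting cohort still active after 12 months of 5% monthly churn.
annual_retention = (1 - monthly_churn) ** 12

print(f"ARR per user:          ${arr_per_user:,.0f}")        # $2,000
print(f"Users lost per month:  {users_lost_per_month:,.0f}")  # 2,000
print(f"ARR lost per month:    ${arr_lost_per_month:,.0f}")   # $4,000,000
print(f"12-month retention:    {annual_retention:.1%}")       # 54.0%
```

Under these assumptions, a cohort that loses 5% of its members every month keeps only about 54% of them after a year.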
Dataset
| Feature Group | Count | Examples |
|---|---|---|
| Usage Metrics | 20 | logins_per_week, feature_usage_score, avg_session_duration |
| Subscription Info | 10 | plan_type, subscription_length, payment_method |
| Customer Profile | 8 | industry, company_size, account_age_days |
- Size: 200K customer-months (24 months of data), 38 features
- Target: Binary — churned within the next month (1) vs retained (0)
- Class balance: 5% positive (churned), 95% negative (retained)
- Missing data: 10% missing in feature_usage_score, 2% in subscription_length
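The last two bullets usually need explicit handling before modeling. The sketch below shows one possible approach using only the standard library: median imputation with a companion missing-value flag, and class weights inversely proportional to class frequency. The function names and the toy rows are hypothetical, and the brief does not prescribe either technique.

```python
from statistics import median


def impute_with_indicator(rows, column):
    """Fill None values in `column` with the median of the observed values
    and add a `<column>_missing` flag so the model can still see the gap."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill_value = median(observed)
    result = []
    for r in rows:
        new_row = dict(r)
        new_row[f"{column}_missing"] = int(r[column] is None)
        if r[column] is None:
            new_row[column] = fill_value
        result.append(new_row)
    return result


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = len(labels)
    classes = sorted(set(labels))
    return {c: n_samples / (len(classes) * labels.count(c)) for c in classes}


# Toy rows for illustration only (not from the real dataset).
rows = [
    {"feature_usage_score": 0.2},
    {"feature_usage_score": None},
    {"feature_usage_score": 0.8},
    {"feature_usage_score": 0.4},
]
imputed = impute_with_indicator(rows, "feature_usage_score")

# A 5% / 95% split like the one described in the dataset summary.
labels = [1] * 5 + [0] * 95
weights = balanced_class_weights(labels)
print(imputed[1])   # the missing score is filled with the median and flagged
print(weights)      # churned rows receive a weight of 10.0
```

In practice the median should be computed on the training split only and reused at inference time, so that no information leaks from validation or future months.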
Requirements
- Develop a classification model to predict customer churn within the next month.
- Achieve at least 75% recall with precision above 60% on the churned (positive) class.
- Provide insights on feature importance to guide retention strategies.
- Address the imbalanced class distribution effectively.
- Justify your choice of model and evaluation metrics.
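The recall and precision targets above translate into a choice of decision threshold on the model's churn score. The sketch below is one hedged way to pick it: scan candidate thresholds from high to low and return the first that meets both targets. The function names and the toy scores are hypothetical, and any model that outputs a churn score could supply the inputs.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall for the positive class when score >= threshold is flagged."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(scores, labels, min_recall=0.75, min_precision=0.60):
    """Return (threshold, precision, recall) for the highest threshold with
    recall >= min_recall and precision > min_precision, or None if none exists."""
    for t in sorted(set(scores), reverse=True):
        precision, recall = precision_recall_at(t, scores, labels)
        if recall >= min_recall and precision > min_precision:
            return t, precision, recall
    return None


# Toy validation scores and labels for illustration only.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
result = pick_threshold(scores, labels)
print(result)  # (0.7, 0.75, 0.75)
```

As a sanity check on scale: if about 2,000 of the 40K accounts churn in a month, 75% recall means catching at least 1,500 of them, and precision above 60% then allows fewer than 1,000 false alarms among the flagged accounts. Accuracy is not a useful yardstick here, because a model that predicts "retained" for everyone already scores 95%.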
Constraints
- The model must be interpretable to allow the CFO to understand the key drivers of churn.
- Inference must run daily across all 40K accounts so that intervention is timely.
- The model should be updated quarterly with the latest data.
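One way to meet the interpretability and daily-scoring constraints together is a linear model such as logistic regression, where each feature's contribution to the log-odds is its coefficient times its value. The sketch below illustrates the idea only. The coefficients, the intercept, and the account IDs are invented for illustration and would come from a fitted model in practice. The brief does not mandate this model family.

```python
import math

# Hypothetical coefficients for illustration only; real values come from training.
INTERCEPT = 1.0
COEFFICIENTS = {
    "logins_per_week": -0.30,
    "feature_usage_score": -1.20,
    "account_age_days": -0.002,
}


def score_account(features):
    """Return the churn probability and each feature's contribution to the log-odds."""
    contributions = {
        name: coef * features[name] for name, coef in COEFFICIENTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


def score_batch(accounts):
    """Score a daily batch; `accounts` maps account id -> feature dict."""
    return {account_id: score_account(f)[0] for account_id, f in accounts.items()}


# Two made-up accounts standing in for the daily batch of 40K.
accounts = {
    "acct-001": {"logins_per_week": 1, "feature_usage_score": 0.2, "account_age_days": 100},
    "acct-002": {"logins_per_week": 12, "feature_usage_score": 0.9, "account_age_days": 900},
}
daily_scores = score_batch(accounts)
probability, contributions = score_account(accounts["acct-001"])

# The contribution with the largest absolute value is the strongest driver for this account.
top_driver = max(contributions, key=lambda name: abs(contributions[name]))
print(daily_scores)
print(top_driver, contributions[top_driver])
```

Because scoring is a handful of multiplications per account, a daily pass over 40K accounts is cheap, and the per-feature contributions give the CFO a direct reading of what pushes an individual account toward churn. Quarterly retraining would simply replace the intercept and coefficients.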