Business Context
NovaTel, a regional telecom provider with 2.3M subscribers, wants to predict which customers will churn in the next 30 days so the retention team can target save offers efficiently. The model will be scored weekly and used to prioritize outreach for the top-risk customers.
Dataset
You are given a customer-level training dataset built from the last 18 months of subscriber history.
| Feature Group | Count | Examples |
|---|
| Usage | 14 | avg_call_minutes_30d, data_gb_30d, dropped_calls_rate, roaming_days |
| Billing | 9 | monthly_charge, payment_method, late_payments_90d, autopay_enabled |
| Contract | 6 | plan_type, tenure_months, contract_term, add_on_count |
| Support | 7 | support_tickets_90d, complaint_flag, avg_resolution_hours |
| Engagement | 6 | app_logins_30d, web_portal_visits, promo_clicks |
| Geography / Demographics | 5 | region, device_type, acquisition_channel |
- Historical table contains one row per customer per month.
- The modeling snapshot has 240K customer-months and 47 features.
- Target is whether the customer churns within the next 30 days.
- Churn is relatively rare and missing values are present in support and engagement fields.
Success Criteria
A good solution should achieve strong ranking quality and be usable by an operations team with limited outreach capacity. Target performance is:
- Recall >= 75% at precision >= 40% for the intervention threshold
- ROC-AUC >= 0.84
- Clear feature importance or reason codes for flagged customers
Constraints
- Weekly batch inference on ~2.3M customers must finish in under 30 minutes.
- The retention team can contact only the top 8-10% highest-risk customers.
- The solution should be explainable enough for business stakeholders and compliant with standard customer analytics governance.
Deliverables
- Build a churn prediction pipeline with preprocessing, training, and evaluation.
- Explain model choice, leakage risks, and validation strategy.
- Show how you would handle class imbalance and missing data.
- Propose thresholding logic for retention outreach capacity.
- Provide feature importance and monitoring recommendations for production.