Business Context
ShopNow, an e-commerce marketplace with 2M monthly users, wants to improve marketing efficiency. The growth team needs both a supervised model to predict whether a user will purchase in the next 7 days and an unsupervised model to discover customer segments for campaign targeting.
Dataset
You are given a user-level dataset built from the last 12 months of web and app activity.
| Feature Group | Count | Examples |
|---|---|---|
| Behavioral metrics | 12 | sessions_last_7d, avg_session_duration, pages_viewed, cart_add_rate |
| Transaction history | 8 | orders_last_90d, avg_order_value, refund_rate, days_since_last_purchase |
| Marketing engagement | 6 | email_open_rate, push_click_rate, coupon_redemptions |
| Customer profile | 7 | country, device_type, acquisition_channel, loyalty_tier |
| Derived temporal features | 5 | weekend_activity_ratio, evening_session_share, recency_score |
- Size: 240K users, 38 features
- Target for supervised task: purchased_next_7d (1 if the user purchases in the next 7 days, else 0)
- Class balance: 18% positive, 82% negative
- Missing data: ~10% missing in marketing engagement fields, ~4% missing in profile fields for guest users
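The figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes the 18% positive rate holds exactly across all 240K users, and it uses the common "balanced" class-weight heuristic, n_samples / (n_classes * class_count), which is one option among several for handling the imbalance. All variable names are hypothetical.

```python
# Illustrative arithmetic only; assumes the stated 18% rate holds exactly.
FEATURE_GROUP_COUNTS = {
    "behavioral": 12,
    "transaction": 8,
    "marketing": 6,
    "profile": 7,
    "temporal": 5,
}
n_features = sum(FEATURE_GROUP_COUNTS.values())  # should match the stated 38

N_USERS = 240_000
POSITIVE_RATE = 0.18

n_pos = round(N_USERS * POSITIVE_RATE)  # users with purchased_next_7d == 1
n_neg = N_USERS - n_pos                 # users with purchased_next_7d == 0

# "Balanced" class weights: n_samples / (n_classes * class_count).
w_pos = N_USERS / (2 * n_pos)
w_neg = N_USERS / (2 * n_neg)

print(n_features)                        # 38
print(n_pos, n_neg)                      # 43200 196800
print(round(w_pos, 3), round(w_neg, 3))  # 2.778 0.61
```

With roughly 43K positive examples, the minority class is large enough that class weighting or threshold tuning may suffice without resampling, though that is a judgment call to validate on the holdout set.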
Success Criteria
- Build a supervised model with ROC-AUC >= 0.82 and F1 >= 0.55 on the holdout set.
- Produce an unsupervised segmentation with 3-8 actionable clusters and silhouette score >= 0.20.
- Clearly explain when supervised learning is appropriate versus when unsupervised learning is appropriate for this dataset.
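To put the supervised thresholds in context, it helps to know what a trivial model scores. The sketch below uses plain Python and two hypothetical helpers, `f1_from_counts` and `roc_auc`; it assumes a holdout set that mirrors the stated 18% positive rate. The pairwise AUC is written for clarity on tiny inputs and would be too slow for the full dataset.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. O(P * N), so only suitable for small examples.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical holdout of 1,000 users at the stated 18% positive rate.
n_pos, n_neg = 180, 820

# Baseline: predict "will purchase" for everyone (recall 1.0, precision 0.18).
baseline_f1 = f1_from_counts(tp=n_pos, fp=n_neg, fn=0)
print(round(baseline_f1, 3))  # 0.305

# Toy ranking check: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.2]))  # 0.75
```

Under these assumptions the always-positive baseline reaches an F1 of about 0.305, so the F1 >= 0.55 target requires a real gain in precision, and a scorer with no signal would sit near ROC-AUC 0.5.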
Constraints
- Predictions must run in a daily batch job over 240K users in under 10 minutes.
- Marketing stakeholders need interpretable outputs: top purchase drivers and human-readable segment profiles.
- Retraining budget is limited to weekly runs on a single standard CPU machine.
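The batch constraint translates into a minimum scoring throughput. The following sketch derives that rate and shows one possible way to time a scoring function against it; `measure_rate` and `dummy_score_batch` are hypothetical stand-ins, and a real check would time the trained model's batch prediction, including feature loading, on the production machine.

```python
import time

N_USERS = 240_000
TIME_LIMIT_S = 10 * 60  # "under 10 minutes"

required_rate = N_USERS / TIME_LIMIT_S              # users per second
per_user_budget_ms = 1000 * TIME_LIMIT_S / N_USERS  # milliseconds per user


def measure_rate(score_batch, batch, repeats=3):
    """Return the best observed users/second for a scoring callable."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        score_batch(batch)
        best = min(best, time.perf_counter() - start)
    return len(batch) / best if best > 0 else float("inf")


# Stand-in scorer (hypothetical): replace with the real model's batch predict.
def dummy_score_batch(rows):
    return [0.0 for _ in rows]


rate = measure_rate(dummy_score_batch, [{}] * 10_000)
print(required_rate, per_user_budget_ms)  # 400.0 2.5
print(rate >= required_rate)
```

A floor of 400 users per second (2.5 ms per user) covers the end-to-end job, so any per-user preprocessing has to fit inside the same budget.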
Deliverables
- Train and evaluate one supervised model for purchase prediction.
- Train and evaluate one unsupervised model for customer segmentation.
- Compare supervised vs. unsupervised learning in the context of this dataset and business goal.
- Describe preprocessing, feature engineering, and validation strategy.
- Recommend how both outputs would be used together in production marketing workflows.
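As one possible shape for the final deliverable, the two model outputs can be joined per user and mapped to a campaign action. This is a sketch under loud assumptions: the segment names, the action labels, and the 0.6 / 0.2 probability thresholds are all hypothetical placeholders, not values from the dataset or from ShopNow.

```python
from dataclasses import dataclass


@dataclass
class UserScore:
    user_id: str
    segment: str       # label from the segmentation model (hypothetical names)
    p_purchase: float  # predicted probability of a purchase in the next 7 days


def choose_action(u: UserScore, high: float = 0.6, low: float = 0.2) -> str:
    """Map (segment, propensity) to a campaign action; thresholds are assumptions."""
    if u.p_purchase >= high:
        return "reminder_no_discount"         # likely to buy anyway
    if u.p_purchase >= low:
        return f"targeted_offer:{u.segment}"  # segment decides the creative
    return "low_cost_nurture"                 # unlikely to convert this week


users = [
    UserScore("u1", "deal_seekers", 0.72),
    UserScore("u2", "deal_seekers", 0.35),
    UserScore("u3", "dormant", 0.05),
]
for u in users:
    print(u.user_id, choose_action(u))
```

The design choice here is that the supervised score decides whether and how much to spend on a user, while the unsupervised segment decides what message to send; in practice the thresholds would be tuned against campaign results.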