ShopSphere, an omnichannel retail platform with 1.2M active customers, wants to improve marketing efficiency. The growth team needs both customer segments for campaign design and a purchase propensity model to predict which users will buy in the next 30 days.
This problem is designed to test whether the candidate understands the practical difference between unsupervised learning (discovering structure without labels) and supervised learning (predicting a known target using labeled data), and can implement both on the same dataset.
You are given a customer-level dataset built from the last 12 months of activity.
| Feature Group | Count | Examples |
|---|---|---|
| Demographics | 5 | age_bucket, region, acquisition_channel |
| Transaction behavior | 10 | total_orders_90d, avg_order_value, return_rate |
| Engagement | 8 | app_sessions_30d, email_open_rate, days_since_last_visit |
| Product affinity | 6 | pct_spend_electronics, pct_spend_home, distinct_categories |
| Target label | 1 | purchased_next_30d |
purchased_next_30d = 1 if the customer makes at least one purchase in the next 30 daysA strong solution should:
purchased_next_30d.