ShopSphere, an e-commerce marketplace with 2.5M active customers, wants to group customers into actionable segments for lifecycle marketing, merchandising, and CRM targeting. There is no labeled target, so the task is to build an unsupervised clustering solution and to explain when clustering is appropriate and when supervised learning would be the better fit.
You are given a customer-level feature table built from the last 12 months of activity. It contains 30 features in five groups:
| Feature Group | Count | Examples |
|---|---|---|
| Transaction behavior | 8 | orders_12m, avg_order_value, return_rate, discount_share |
| Engagement | 6 | sessions_30d, app_opens_30d, email_click_rate, wishlist_adds |
| Product preferences | 7 | pct_fashion, pct_electronics, pct_home, category_entropy |
| Geography & account | 5 | region, tenure_days, acquisition_channel, loyalty_tier |
| Temporal patterns | 4 | days_since_last_order, weekend_order_share, seasonality_index |
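
The feature groups mix numeric columns (such as orders_12m) with categorical ones (such as region), so a distance-based clustering method would typically need them on a comparable scale first. The following is a minimal sketch of that preparation using only the Python standard library. The helper names `standardize` and `one_hot` and the toy values are assumptions for illustration, not part of the provided table.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score a numeric column; a constant column maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]

def one_hot(values):
    """One-hot encode a categorical column; returns (sorted categories, rows)."""
    cats = sorted(set(values))
    return cats, [[1.0 if v == c else 0.0 for c in cats] for v in values]

# Toy rows with made-up values; the column names come from the feature table.
customers = [
    {"orders_12m": 12, "avg_order_value": 80.0, "region": "north"},
    {"orders_12m": 2, "avg_order_value": 35.0, "region": "south"},
    {"orders_12m": 7, "avg_order_value": 120.0, "region": "north"},
]

numeric_cols = ["orders_12m", "avg_order_value"]
scaled = [standardize([c[col] for c in customers]) for col in numeric_cols]
region_cats, region_rows = one_hot([c["region"] for c in customers])

# One feature vector per customer: scaled numerics followed by one-hot region.
matrix = [
    [col[i] for col in scaled] + region_rows[i]
    for i in range(len(customers))
]
```

At 2.5M customers a vectorized library would be the more realistic choice; the sketch only shows the shape of the transformation.
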

A good solution should produce stable, interpretable segments that marketing can use in campaigns. Aim for a clustering with a silhouette score of at least 0.20, low month-over-month drift in the segments, and a clear business meaning for each segment (for example, high-value loyalists, discount-driven browsers, seasonal shoppers).
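
The two quantitative targets above can be checked with a short sketch. It computes the mean silhouette coefficient with Euclidean distance and, as one possible reading of "drift", the share of customers whose segment changes between two monthly runs. The function names, the toy data, and the assumption that segment labels are comparable across months (for example, because customers are assigned to the centroids of a frozen model) are illustrative, not part of the brief.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself does not change the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

def segment_churn(prev, curr):
    """Share of shared customers whose segment label changed between runs."""
    shared = prev.keys() & curr.keys()
    return sum(prev[c] != curr[c] for c in shared) / len(shared)

# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
score = silhouette_score(points, labels)
meets_target = score >= 0.20  # threshold taken from the brief

# Hypothetical segment assignments from two consecutive monthly runs.
prev = {"c1": "loyalist", "c2": "discount", "c3": "seasonal", "c4": "loyalist"}
curr = {"c1": "loyalist", "c2": "seasonal", "c3": "seasonal", "c4": "loyalist"}
churn = segment_churn(prev, curr)
```

The exact silhouette is quadratic in the number of points, so on the full customer base it would normally be estimated on a random sample. If segments are refit from scratch each month, labels are arbitrary and a permutation-invariant comparison would be needed instead of the simple churn share.
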