Segment and Predict Retail Customers

Business Context

ShopSphere, an online retail marketplace with 1.2M customers, wants to improve lifecycle marketing. The growth team needs both customer segments for campaign design and a purchase propensity model to predict whether a customer will buy in the next 30 days.

This question is designed to test whether you understand the practical difference between unsupervised learning (finding structure without labels) and supervised learning (predicting a known target), and how to use both on the same dataset.

Dataset

You are given a customer-level feature table built from the last 12 months of activity.

Feature Group	Count	Examples
Demographics	5	age_band, region, acquisition_channel, device_type
Behavioral	9	sessions_30d, avg_session_duration, pages_per_session, email_opens_90d
Transactional	8	orders_12m, avg_order_value, days_since_last_order, discount_usage_rate
Support	3	tickets_12m, refund_rate, csat_score
Target	1	purchased_next_30d

Size: 240K customers, 25 input features
Target: Binary label indicating whether the customer made a purchase in the next 30 days
Class balance: 18% positive, 82% negative
Missing data: 12% missing in csat_score, 6% missing in avg_session_duration, 3% missing in demographic fields
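
One way to handle the missing values described above is sketched below. It is a minimal illustration, not a prescribed solution: the tiny DataFrame is made up, only a few column names from the feature table are used, and the choice of median imputation with missing-value indicators for numeric columns and an explicit "missing" category for categorical columns is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up sample rows; the values are illustrative only.
df = pd.DataFrame({
    "csat_score": [4.0, np.nan, 5.0, 3.0],
    "avg_session_duration": [120.0, 95.0, np.nan, 210.0],
    "orders_12m": [3, 0, 7, 1],
    "region": ["north", np.nan, "south", "north"],
})

numeric_cols = ["csat_score", "avg_session_duration", "orders_12m"]
categorical_cols = ["region"]

# Numeric: fill with the median and append a 0/1 flag per column that had
# missing values, so "was missing" stays available as a signal.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median", add_indicator=True)),
    ("scale", StandardScaler()),
])

# Categorical: treat missing as its own category, then one-hot encode.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0.0 forces a dense array, which keeps the sketch simple.
preprocess = ColumnTransformer(
    [
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ],
    sparse_threshold=0.0,
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Scaling matters for distance-based clustering; tree-based classifiers generally do not need it, so the same preprocessing may be heavier than the supervised pipeline requires.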

Success Criteria

A strong solution should:

Produce meaningful customer segments for marketing use
Train a supervised model with ROC-AUC >= 0.82 and PR-AUC >= 0.45 on the held-out test set
Clearly explain when to use clustering vs classification and the tradeoffs of each
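
The two metric thresholds above can be checked as in the following sketch. The labels and scores are synthetic and purely illustrative, and using scikit-learn's average_precision_score as the PR-AUC estimate is an assumption (it is one common way to summarize the precision-recall curve). The sketch also shows why PR-AUC should be read against the 18% positive rate: a random scorer lands near the positive rate, not near 0.5.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)
n = 20_000

# Synthetic labels with roughly 18% positives, mirroring the class balance.
y_true = (rng.random(n) < 0.18).astype(int)

# Hypothetical model scores: positives are shifted upward, so there is signal.
informative_scores = rng.normal(loc=y_true * 1.5, scale=1.0)
# Baseline: scores unrelated to the label.
random_scores = rng.random(n)

roc_informative = roc_auc_score(y_true, informative_scores)
pr_informative = average_precision_score(y_true, informative_scores)
roc_random = roc_auc_score(y_true, random_scores)
pr_random = average_precision_score(y_true, random_scores)

print(f"positive rate:      {y_true.mean():.3f}")
print(f"informative scorer: ROC-AUC={roc_informative:.3f}  PR-AUC={pr_informative:.3f}")
print(f"random scorer:      ROC-AUC={roc_random:.3f}  PR-AUC={pr_random:.3f}")
```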

Constraints

Marketing needs segments that are explainable to non-technical stakeholders
Batch scoring must finish in under 10 minutes for 240K customers
Retraining should be feasible monthly with standard Python tooling

Deliverables

Build an unsupervised learning pipeline to segment customers
Build a supervised learning pipeline to predict purchased_next_30d
Compare the two approaches and explain the difference in objective, inputs, and evaluation
Recommend how both outputs should be used together in production
Provide evaluation metrics, feature importance, and segment summaries
