Classify Ecommerce Reviews by Intent

Business Context

ShopSphere, a large ecommerce marketplace, receives thousands of customer reviews and post-purchase comments every day. The support and product teams want a model that automatically classifies each text into actionable intent categories so issues can be routed faster and trend analysis can be automated.

Data Characteristics

You have 1.8 million labeled English reviews collected over 18 months. Each record contains free-form text, a product category, and one of four labels: Product Quality, Delivery Issue, Billing/Refund, or General Praise. Text length ranges from 5 to 350 words with a median of 42 words. The class distribution is moderately imbalanced: Product Quality (38%), Delivery Issue (24%), Billing/Refund (14%), General Praise (24%). The data includes misspellings, emojis, repeated punctuation, HTML fragments, and copied order metadata.
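The noise sources listed above suggest a light normalization pass before tokenization. The following is a minimal sketch using only the standard library. The order-metadata pattern and the `[ORDER]` placeholder are assumptions for illustration, since the brief does not say what copied order metadata looks like. Emojis are kept on purpose because they can carry intent.

```python
import html
import re

# Strips HTML fragments such as <br/> or <span class="x">.
TAG_RE = re.compile(r"<[^>]+>")
# Hypothetical order-metadata shape: "Order #AB12345678", "order id: 123-4567890".
# The lookahead requires at least one digit so ordinary words are not swallowed.
ORDER_RE = re.compile(
    r"\border\s*(?:id|no\.?|number)?\s*[#:]\s*(?=[A-Z0-9-]*\d)[A-Z0-9-]{6,}\b",
    re.IGNORECASE,
)
# Collapses runs of the same punctuation mark ("!!!" becomes "!").
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")
WS_RE = re.compile(r"\s+")


def clean_review(text: str) -> str:
    """Normalize one raw review; misspellings and emojis are left untouched."""
    text = html.unescape(text)              # "&amp;" -> "&"
    text = TAG_RE.sub(" ", text)            # drop HTML tags
    text = ORDER_RE.sub(" [ORDER] ", text)  # mask copied order identifiers
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    return WS_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_review("Great!!! <br/>Order #AB12345678 arrived late&amp;broken 😡"))
```

Misspellings are deliberately not corrected here; subword tokenizers and character n-gram features usually tolerate them, though that is a design choice to validate on held-out data.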

Success Criteria

A strong solution should achieve macro-F1 >= 0.84 across the four classes and recall >= 0.90 on the Billing/Refund class, since missed refund complaints create operational risk. The model should support both batch scoring and near-real-time inference for new reviews.
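To make the two thresholds concrete, here is a small sketch that computes per-class precision, recall, and F1 from label lists and checks both criteria. The function names are hypothetical, and in practice a library such as scikit-learn would normally do this; the pure-Python version only shows the arithmetic.

```python
from collections import Counter

LABELS = ["Product Quality", "Delivery Issue", "Billing/Refund", "General Praise"]


def per_class_prf(y_true, y_pred, labels=LABELS):
    """Return precision, recall and F1 for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for lab in labels:
        prec = tp[lab] / (tp[lab] + fp[lab]) if tp[lab] + fp[lab] else 0.0
        rec = tp[lab] / (tp[lab] + fn[lab]) if tp[lab] + fn[lab] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[lab] = {"precision": prec, "recall": rec, "f1": f1}
    return scores


def meets_success_criteria(y_true, y_pred, min_macro_f1=0.84, min_refund_recall=0.90):
    """Return (passed, macro_f1, refund_recall) for the stated targets."""
    scores = per_class_prf(y_true, y_pred)
    # Macro-F1 is the unweighted mean of per-class F1, so rare classes count equally.
    macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)
    refund_recall = scores["Billing/Refund"]["recall"]
    passed = macro_f1 >= min_macro_f1 and refund_recall >= min_refund_recall
    return passed, macro_f1, refund_recall
```

Because macro-F1 weights all four classes equally, a model can pass on accuracy while still failing here if it neglects Billing/Refund, the smallest class.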

Constraints

Inference latency should stay under 80 ms per review in production.
The solution must run on a single GPU or a CPU-only fallback path.
The pipeline should be reproducible and easy to retrain weekly.
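The latency constraint can be checked with a small timing harness such as the sketch below. `predict` is a hypothetical callable that scores one review, and comparing the 95th percentile against the 80 ms budget is an assumption, since the constraint does not say which percentile applies.

```python
import statistics
import time


def measure_latency_ms(predict, texts, warmup=5):
    """Time single-review calls and return p50, p95 and max in milliseconds."""
    for text in texts[:warmup]:
        predict(text)  # warm caches and lazy initialization before timing
    timings = []
    for text in texts:
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns the 99 percentile cut points.
    cuts = statistics.quantiles(timings, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "max": max(timings)}


if __name__ == "__main__":
    sample = ["refund still not received"] * 200
    # Stand-in for a real model call.
    report = measure_latency_ms(lambda text: len(text.split()), sample)
    print(report, "within budget:", report["p95"] < 80.0)
```

Running the same harness on both the GPU path and the CPU-only fallback would show whether each path meets the budget on its own.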

Requirements

Build a multi-class text classification pipeline for the four intent labels.
Describe the preprocessing steps needed for noisy ecommerce text.
Implement a strong baseline and a transformer-based model in modern Python.
Explain how you would handle class imbalance and thresholding.
Define an evaluation plan, including offline metrics and error analysis.
Briefly discuss deployment considerations such as latency, monitoring, and data drift.
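For the class imbalance and thresholding requirement, one possible approach is sketched below: inverse-frequency class weights derived from the stated class shares, plus a decision threshold for Billing/Refund chosen on a validation set so that recall reaches the 0.90 target. The function names are hypothetical, and the routing rule (flag a review as Billing/Refund when its predicted probability is at or above the threshold, otherwise take the most probable remaining class) is an assumption rather than a prescribed method.

```python
import math

# Class shares as stated in Data Characteristics.
CLASS_SHARE = {
    "Product Quality": 0.38,
    "Delivery Issue": 0.24,
    "Billing/Refund": 0.14,
    "General Praise": 0.24,
}


def balanced_class_weights(share):
    """Inverse-frequency weights: 1 / (number_of_classes * class_share)."""
    k = len(share)
    return {label: 1.0 / (k * p) for label, p in share.items()}


def refund_threshold(refund_probs, is_refund, min_recall=0.90):
    """Highest threshold t such that predicting refund when prob >= t
    keeps validation recall on true refund reviews at or above min_recall."""
    positives = sorted(
        (p for p, y in zip(refund_probs, is_refund) if y), reverse=True
    )
    if not positives:
        raise ValueError("no Billing/Refund examples in the validation set")
    # Number of true refund reviews that must be caught; the epsilon guards
    # against floating-point noise in the product.
    needed = math.ceil(min_recall * len(positives) - 1e-9)
    return positives[needed - 1]


if __name__ == "__main__":
    print(balanced_class_weights(CLASS_SHARE))
    probs = [0.9, 0.8, 0.7, 0.6, 0.2, 0.5, 0.1]
    truth = [True, True, True, True, True, False, False]
    print(refund_threshold(probs, truth))
```

Lowering the threshold trades precision for recall, so the precision reached at the chosen threshold should be reported alongside it in the evaluation plan.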
