Business Context
AIG Claims receives customer feedback from post-claim surveys, call-center notes, and web form comments. The claims operations team wants an NLP solution that classifies sentiment so they can identify dissatisfied claimants early and prioritize service recovery.
Data
You are given historical feedback collected across AIG Claims channels.
- Volume: ~450,000 feedback records over 18 months
- Text length: 5-300 words, median 42 words
- Language: English only for the first version
- Labels: negative (22%), neutral (46%), positive (32%)
- Sources: survey free-text, FNOL (first notice of loss) follow-up comments, adjuster interaction notes, complaint summaries
- Data quality: misspellings, abbreviations, policy numbers, claim IDs, agent names, and templated text are common
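To make the label skew concrete, the sketch below turns the stated class shares into inverse-frequency class weights. The weighting formula (1 / (number of classes x class share)) is an assumption chosen for illustration; it matches the common "balanced" heuristic, but other schemes such as resampling or threshold tuning may suit the data better.

```python
# Label shares come from the Data section; the weighting scheme is an assumption.
label_share = {"negative": 0.22, "neutral": 0.46, "positive": 0.32}


def balanced_class_weights(share):
    """Return weight = 1 / (n_classes * share) for each class."""
    n_classes = len(share)
    return {label: 1.0 / (n_classes * p) for label, p in share.items()}


weights = balanced_class_weights(label_share)
for label, weight in weights.items():
    print(f"{label}: {weight:.2f}")
# negative: 1.52
# neutral: 0.72
# positive: 1.04
```

Under this scheme an error on a negative comment costs roughly twice as much as an error on a neutral one, which pushes the model toward higher negative-class recall.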
Success Criteria
A good solution should achieve macro-F1 >= 0.80, negative-class recall >= 0.88, and support batch scoring of at least 50,000 comments per hour. Predictions should be explainable enough for claims managers to review common drivers of negative sentiment.
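The numeric targets above can be checked with a few lines of code. The following is a minimal sketch using only the standard library; the function names and the toy labels are hypothetical, and the thresholds are the ones stated in this section. Note that 50,000 comments per hour works out to about 13.9 comments per second.

```python
LABELS = ("negative", "neutral", "positive")


def precision_recall_f1(y_true, y_pred, label):
    """Compute one-vs-rest precision, recall, and F1 for a single label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def check_success_criteria(y_true, y_pred, comments_per_hour):
    """Compare a model's results with the targets from this section."""
    scores = {lab: precision_recall_f1(y_true, y_pred, lab) for lab in LABELS}
    macro_f1 = sum(s[2] for s in scores.values()) / len(LABELS)
    negative_recall = scores["negative"][1]
    return {
        "macro_f1": macro_f1,
        "negative_recall": negative_recall,
        "passes": (
            macro_f1 >= 0.80
            and negative_recall >= 0.88
            and comments_per_hour >= 50_000
        ),
    }


# Toy labels for illustration only, not real results.
y_true = ["negative", "negative", "neutral", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive"]
report = check_success_criteria(y_true, y_pred, comments_per_hour=60_000)
print(report)
print(f"required rate: {50_000 / 3600:.1f} comments/second")
```

Macro-F1 averages the per-class F1 scores without weighting by class size, so the smaller negative class counts as much as the larger neutral class.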
Constraints
- Customer and claim identifiers must be removed before modeling
- The first production version should run in AIG Claims' secure environment without external API calls
- Inference should be lightweight enough for daily batch jobs and optional near-real-time scoring
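The identifier-removal constraint could be approached with a rule-based scrubbing pass that runs before any modeling. The sketch below is illustrative only: the claim and policy number formats (`CLM-...`, `POL-...`), the placeholder tokens, and the `known_names` roster are all hypothetical, since the brief does not specify real identifier formats. A production pass would need patterns validated against actual data and likely a named-entity step for names not on a roster.

```python
import re

# Hypothetical identifier formats; replace with patterns validated on real data.
PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "<EMAIL>"),
    (re.compile(r"\bCLM[-\s]?\d{6,10}\b", re.IGNORECASE), "<CLAIM_ID>"),
    (re.compile(r"\bPOL[-\s]?\d{6,10}\b", re.IGNORECASE), "<POLICY_NO>"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "<PHONE>"),
]


def scrub(text, known_names=()):
    """Replace identifiers with placeholder tokens before modeling."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    # known_names is an assumed roster of adjuster or agent names.
    for name in known_names:
        text = re.sub(
            rf"\b{re.escape(name)}\b", "<NAME>", text, flags=re.IGNORECASE
        )
    return text


sample = "Claim CLM-00123456 handled by Dana Smith, call 212-555-0147"
print(scrub(sample, known_names=["Dana Smith"]))
# Claim <CLAIM_ID> handled by <NAME>, call <PHONE>
```

Replacing identifiers with typed placeholders, rather than deleting them, keeps sentence structure intact and stops the model from learning spurious associations with specific claim numbers or adjusters.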
Requirements
- Build a 3-class sentiment classifier for AIG Claims customer feedback.
- Design a preprocessing pipeline that handles insurance-specific noise such as claim numbers, adjuster names, and boilerplate phrases.
- Implement a strong baseline in Python using modern NLP tooling.
- Explain how you would handle class imbalance and ambiguous feedback.
- Evaluate the model with appropriate metrics and propose an error-analysis plan.
- Describe how you would surface results to claims operations, including examples of actionable negative feedback themes.