Salesforce wants to improve lead prioritization in Sales Cloud by predicting whether a newly created B2B lead will convert to an opportunity within 30 days. The sales operations team needs a model that generalizes well across regions and campaign types, not one that only fits historical noise.
You are given a historical lead-conversion dataset extracted from Sales Cloud and Marketing Cloud engagement logs. Its 30 features fall into four groups:

| Feature Group | Count | Examples |
|---|---|---|
| Lead attributes | 10 | industry, employee_count, country, lead_source, annual_revenue |
| Engagement features | 8 | email_opens_7d, form_submits_30d, web_visits_14d, campaign_click_rate |
| Sales activity | 6 | call_attempts_7d, email_touches_14d, days_to_first_contact |
| Derived behavioral features | 6 | engagement_trend_14d, touches_per_day, recency_score |

A good solution should demonstrate the bias-variance tradeoff in practice: compare at least one high-bias model with at least one high-variance model, then select a balanced approach that performs best on unseen data. Target performance is AUC-ROC >= 0.82, PR-AUC >= 0.50, and a gap of less than 3 percentage points (0.03) between validation and test AUC-ROC.
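
One way the comparison could be set up is sketched below with scikit-learn. Everything in it is an assumption made for illustration: the data is a synthetic stand-in generated with `make_classification` (not the Sales Cloud extract), the 60/20/20 split, the three model choices, their hyperparameters, and the `meets_targets` helper are all hypothetical. The brief does not say which split the AUC-ROC and PR-AUC thresholds apply to, so the helper assumes the test set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the 30-feature lead table, with an
# imbalanced positive class. Replace with the real extract.
X, y = make_classification(
    n_samples=6000, n_features=30, n_informative=10,
    weights=[0.85, 0.15], flip_y=0.05, random_state=0,
)

# ASSUMPTION: stratified 60/20/20 train/validation/test split.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, stratify=y_hold, random_state=0)
splits = {"train": (X_train, y_train), "val": (X_val, y_val),
          "test": (X_test, y_test)}

# Hypothetical candidates: one rigid, one flexible, one in between.
models = {
    # Linear and heavily L2-regularized: expected to lean high-bias.
    "logreg_strong_l2": LogisticRegression(C=1e-4, max_iter=1000),
    # Unpruned tree that memorizes the training set: high-variance.
    "deep_tree": DecisionTreeClassifier(random_state=0),
    # Shallow boosted trees with subsampling: the balanced candidate.
    "gbm_shallow": GradientBoostingClassifier(
        n_estimators=200, max_depth=3, learning_rate=0.05,
        subsample=0.8, random_state=0),
}


def meets_targets(val_auc, test_auc, test_pr_auc):
    """Check the brief's targets (assumed to apply to the test set)."""
    return (test_auc >= 0.82 and test_pr_auc >= 0.50
            and abs(val_auc - test_auc) < 0.03)


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    row = {}
    for split, (X_s, y_s) in splits.items():
        proba = model.predict_proba(X_s)[:, 1]
        row[f"{split}_auc"] = roc_auc_score(y_s, proba)
        row[f"{split}_pr_auc"] = average_precision_score(y_s, proba)
    results[name] = row

# Select on validation AUC only; the test set is read once, for reporting.
best = max(results, key=lambda n: results[n]["val_auc"])

for name, r in results.items():
    print(f"{name:18s} train={r['train_auc']:.3f} val={r['val_auc']:.3f} "
          f"test={r['test_auc']:.3f} test_pr={r['test_pr_auc']:.3f}")
r = results[best]
print("selected:", best, "| meets targets:",
      meets_targets(r["val_auc"], r["test_auc"], r["test_pr_auc"]))
```

Reading the printed table: a large train-to-validation drop points to variance, while low scores on every split point to bias. On synthetic data the numbers say nothing about whether the targets are reachable on the real leads.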