NimbusCRM, a mid-market SaaS company, wants a lead-scoring model that predicts whether a trial account will convert to a paid subscription within 14 days. Recent experiments added many correlated behavioral and marketing features, which raises the risk of overfitting, so the sales team needs a model that generalizes well to new campaigns.
You are given a historical training set of trial accounts collected over 18 months. Its 60 features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Product usage | 18 | sessions_first_7d, invites_sent, reports_created, active_days |
| Marketing attribution | 9 | channel, campaign_id, ad_platform, landing_page |
| Firmographic | 11 | company_size, industry, region, employee_count |
| Sales interactions | 7 | demo_booked, emails_opened, calls_completed, response_time_hours |
| Engineered / sparse flags | 15 | feature_clicked_* indicators, promo_code_used, webinar_attended |

A good solution should improve generalization over an unregularized baseline, achieve a test AUC-ROC of at least 0.82, and keep the gap between train and test AUC-ROC under 0.03. The final model should also provide interpretable coefficients or feature importances for go-to-market stakeholders.
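
The acceptance criteria above can be checked mechanically once a candidate model and the unregularized baseline have been scored. The following is a minimal sketch in plain Python, not a prescribed evaluation harness. The helper names `auc_roc` and `meets_targets` are hypothetical. It assumes the train-test gap is measured in AUC-ROC and taken as an absolute difference, and that "improves generalization" means a strictly higher test AUC-ROC than the baseline.

```python
from typing import Dict, Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random
    negative, with ties counted as half a win. O(n_pos * n_neg); fine for a
    sketch, too slow for large datasets."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def meets_targets(
    train_auc: float,
    test_auc: float,
    baseline_test_auc: float,
    min_test_auc: float = 0.82,  # threshold from the problem statement
    max_gap: float = 0.03,       # threshold from the problem statement
) -> Dict[str, bool]:
    """Evaluate the three quantitative targets separately."""
    gap = abs(train_auc - test_auc)  # assumption: absolute AUC-ROC difference
    return {
        "beats_baseline": test_auc > baseline_test_auc,
        "test_auc_ok": test_auc >= min_test_auc,
        "gap_ok": gap < max_gap,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not results from the NimbusCRM data.
    print(meets_targets(train_auc=0.85, test_auc=0.83, baseline_test_auc=0.80))
```

Reporting the three checks separately, rather than as a single pass/fail flag, makes it easier to see which target a candidate misses, for example a high test AUC-ROC paired with an excessive train-test gap.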