Business Context
Meta Integrity needs a binary classifier to predict whether a newly created Facebook ad account will trigger a policy enforcement action within 14 days. You must decide when a regularized logistic regression is preferable to a tree-based model for this production use case, then build and evaluate both.
Dataset
You are given a tabular dataset of ad-account snapshots collected at account creation time.
| Feature Group | Count | Examples |
|---|
| Numeric activity | 14 | account_age_hours, payment_attempts_24h, spend_velocity, device_count |
| Categorical metadata | 9 | country, business_type, payment_method_type, signup_surface |
| Aggregated integrity signals | 8 | prior_disabled_assets, linked_entity_risk_score, ip_reputation_bucket |
| Simple interaction candidates | 5 | spend_per_device, attempts_per_payment_method, linked_assets_per_admin |
- Size: 420K ad accounts, 36 features
- Target: Binary — enforcement within 14 days (1) vs no enforcement (0)
- Class balance: 8.7% positive, 91.3% negative
- Missing data: 12% missing in linked-entity signals, 4% missing in payment metadata
Success Criteria
A solution is good enough if it:
- achieves PR-AUC >= 0.34 on the held-out test set,
- maintains recall >= 0.70 at an operating precision of at least 0.25,
- and provides a defensible explanation for choosing logistic regression over a tree-based model in this setting.
Constraints
- Scores are used in a reviewer-assist workflow, so interpretability and calibration matter.
- Batch scoring runs every 15 minutes on fresh account creations; p95 inference should be < 20 ms per 1K rows.
- The model is retrained weekly and must be stable enough for policy and ops teams to monitor coefficient or feature drift.
Deliverables
- Train a regularized logistic regression baseline and a tree-based benchmark.
- Compare them on PR-AUC, ROC-AUC, recall at fixed precision, calibration, and latency.
- Explain when logistic regression should be preferred for this Meta integrity problem.
- Describe preprocessing, feature engineering, threshold selection, and validation strategy.
- Recommend a production model and justify the tradeoff between accuracy, interpretability, and operational simplicity.