StreamSafe uses a binary classifier to score user-generated posts for policy risk and route the highest-risk content to a human safety queue. The current threshold was set six months ago, but policy operations now report that too many harmful posts are slipping through while reviewers are also near capacity.
Validation set size: 200,000 posts. Harmful content prevalence: 2.5% (5,000 posts).
| Threshold | Precision | Recall | F1 | FPR | Posts Routed/Day | Harmful Posts Caught/Day |
|---|---|---|---|---|---|---|
| 0.80 | 0.91 | 0.42 | 0.57 | 0.2% | 1,150 | 525 |
| 0.65 | 0.78 | 0.61 | 0.68 | 0.5% | 2,050 | 763 |
| 0.50 (current) | 0.64 | 0.76 | 0.69 | 1.1% | 3,400 | 950 |
| 0.35 | 0.46 | 0.88 | 0.60 | 2.8% | 6,100 | 1,100 |
| 0.20 | 0.29 | 0.95 | 0.44 | 6.4% | 10,900 | 1,188 |
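
The F1 column and the daily harmful volume behind the last column can be recomputed from the other columns as a quick sanity check. The sketch below is illustrative only: the function names `f1_score` and `implied_daily_harmful` are hypothetical, the rows are copied from the table, and it assumes that "Harmful Posts Caught/Day" equals recall times the number of harmful posts arriving per day.

```python
# Rows copied from the table above:
# (threshold, precision, recall, reported F1, harmful posts caught per day)
ROWS = [
    (0.80, 0.91, 0.42, 0.57, 525),
    (0.65, 0.78, 0.61, 0.68, 763),
    (0.50, 0.64, 0.76, 0.69, 950),
    (0.35, 0.46, 0.88, 0.60, 1100),
    (0.20, 0.29, 0.95, 0.44, 1188),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_daily_harmful(caught_per_day: float, recall: float) -> float:
    """Harmful posts arriving per day, ASSUMING caught = recall * arriving."""
    return caught_per_day / recall


for threshold, precision, recall, f1_reported, caught in ROWS:
    f1 = f1_score(precision, recall)
    harmful = implied_daily_harmful(caught, recall)
    print(
        f"threshold={threshold:.2f}  F1 recomputed={f1:.2f} "
        f"(reported {f1_reported:.2f})  implied harmful/day={harmful:.0f}"
    )
```

Under that assumption, every row implies roughly 1,250 harmful posts arriving per day, and the recomputed F1 values match the reported column to two decimal places.
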
Leadership wants a threshold recommendation for routing high-risk content. Missing truly harmful content carries regulatory and brand risk, but false positives consume reviewer bandwidth and delay benign posts.
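
One way to frame this trade-off is to fix a daily review capacity and pick the threshold with the highest recall that still fits within it. The sketch below is a minimal illustration, not a recommendation: `daily_capacity` is a hypothetical input (the brief gives no reviewer capacity figure), the names `Row`, `missed_per_day`, `benign_routed_per_day`, and `pick_threshold` are invented for this example, and the missed-post estimate assumes that caught posts equal recall times the harmful posts arriving per day.

```python
from typing import NamedTuple, Optional, Sequence


class Row(NamedTuple):
    threshold: float
    recall: float
    routed_per_day: int
    caught_per_day: int


# Values copied from the threshold table.
TABLE = [
    Row(0.80, 0.42, 1150, 525),
    Row(0.65, 0.61, 2050, 763),
    Row(0.50, 0.76, 3400, 950),
    Row(0.35, 0.88, 6100, 1100),
    Row(0.20, 0.95, 10900, 1188),
]


def missed_per_day(row: Row) -> float:
    """Harmful posts not routed per day, ASSUMING caught = recall * arriving."""
    return row.caught_per_day / row.recall - row.caught_per_day


def benign_routed_per_day(row: Row) -> int:
    """Routed posts per day that are not counted as harmful posts caught."""
    return row.routed_per_day - row.caught_per_day


def pick_threshold(rows: Sequence[Row], daily_capacity: int) -> Optional[Row]:
    """Highest-recall row whose routed volume fits the (hypothetical) capacity."""
    feasible = [r for r in rows if r.routed_per_day <= daily_capacity]
    if not feasible:
        return None  # no threshold in the table fits the capacity
    return max(feasible, key=lambda r: r.recall)


if __name__ == "__main__":
    choice = pick_threshold(TABLE, daily_capacity=4000)  # hypothetical capacity
    if choice is not None:
        print(
            f"threshold={choice.threshold:.2f}, "
            f"missed/day={missed_per_day(choice):.0f}, "
            f"benign routed/day={benign_routed_per_day(choice)}"
        )
```

With a hypothetical capacity of 4,000 routed posts per day, this selects the current 0.50 threshold. Raising the capacity to 6,100 would select 0.35 instead, which shows how directly the recommendation depends on the capacity figure leadership supplies.
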