Business Context
NetSecureOps manages 18,000 FortiGate firewall policies across enterprise customers. The operations team wants an ML model that classifies whether a policy is configured for flow-based or proxy-based inspection so they can audit misconfigurations, estimate performance impact, and prioritize policy reviews.
Dataset
You are given a historical configuration dataset extracted from FortiGate devices and enriched with policy metadata.
| Feature Group | Count | Examples |
|---|---|---|
| Policy settings | 14 | utm_status, ssl_inspection_profile, av_profile, webfilter_profile, ips_sensor |
| Traffic metadata | 8 | protocol_mix_tcp_pct, avg_session_duration_ms, avg_packet_size, app_control_enabled |
| Device/context | 6 | fortios_version, hardware_model, vdom_count, policy_type |
| Operational signals | 5 | cpu_utilization_pct, memory_utilization_pct, sessions_per_sec, latency_ms |
- Size: 92K firewall policies, 33 features
- Target: Binary label, inspection_mode = flow-based or proxy-based
- Class balance: 61% flow-based, 39% proxy-based
- Missing data: 9% missing in operational metrics, 4% missing in profile-related fields for legacy devices
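One possible way to treat the missing operational metrics is median imputation paired with a missingness flag, so the model can still learn from the fact that a value was absent (for example on legacy devices). The sketch below is illustrative only: the helper name `impute_with_indicator`, the `_missing` suffix, and the sample values are assumptions, not part of the dataset specification. In a real pipeline the median would be computed on the training split only.

```python
from statistics import median


def impute_with_indicator(rows, column):
    """Fill missing values (None) in `column` with the column median and
    add a `<column>_missing` flag that records where a value was absent.

    Hypothetical helper: the median is taken over the rows passed in, so
    call it with training rows when fitting a real pipeline.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    out = []
    for r in rows:
        new = dict(r)  # copy so the input rows are left untouched
        new[f"{column}_missing"] = int(r[column] is None)
        if r[column] is None:
            new[column] = fill
        out.append(new)
    return out


# Made-up sample rows using one of the operational signal columns.
rows = [
    {"cpu_utilization_pct": 40.0},
    {"cpu_utilization_pct": None},
    {"cpu_utilization_pct": 60.0},
    {"cpu_utilization_pct": 80.0},
]
imputed = impute_with_indicator(rows, "cpu_utilization_pct")
```

Keeping the flag as its own feature also supports the interpretability goal, since engineers can see whether missingness itself influences a prediction.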
Success Criteria
A good solution should achieve:
- F1 score >= 0.88 on the holdout set
- Recall >= 0.90 for proxy-based policies, since missing those is more costly for audit workflows
- Clear feature importance so network engineers can understand which configuration attributes drive predictions
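The numeric criteria above can be checked with a small class-specific metric routine. This is a minimal sketch that treats proxy-based as the positive class and assumes the F1 threshold refers to that class; the brief does not say which F1 variant is intended, so that choice is an assumption. The function names and the toy labels are illustrative.

```python
def class_metrics(y_true, y_pred, positive="proxy-based"):
    """Precision, recall and F1 for one class, computed from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


def meets_criteria(y_true, y_pred, min_f1=0.88, min_recall=0.90):
    """Assumption: both thresholds are applied to the proxy-based class."""
    m = class_metrics(y_true, y_pred, positive="proxy-based")
    return m["f1"] >= min_f1 and m["recall"] >= min_recall


# Toy holdout: 4 proxy-based and 6 flow-based policies, with one
# flow-based policy wrongly predicted as proxy-based.
y_true = ["proxy-based"] * 4 + ["flow-based"] * 6
y_pred = ["proxy-based"] * 5 + ["flow-based"] * 5
metrics = class_metrics(y_true, y_pred)
passed = meets_criteria(y_true, y_pred)
```

In this toy case every proxy-based policy is caught, so recall is 1.0 while precision drops to 0.8, which illustrates the trade-off the recall requirement accepts.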
Constraints
- Batch scoring must finish in under 5 minutes for 20K policies
- Predictions should be interpretable enough for compliance and network operations teams
- Retraining should be lightweight and feasible after monthly configuration exports
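The batch-scoring constraint translates into a throughput budget: 20,000 policies in 300 seconds is roughly 67 policies per second, or at most 15 ms per policy. The sketch below times a scoring function on a sample and linearly extrapolates to the full batch. The function names and the dummy scorer are hypothetical, and linear scaling is an assumption that may not hold for every model or I/O path.

```python
import time

BUDGET_SECONDS = 5 * 60   # "under 5 minutes"
BATCH_SIZE = 20_000       # "20K policies"


def required_throughput(batch_size=BATCH_SIZE, budget_s=BUDGET_SECONDS):
    """Minimum policies per second needed to stay inside the budget."""
    return batch_size / budget_s


def within_budget(score_fn, sample, batch_size=BATCH_SIZE,
                  budget_s=BUDGET_SECONDS):
    """Time `score_fn` on `sample` and project the full-batch runtime.

    Returns (fits, projected_seconds). Assumes runtime scales linearly
    with the number of policies.
    """
    start = time.perf_counter()
    score_fn(sample)
    elapsed = time.perf_counter() - start
    projected = elapsed * batch_size / len(sample)
    return projected < budget_s, projected


def dummy_score(rows):
    """Stand-in for a real model: labels every policy flow-based."""
    return ["flow-based" for _ in rows]


fits, projected_seconds = within_budget(dummy_score, list(range(1000)))
```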
Deliverables
- Build a binary classification pipeline for inspection mode prediction
- Explain the difference between flow-based and proxy-based inspection through the features and labels used in the model
- Handle categorical, numerical, and missing data appropriately
- Evaluate the model with class-specific metrics, not accuracy alone
- Recommend how the model would be deployed for monthly policy audits
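For the categorical fields, monthly exports may contain values the model never saw during training, such as a newer fortios_version. A minimal sketch of one-hot encoding that tolerates unseen and missing values follows; the helper names and the version strings are made up for illustration, and a production pipeline would more likely rely on an established encoder with equivalent unknown-value handling.

```python
def fit_one_hot(values):
    """Learn the category vocabulary from training values (None is skipped)."""
    return sorted({v for v in values if v is not None})


def one_hot(value, vocabulary):
    """Encode one value against the training vocabulary.

    Unseen or missing values map to an all-zero vector rather than
    raising an error, so scoring a new export does not fail.
    """
    return [int(value == category) for category in vocabulary]


# Hypothetical version strings, used only to exercise the encoder.
train_versions = ["7.0.12", "7.2.5", "7.0.12", None]
vocab = fit_one_hot(train_versions)

# Known value, value unseen in training, and a missing value.
encoded = [one_hot(v, vocab) for v in ["7.2.5", "7.4.1", None]]
```

Tracking how often the all-zero vector appears in each monthly export gives a simple signal for when retraining is due.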