OptiWave Networks operates a national DWDM transport backbone with 9,000+ optical spans and hundreds of ROADMs. The network operations team wants a model that predicts whether a wavelength service will experience a photonic-layer incident in the next 24 hours so they can prioritize preventive maintenance and rerouting.
You are given telemetry aggregated at the channel-hour level (one row per wavelength channel per hour) from DWDM shelves, ROADMs, amplifiers, and photonic controllers. The features fall into five groups:

| Feature Group | Count | Examples |
|---|---|---|
| Optical power and quality | 18 | tx_power_dbm, rx_power_dbm, osnr_db, q_margin_db, pre_fec_ber, post_fec_ber |
| ROADM path/configuration | 11 | hop_count, degree_count, add_drop_count, route_type, cdc_flex_enabled |
| Equipment and topology | 9 | vendor, amplifier_count, span_length_km, fiber_type, channel_frequency_ghz |
| Temporal and operational | 10 | hour_of_day, day_of_week, maintenance_window_flag, config_change_count_24h |
| Alarm history | 6 | los_alarm_count_24h, power_excursion_count_24h, amplifier_reset_count_7d |
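
The brief combines a 24-hour prediction horizon with channel-hour rows, so each row needs a binary label. The following is a minimal sketch of one way to derive it, not the dataset's actual labeling logic. It assumes each row covers the hour starting at `hour_start` and that the prediction window opens when that hour ends; the names `label_rows`, `channel_id`, and `hour_start`, and the incident log layout, are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

HORIZON = timedelta(hours=24)        # prediction horizon from the brief
ROW_LENGTH = timedelta(hours=1)      # assumption: one row spans one hour


def label_rows(
    rows: List[Tuple[str, datetime]],
    incidents: Dict[str, List[datetime]],
) -> List[int]:
    """Label each (channel_id, hour_start) row 1 if an incident on that
    channel starts within 24 hours after the observed hour ends, else 0."""
    labels = []
    for channel_id, hour_start in rows:
        window_start = hour_start + ROW_LENGTH   # end of the observed hour
        window_end = window_start + HORIZON
        hit = any(
            window_start <= t < window_end
            for t in incidents.get(channel_id, [])
        )
        labels.append(int(hit))
    return labels


# Toy data (hypothetical): one incident on channel "ch-1".
incidents = {"ch-1": [datetime(2024, 1, 2, 6, 30)]}
rows = [
    ("ch-1", datetime(2024, 1, 1, 5, 0)),   # window closes before the incident
    ("ch-1", datetime(2024, 1, 1, 6, 0)),   # incident falls inside the window
    ("ch-1", datetime(2024, 1, 2, 6, 0)),   # incident is in the observed hour
    ("ch-2", datetime(2024, 1, 1, 6, 0)),   # channel with no incidents
]
labels = label_rows(rows, incidents)
print(labels)  # [0, 1, 0, 0]
```

Starting the window after the observed hour keeps an incident that is already under way out of the positive class, which avoids rewarding the model for detecting rather than predicting.
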

A good solution should achieve recall >= 0.75 on incidents while maintaining precision >= 0.35 for the alert queue. The model should also provide feature-level explanations that help operations engineers understand likely root causes.
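
The recall and precision targets above imply a choice of alert threshold on the model's scores. The following is a minimal sketch of that selection, assuming the model outputs one score per channel-hour row and that an alert fires when the score is at or above the threshold. The function names and toy data are hypothetical, and picking the highest qualifying threshold (to keep the alert queue short) is one possible policy, not a requirement from the brief.

```python
from typing import Optional, Sequence, Tuple

MIN_RECALL = 0.75      # target from the brief
MIN_PRECISION = 0.35   # target from the brief


def precision_recall_at(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall on the incident class when alerting on
    score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(
    y_true: Sequence[int], scores: Sequence[float]
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for the highest threshold that
    meets both targets, or None if no threshold does."""
    for t in sorted(set(scores), reverse=True):
        p, r = precision_recall_at(y_true, scores, t)
        if r >= MIN_RECALL and p >= MIN_PRECISION:
            return t, p, r
    return None


# Toy validation set (hypothetical): 4 incidents among 10 rows.
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
result = pick_threshold(y_true, scores)
print(result)  # (0.6, 0.75, 0.75)
```

The threshold should be chosen on held-out validation data and then checked on a separate test period, since a threshold tuned and evaluated on the same rows will overstate both metrics.
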