Advanced Notifications for Network Incidents

Project name

Advanced Notifications for Network Incidents (ANNI)

Project description

One of the main challenges in LHCOPN/LHCONE networking is diagnosing network problems and issuing advanced notifications about them. Currently, most issues are visible only to the applications and have to be debugged after the incident, once performance has already degraded. This is primarily due to the complexity of the multi-domain WLCG network and the difficulty of understanding the state of the network and how it changes over time. This project aims to use current open-source event processing systems (such as Spark/Hadoop) to automate the detection and localization of network problems using the existing measurement streams. The project will be carried out in collaboration with the NSF-funded PUNDIT project.

The project will build on the standard WLCG perfSONAR network measurement infrastructure. It will gather and analyze complex real-world network topologies and their corresponding network metrics to identify possible signatures of network problems. It will also provide the experiments' operations teams with a real-time view of the diagnosed issues, together with a list of current downtimes from the network providers.
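To illustrate what a "signature" of a network problem could look like, here is a minimal sketch of a per-path latency anomaly check. It is only an assumption about one possible approach, not the method used by the project or by PUNDIT. It is written in plain Python rather than Spark to stay self-contained. The class name `LatencyAnomalyDetector`, its parameters, and the site names are all hypothetical, and the sample values are made up for illustration.

```python
from collections import defaultdict, deque
from statistics import median


class LatencyAnomalyDetector:
    """Flags latency samples that deviate strongly from a per-path rolling baseline.

    Hypothetical helper for illustration; not part of perfSONAR or PUNDIT.
    """

    def __init__(self, window=20, min_samples=5, threshold=5.0):
        self.min_samples = min_samples  # samples needed before judging a path
        self.threshold = threshold      # allowed deviation, in units of MAD
        # One bounded history of recent "normal" samples per (src, dst) path.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, src, dst, latency_ms):
        """Return True if the sample looks anomalous for the (src, dst) path."""
        past = self.history[(src, dst)]
        anomalous = False
        if len(past) >= self.min_samples:
            baseline = median(past)
            # Median absolute deviation; the floor avoids a zero spread
            # on a perfectly flat series.
            mad = max(median(abs(x - baseline) for x in past), 0.1)
            anomalous = abs(latency_ms - baseline) > self.threshold * mad
        # Only normal samples update the baseline, so an ongoing incident
        # is not absorbed into it.
        if not anomalous:
            past.append(latency_ms)
        return anomalous


detector = LatencyAnomalyDetector()
samples = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 85.0, 20.1]  # illustrative values
flags = [detector.observe("site-a", "site-b", s) for s in samples]
```

With these illustrative values, only the 85.0 ms sample is flagged. In a streaming system the same per-path state could be kept by the event processing framework, and a real detector would also need to correlate flagged paths with topology data to locate the faulty segment.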

 

Required skills

TCP/IP networking, Java/Scala experience (Spark/Hadoop)

Learning experience

The student will acquire practical experience in the design and development of an advanced notification platform based on network latencies, paths and throughput.

Project duration

12 months

Project area

Monitoring of the distributed infrastructure

Contact for further details

Marian.Babik@cern.ch

CERN group

IT/CM

Status

Submitted
