Network Cost Matrix

Project name

Integration of network and transfer metrics to optimize experiment workflows

Project description

One of the common use cases reported by the experiments is enabling network-aware tools, driven mainly by the need to optimize transfers and/or experiment workflows. This involves providing a uniform way to access and integrate existing measurements, and the ability to define a so-called “distance” metric between storage elements (and/or sites). Such a metric would integrate a range of inputs, such as link status, utilization, functional tests and occupancy, and provide a cost matrix that can be used to decide on job placement, find the closest replicas, determine the closest storage to which data can be uploaded, and so on.

The aim of this project is to contribute to the ongoing developments in this area by developing a set of libraries and components that compute the cost matrix using different algorithms and different sets of network inputs.
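
As a rough illustration of the "distance" idea described above, the following minimal sketch combines several per-link measurements into a single cost per link and then picks the cheapest replica. Everything in it is an assumption made for illustration: the site names, the metric names, the sample values, the weights, and the choice of a min-max normalized weighted sum. It also assumes every metric covers the same set of links. It is not the algorithm used by the project, which is meant to explore several algorithms and input sets.

```python
from typing import Dict, Iterable, Tuple

Link = Tuple[str, str]  # (source site, destination site)


def normalize(values: Dict[Link, float], higher_is_worse: bool) -> Dict[Link, float]:
    """Min-max scale one metric to [0, 1], where 1 is the most costly link."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    if span == 0:
        # All links look identical for this metric, so it adds no cost.
        return {link: 0.0 for link in values}
    scaled = {}
    for link, value in values.items():
        s = (value - lo) / span
        scaled[link] = s if higher_is_worse else 1.0 - s
    return scaled


def cost_matrix(
    metrics: Dict[str, Dict[Link, float]],
    weights: Dict[str, float],
    higher_is_worse: Dict[str, bool],
) -> Dict[Link, float]:
    """Weighted sum of normalized metrics, stored sparsely per link."""
    total_weight = sum(weights[name] for name in metrics)
    costs: Dict[Link, float] = {}
    for name, values in metrics.items():
        for link, score in normalize(values, higher_is_worse[name]).items():
            costs[link] = costs.get(link, 0.0) + weights[name] * score / total_weight
    return costs


def closest_replica(costs: Dict[Link, float], destination: str,
                    replica_sites: Iterable[str]) -> str:
    """Return the replica site with the lowest cost towards the destination."""
    return min(replica_sites, key=lambda site: costs[(site, destination)])


# Hypothetical sample measurements (made-up values, not real data).
metrics = {
    "latency_ms": {("SITE_A", "SITE_D"): 20.0, ("SITE_B", "SITE_D"): 80.0,
                   ("SITE_C", "SITE_D"): 50.0},
    "throughput_mbps": {("SITE_A", "SITE_D"): 300.0, ("SITE_B", "SITE_D"): 900.0,
                        ("SITE_C", "SITE_D"): 600.0},
    "packet_loss": {("SITE_A", "SITE_D"): 0.0, ("SITE_B", "SITE_D"): 0.02,
                    ("SITE_C", "SITE_D"): 0.01},
}
# Hypothetical weights; choosing them well is part of the open problem.
weights = {"latency_ms": 0.5, "throughput_mbps": 0.3, "packet_loss": 0.2}
higher_is_worse = {"latency_ms": True, "throughput_mbps": False, "packet_loss": True}

costs = cost_matrix(metrics, weights, higher_is_worse)
best = closest_replica(costs, "SITE_D", ["SITE_A", "SITE_B", "SITE_C"])
print(best, costs)
```

A weighted sum is only the simplest way to fold several metrics into one number. The same `cost_matrix` interface could be backed by a different algorithm, for example one based on predicted rather than measured values, without changing how `closest_replica` consumes the result.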

 

Required skills

Machine learning algorithms, Elasticsearch, Spark/Hadoop, machine learning in Spark

Learning experience

The student will acquire practical experience in data aggregation and time-series prediction, and will get hands-on experience with very rich datasets such as network latencies, paths and throughput.

Project duration

12 months

Project area

Data Analytics
Monitoring of the distributed infrastructure

Contact for further details

Marian.Babik@cern.ch

CERN group

IT/CM

Status

Ongoing

Student Information

Student name: Hendrik Borras
University: University of Heidelberg
CERN supervisor: Marian Babik
Thesis type: Bachelor
Project started: 18 Sep 2016
Defence status: not scheduled yet

Reference to the project tracker
