Network Cost Matrix

Project name

Integration of network and transfer metrics to optimize experiment workflows

Project description

One of the common use cases reported by the experiments is enabling network-aware tools, driven mainly by the need to optimize transfers and/or experiment workflows. This involves providing a uniform way to access and integrate existing measurements, together with the ability to define a so-called “distance” metric between storage elements (and/or sites). Such a metric would combine a range of inputs, such as link status, utilization, functional tests, and occupancy, into a cost matrix that can be used to decide on job placement, find the closest replicas, determine the closest storage to which data can be uploaded, and so on.

The aim of this project is to contribute to the ongoing developments in this area and to develop a set of libraries and components that compute the cost matrix using different algorithms and different sets of network inputs.
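
As a rough illustration of the “distance” metric idea, the sketch below combines several per-link measurements into a single cost and uses the resulting matrix to pick the closest replica. It is not the project's actual method: the weighted average is only one possible algorithm, and the metric names, weights, site names, and values are all hypothetical. Each metric is assumed to be already normalized to the range 0 to 1, with higher values meaning a worse link.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]       # (source site, destination site)
Metrics = Dict[str, float]   # metric name -> normalized value in [0, 1], higher is worse


def link_cost(metrics: Metrics, weights: Dict[str, float]) -> float:
    """Weighted average of the metrics that are available for one link."""
    known = {name: value for name, value in metrics.items() if name in weights}
    total_weight = sum(weights[name] for name in known)
    if total_weight == 0:
        # No usable measurements: treat the link as unreachable.
        return float("inf")
    return sum(weights[name] * value for name, value in known.items()) / total_weight


def cost_matrix(measurements: Dict[Link, Metrics],
                weights: Dict[str, float]) -> Dict[Link, float]:
    """Compute one cost per measured (source, destination) pair."""
    return {link: link_cost(m, weights) for link, m in measurements.items()}


def closest_replica(matrix: Dict[Link, float],
                    destination: str,
                    replicas: List[str]) -> str:
    """Return the replica site with the lowest cost towards the destination."""
    return min(replicas, key=lambda src: matrix.get((src, destination), float("inf")))


# Hypothetical weights and measurements, for illustration only.
weights = {"latency": 0.5, "packet_loss": 0.3, "utilization": 0.2}
measurements = {
    ("SITE_A", "SITE_C"): {"latency": 0.2, "packet_loss": 0.0, "utilization": 0.5},
    ("SITE_B", "SITE_C"): {"latency": 0.6, "packet_loss": 0.1, "utilization": 0.1},
}

matrix = cost_matrix(measurements, weights)
best = closest_replica(matrix, "SITE_C", ["SITE_A", "SITE_B"])
```

Dividing by the sum of the weights actually used means a link with a missing measurement is still comparable to the others, which matters when not every metric is available for every pair of sites.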

 

Required skills

Machine Learning algorithms, ElasticSearch, Spark/Hadoop, ML in Spark

Learning experience

The student will acquire practical experience in data aggregation and time series prediction, and will get hands-on experience with very rich datasets such as network latencies, paths, and throughput.

Project duration

12 months

Project area

Data Analytics Monitoring of the distributed infrastructure

Contact for further details

Marian.Babik@cern.ch

References

https://twiki.cern.ch/twiki/bin/view/LCG/NetworkTransferMetrics

CERN group

IT/CM

Status

Ongoing

Submitted by mbabik on Tuesday, March 29, 2016 - 16:45.

Student info
Student name

Hendrik Borras

University

University of Heidelberg

CERN supervisor

Marian Babik

Thesis
Thesis type
Bachelor
Project started 18 Sep 2016
Defence status
not scheduled yet