Network Cost Matrix
Project name
Integration of network and transfer metrics to optimize experiment workflows
Project description
One of the common use cases reported by the experiments is enabling network-aware tools, driven mainly by the need to optimize transfers and experiment workflows. This involves providing a uniform way to access and integrate existing measurements, together with the ability to define a so-called “distance” metric between storage elements (and/or sites). Such a metric would combine a range of inputs, such as link status, utilization, functional tests and occupancy, into a cost matrix that can be used to decide on job placement, find the closest replicas, determine the closest storage to which data can be uploaded, etc.
The aim of this project is to contribute to the ongoing developments in this area and to develop a set of libraries and components that compute the cost matrix using different algorithms and different sets of network inputs.
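As an illustration of the idea only, the following minimal sketch combines two network metrics into a cost matrix using a weighted sum of min-max normalized values. The site names, metric names, sample values and weights are all hypothetical, and the weighted sum is just one simple choice of algorithm, not the method used by the project. Every metric is assumed to be oriented so that a larger value means a worse link.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source site, destination site)


def normalize(values: Dict[Pair, float]) -> Dict[Pair, float]:
    """Min-max scale a metric to [0, 1]; a constant metric maps to 0."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in values.items()}


def cost_matrix(
    sites: List[str],
    metrics: Dict[str, Dict[Pair, float]],
    weights: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Weighted sum of normalized metrics for every ordered site pair.

    Assumes each metric is "higher is worse" and covers all ordered pairs.
    """
    normed = {name: normalize(vals) for name, vals in metrics.items()}
    total_weight = sum(weights.values())
    matrix: Dict[str, Dict[str, float]] = {}
    for src in sites:
        matrix[src] = {}
        for dst in sites:
            if src == dst:
                matrix[src][dst] = 0.0  # a site has zero cost to itself
                continue
            weighted = sum(w * normed[name][(src, dst)] for name, w in weights.items())
            matrix[src][dst] = weighted / total_weight
    return matrix


def closest(matrix: Dict[str, Dict[str, float]], src: str, candidates: List[str]) -> str:
    """Return the candidate site with the lowest cost from src."""
    return min(candidates, key=lambda dst: matrix[src][dst])


# Hypothetical sample data for three sites.
sites = ["SITE_A", "SITE_B", "SITE_C"]
metrics = {
    "latency_ms": {
        ("SITE_A", "SITE_B"): 10.0, ("SITE_A", "SITE_C"): 110.0,
        ("SITE_B", "SITE_A"): 12.0, ("SITE_B", "SITE_C"): 60.0,
        ("SITE_C", "SITE_A"): 108.0, ("SITE_C", "SITE_B"): 58.0,
    },
    "packet_loss": {
        ("SITE_A", "SITE_B"): 0.00, ("SITE_A", "SITE_C"): 0.02,
        ("SITE_B", "SITE_A"): 0.00, ("SITE_B", "SITE_C"): 0.01,
        ("SITE_C", "SITE_A"): 0.02, ("SITE_C", "SITE_B"): 0.01,
    },
}
weights = {"latency_ms": 0.5, "packet_loss": 0.5}  # hypothetical weights

matrix = cost_matrix(sites, metrics, weights)
nearest_to_a = closest(matrix, "SITE_A", ["SITE_B", "SITE_C"])
```

A metric where larger is better, such as throughput, would need to be inverted before being passed in. Keeping the weights as a separate input makes it straightforward to compare different weightings, or to replace the weighted sum with a learned model, without changing how the matrix is consumed.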
Required skills
Machine Learning algorithms, ElasticSearch, Spark/Hadoop, ML in Spark
Learning experience
The student will acquire practical experience in data aggregation and time series prediction, and will get hands-on experience with very rich datasets such as network latencies, paths and throughput.
Project duration
12 months
Project area
Data Analytics, Monitoring of the distributed infrastructure
Contact for further details
Marian.Babik@cern.ch
References
https://twiki.cern.ch/twiki/bin/view/LCG/NetworkTransferMetrics
CERN group
IT/CM
Status
Ongoing
Submitted by mbabik on Tuesday, March 29, 2016 - 16:45.
Hendrik Borras
University of Heidelberg
Marian Babik