Using data analytics for WLCG data transfer optimization

Project name

Using data analytics for WLCG data transfer optimization

Project description

The overall success of LHC data processing depends heavily on stable, reliable and fast data distribution, which is performed by the WLCG File Transfer Service (FTS). FTS transfers around 15 PB of data each month, corresponding to millions of files per day. The efficient functioning of this service is crucial for the successful exploitation of LHC data. The large scale of the transfer activity and the shared nature of the LHC computing infrastructure, which is used by several virtual organizations, make this a challenge for FTS.
The project proposes to explore the FTS historical monitoring data with the aim of improving service efficiency. The analysis should cover all kinds of transfer routes, protocols, and experiments’ data transfer workflows under various FTS configurations. The goal of the project is to help the FTS3 infrastructure sustain higher traffic while optimizing resource usage and reducing data transfer latencies. This includes creating a data analytics platform for FTS performance analysis and prediction.
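
As a rough illustration of the kind of per-route analysis described above, the following minimal Python sketch groups transfer records by source, destination and protocol, and computes a failure rate and a median throughput for each route. The record layout, field names, site names and values are hypothetical, chosen only for illustration; they are not the actual FTS monitoring schema or real measurements.

```python
from collections import defaultdict
from statistics import median

# Hypothetical transfer records. Field names, sites and values are
# illustrative assumptions, not the real FTS monitoring schema.
transfers = [
    {"src": "SITE_A", "dst": "SITE_B", "protocol": "gsiftp",
     "bytes": 4_000_000_000, "seconds": 80.0, "ok": True},
    {"src": "SITE_A", "dst": "SITE_B", "protocol": "gsiftp",
     "bytes": 2_000_000_000, "seconds": 20.0, "ok": True},
    {"src": "SITE_A", "dst": "SITE_B", "protocol": "gsiftp",
     "bytes": 0, "seconds": 5.0, "ok": False},
    {"src": "SITE_A", "dst": "SITE_C", "protocol": "https",
     "bytes": 1_000_000_000, "seconds": 50.0, "ok": True},
]


def summarize_routes(records):
    """Group transfers by (source, destination, protocol) and compute
    the transfer count, failure rate and median throughput per route."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["src"], rec["dst"], rec["protocol"])].append(rec)

    summary = {}
    for route, rows in groups.items():
        succeeded = [r for r in rows if r["ok"]]
        # Throughput in MB/s, computed over successful transfers only.
        rates = [r["bytes"] / r["seconds"] / 1e6
                 for r in succeeded if r["seconds"] > 0]
        summary[route] = {
            "transfers": len(rows),
            "failure_rate": 1 - len(succeeded) / len(rows),
            "median_mb_per_s": median(rates) if rates else None,
        }
    return summary


summary = summarize_routes(transfers)
for route, stats in sorted(summary.items()):
    print(route, stats)
```

In practice the records would presumably come from the historical monitoring store (for example through an SQL query) rather than from an in-memory list, and the same grouping could be extended with further keys such as virtual organization or FTS configuration.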

Required skills

Some experience with Python and SQL, and basic knowledge of the TCP/IP protocol, would be an advantage.

Learning experience

The project offers an opportunity to contribute to the evolution of the WLCG data transfer service by taking part in the design

Project duration

9 months

Project area

Data Management

Contact for further details

oliver.keeble@cern.ch

References

CERN group

IT-SDC

Status

Submitted

Reference to the project tracker

You are here