Performance optimization in a High Throughput Computing environment

Project name

Performance optimization in a High Throughput Computing environment

Project description

Profiling computing resources with respect to WLCG experiment workloads is crucial for selecting the most effective resources and optimising their usage.
The CERN and WLCG monitoring infrastructures collect a wealth of data that is waiting to be turned into useful information. This data covers all areas of computing activity, such as (real and/or virtual) machine monitoring, storage, network, batch system performance, and experiment job monitoring.
The data gathered by those systems has great intrinsic value, but the information must be extracted and understood through a predictive data analytics process. The ultimate purpose of this process is to support decisions and to improve the efficiency and reliability of the related services.
For instance, with the adoption of remote data access, it becomes mandatory to understand the impact of this approach on job efficiency. Here the interplay of network and CPU effects, as well as the resource usage by multiple VOs, needs to be studied and understood. An interesting topic of study is the performance of job processing at the WLCG distributed T0 center, which is physically split between the computer centers in Meyrin and Wigner. The goal of the project will be to understand the difference in performance between the two and to suggest possible optimizations.
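
As a rough illustration of the kind of comparison described above, the sketch below computes job CPU efficiency (CPU time divided by wall-clock time) per computer center. It is only a minimal sketch under stated assumptions: the record layout, the field names (site, cpu_s, wall_s) and all numbers are hypothetical placeholders, not the schema or measurements of any real CERN or WLCG monitoring system.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical job records: field names and values are made up for
# illustration and do not come from any real monitoring source.
JOBS = [
    {"site": "Meyrin", "cpu_s": 3400.0, "wall_s": 3600.0},
    {"site": "Meyrin", "cpu_s": 3100.0, "wall_s": 3600.0},
    {"site": "Wigner", "cpu_s": 2900.0, "wall_s": 3600.0},
    {"site": "Wigner", "cpu_s": 3000.0, "wall_s": 3600.0},
]


def cpu_efficiency(job):
    """Return CPU time divided by wall-clock time for one job."""
    return job["cpu_s"] / job["wall_s"]


def efficiency_by_site(jobs):
    """Group jobs by site and summarise their CPU efficiency."""
    per_site = defaultdict(list)
    for job in jobs:
        # Skip records with no wall-clock time to avoid division by zero.
        if job["wall_s"] > 0:
            per_site[job["site"]].append(cpu_efficiency(job))
    return {
        site: {"n": len(values), "mean": mean(values), "median": median(values)}
        for site, values in per_site.items()
    }


summary = efficiency_by_site(JOBS)
for site, stats in sorted(summary.items()):
    print(f"{site}: n={stats['n']} mean={stats['mean']:.3f} median={stats['median']:.3f}")
```

With real monitoring data, the same grouping would more naturally be done with pandas, and a difference in the per-site averages would only be a starting point: job type, data locality and the load from other VOs would have to be controlled for before attributing it to the site itself.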

The work will be conducted in close contact with the experts (CERN analytics working group, system managers, developers) and will provide a deep insight into the computing infrastructure of a WLCG datacenter, its design, technical requirements and operational challenges.

Required skills

Python, matplotlib. Some experience in data analysis and statistics would be an advantage.

Learning experience

Using analytics approaches already well established in other domains, such as physics and finance, the candidate will learn and apply data mining techniques (trend analysis, result visualization, forecasting and predictive modeling) using cutting-edge tools from the Python analytics ecosystem (IPython, numpy, matplotlib, scipy, pandas, scikit-learn, etc.).

Project duration

6 to 12 months

Project area

Data Analytics

Contact for further details

julia.andreeva@cern.ch

CERN group

IT-SDC

Status

Submitted

Submitted by markusw on Friday, January 15, 2016 - 11:51.