QA in distributed cloud architecture: injection-fault testing

Project name

QA in distributed cloud architecture: injection-fault testing

Project description

Clients of the sync&share system (CERNBOX) are particularly exposed to "operational failures" because of the heterogeneity of hardware, operating systems and network environments.

The sync&share system operates in a very heterogeneous network environment: from the fast, reliable network inside the computing center to unreliable, high-latency ad-hoc connections such as those at airports.
Windows filesystems have substantially different semantics (e.g. locking) from Unix filesystems, and these differences affect the synchronization process.
The goal of the R&D is to analyze the environment and identify the relevant classes of failures, in order to provide a reproducible framework for injecting faults at the system level when testing client-server data transmission.
Examples:
* network slowdown or packet loss
* local disk failure
* checksum errors
* failed software upgrades
The work is supported by real monitoring and logging data: failure patterns in an existing service (CERNBOX).
The work extends an existing testing framework (smashbox).
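
To make the idea of reproducible fault injection concrete, here is a minimal Python sketch. It is not taken from smashbox and does not use its API: the names `FaultInjector`, `InjectedFault` and `transfer` are hypothetical, and the "network" is simulated in-process. It shows only how a seeded random generator can replay the same sequence of slowdowns, losses and corruptions on every run, and how a per-chunk checksum lets the receiving side detect corrupted data and trigger a retry.

```python
import hashlib
import random
import time


class InjectedFault(Exception):
    """Raised when the injector simulates a lost transfer (hypothetical name)."""


class FaultInjector:
    """Simulated transport that injects faults in a reproducible order."""

    def __init__(self, seed, loss_rate=0.0, corrupt_rate=0.0, delay_s=0.0):
        # A fixed seed yields the same fault sequence on every run.
        self.rng = random.Random(seed)
        self.loss_rate = loss_rate
        self.corrupt_rate = corrupt_rate
        self.delay_s = delay_s

    def send(self, chunk):
        if self.delay_s:
            time.sleep(self.delay_s)  # simulated network slowdown
        roll = self.rng.random()
        if roll < self.loss_rate:
            raise InjectedFault("simulated packet loss")
        if chunk and roll < self.loss_rate + self.corrupt_rate:
            # Flip one bit of the first byte to simulate corruption.
            return bytes([chunk[0] ^ 0x01]) + chunk[1:]
        return chunk


def transfer(data, injector, chunk_size=4, max_retries=10):
    """Send data chunk by chunk, verifying a SHA-256 checksum per chunk."""
    received = bytearray()
    stats = {"lost": 0, "corrupted": 0}
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        expected = hashlib.sha256(chunk).hexdigest()
        for _ in range(max_retries):
            try:
                got = injector.send(chunk)
            except InjectedFault:
                stats["lost"] += 1
                continue  # retry the lost chunk
            if hashlib.sha256(got).hexdigest() != expected:
                stats["corrupted"] += 1
                continue  # checksum mismatch: retry
            received.extend(got)
            break
        else:
            raise RuntimeError(
                "chunk at offset %d failed after %d attempts" % (start, max_retries)
            )
    return bytes(received), stats


if __name__ == "__main__":
    injector = FaultInjector(seed=42, loss_rate=0.3, corrupt_rate=0.2)
    result, stats = transfer(b"example payload for a sync test", injector, max_retries=50)
    print(result, stats)
```

Because the injector is seeded, a failing run can be replayed exactly by reusing the same seed and rates. A real framework would inject the faults at the system level (network, disk, upgrade process) rather than in-process, as the project description states.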

Required skills

The required technical competency is knowledge of the Python language. Knowledge of JavaScript and of tools like Dropbox, OwnCloud and Unison would be an important asset.

Learning experience

Large-scale testing in a highly non-homogeneous environment (thousands of concurrent clients, 10% mobile clients on iOS and Android, plus Mac, Linux and Windows sync clients).

Project duration

6 months

Project area

Data Management

Contact for further details

massimo.lamanna@cern.ch

References

https://github.com/cernbox/smashbox

CERN group

CERN IT-DSS

Status

Submitted
Submitted by Catharine Noble on Friday, January 15, 2016 - 11:46.
Thesis
Project started 15 Jan 2016
Project finished 15 Jan 2016
Defence date
2016-01-15