Abstract

We present a model for the operation of computing nodes at a site using Virtual Machines (VMs), in which VMs are created and contextualized for experiments by the site itself. To the experiment, these VMs appear to be produced spontaneously "in the vacuum", rather than having to ask the site to create each one. This model takes advantage of the pilot job frameworks already adopted by many experiments. In the Vacuum model, the contextualization process starts a job agent within the VM, and real jobs are fetched from the central task queue as normal. We present Vac, an implementation of the Vacuum scheme in which a VM factory runs on each physical worker node to create and contextualize its set of VMs. With this system, each node's VM factory decides which experiments' VMs to run, based on site-wide target shares and on a peer-to-peer protocol in which the site's VM factories query each other to discover which VM types they are running. A consequence of this architecture is that there is no gatekeeper service, head node, or batch system accepting and then directing jobs to particular worker nodes, which avoids several central points of failure. Finally, we describe tests of the Vac system using jobs from the central LHCb task queue, with the same VM contextualization procedure that LHCb developed for Clouds.
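The per-node decision making described in the abstract can be illustrated with a minimal sketch: each factory compares the site-wide target shares against the mix of VM types currently running at the site, as reported by the other factories. The function name, data layout, and peer-query input below are illustrative assumptions, not the actual Vac implementation.

```python
# Minimal sketch of the per-node decision described in the abstract:
# each VM factory starts the experiment whose share of running VMs at the
# site falls furthest below its site-wide target share. Names and the
# peer-query mechanism here are assumptions, not the real Vac code.

from collections import Counter
from typing import Dict, List


def choose_vm_type(target_shares: Dict[str, float],
                   peer_vm_types: List[str]) -> str:
    """Return the experiment whose running VMs are most under-represented.

    target_shares -- site-wide target shares, e.g. {"lhcb": 0.7, "atlas": 0.3}
    peer_vm_types -- VM types reported by the site's factories, one entry per
                     running VM (gathered via the peer-to-peer queries the
                     paper describes).
    """
    running = Counter(peer_vm_types)
    total = sum(running.values())

    def deficit(experiment: str) -> float:
        # Fraction of running VMs belonging to this experiment ...
        current = running[experiment] / total if total else 0.0
        # ... compared with the fraction the site would like to run.
        return target_shares[experiment] - current

    # Start the VM type that is furthest below its target share.
    return max(target_shares, key=deficit)


if __name__ == "__main__":
    shares = {"lhcb": 0.7, "atlas": 0.3}
    peers = ["lhcb", "lhcb", "atlas", "atlas"]  # as reported by peer factories
    print(choose_vm_type(shares, peers))         # -> "lhcb"
```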

Details

Title
Running Jobs in the Vacuum
Author
McNab, A 1; Stagni, F 2; Ubeda Garcia, M 2

1 School of Physics and Astronomy, University of Manchester, UK
2 CERN, Switzerland
Publication year
2014
Publication date
Jun 2014
Publisher
IOP Publishing
ISSN
1742-6588
e-ISSN
1742-6596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2576665331
Copyright
© 2014. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.