Abstract

The Fusion Science Demonstrator in the European Open Science Cloud for Research Pilot Project aimed to demonstrate that the fusion community can make use of distributed cloud resources. We developed a platform, Prominence, which enables users to transparently exploit idle cloud resources for running scientific workloads. In addition to standard HTC jobs, HPC jobs such as multi-node MPI are supported. All jobs are run in containers to ensure that they run reliably anywhere and are reproducible. Cloud infrastructure is invisible to users, as all provisioning, including extensive failure handling, is completely automated. On-premises cloud resources can be utilised and, at times of peak demand, jobs can burst onto external clouds. In addition to the traditional “cloud-bursting” onto a single cloud, Prominence allows for bursting across many clouds in a hierarchical manner. Job requirements are taken into account, so jobs with special requirements, e.g. high memory or access to GPUs, are sent only to appropriate clouds. Here we describe Prominence and its architecture, discuss the challenges of using many clouds opportunistically, and report on our experiences with several fusion use cases.
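
The requirement-aware, hierarchical bursting described in the abstract can be illustrated with a minimal sketch. This is not Prominence's implementation or API: the `Cloud` and `Job` classes, the `tier` field, and the `select_cloud` function are hypothetical names introduced here, and the resource figures are invented for illustration. The sketch assumes that clouds are tried in tier order (on-premises first) and that a job is placed on the first cloud that meets its requirements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cloud:
    """Hypothetical description of one cloud and its currently idle capacity."""
    name: str
    tier: int            # assumption: 0 = on-premises, higher tiers are tried later
    free_cpus: int
    free_memory_gb: int
    has_gpu: bool = False


@dataclass(frozen=True)
class Job:
    """Hypothetical job requirements."""
    cpus: int
    memory_gb: int
    needs_gpu: bool = False


def satisfies(cloud: Cloud, job: Job) -> bool:
    """Return True if the cloud has enough idle resources for the job."""
    return (
        cloud.free_cpus >= job.cpus
        and cloud.free_memory_gb >= job.memory_gb
        and (cloud.has_gpu or not job.needs_gpu)
    )


def select_cloud(job: Job, clouds: List[Cloud]) -> Optional[Cloud]:
    """Pick the lowest-tier cloud that meets the job's requirements.

    Returns None when no cloud is suitable, in which case the job would
    simply stay queued.
    """
    for cloud in sorted(clouds, key=lambda c: c.tier):
        if satisfies(cloud, job):
            return cloud
    return None


# Invented example hierarchy: on-premises, then academic, then public.
clouds = [
    Cloud("on-prem", tier=0, free_cpus=8, free_memory_gb=32),
    Cloud("academic", tier=1, free_cpus=64, free_memory_gb=512),
    Cloud("public-gpu", tier=2, free_cpus=32, free_memory_gb=128, has_gpu=True),
]

small = select_cloud(Job(cpus=4, memory_gb=16), clouds)
high_memory = select_cloud(Job(cpus=16, memory_gb=256), clouds)
gpu = select_cloud(Job(cpus=8, memory_gb=32, needs_gpu=True), clouds)
```

In this sketch a small job stays on the on-premises cloud, a high-memory job bursts to the next tier that can hold it, and a GPU job skips every cloud without GPUs, which mirrors the behaviour the abstract describes without claiming anything about how Prominence achieves it.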

Details

Title
Running HTC and HPC applications opportunistically across private, academic and public clouds
Author
Lahiff, Andrew; de Witt, Shaun; Caballer, Miguel; La Rocca, Giuseppe; Pamela, Stanislas; Coster, David
Section
7 - Facilities, Clouds and Containers
Publication year
2020
Publication date
2020
Publisher
EDP Sciences
ISSN
2101-6275
e-ISSN
2100-014X
Source type
Conference Paper
Language of publication
English
ProQuest document ID
2465740711
Copyright
© 2020. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and conditions, you may use this content in accordance with the terms of the License.