Abstract

AsyncStageOut (ASO) is a new component of CRAB, the distributed data analysis system of CMS, designed for managing users' data. It addresses a major weakness of the previous model, namely that mass storage of output data was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate at the end of the jobs. ASO is designed to manage up to 400k files per day of various sizes, spread worldwide across more than 60 sites. It must handle up to 1000 individual users per month and work with minimal delay. This creates challenging requirements for system scalability, performance and monitoring. ASO uses FTS to schedule and execute the transfers between the storage elements of the source and destination sites. It has evolved from a limited prototype to a highly adaptable service that manages and monitors user file placement and bookkeeping. To ensure system scalability and data monitoring, it employs new technologies such as a NoSQL database and re-uses existing components of PhEDEx and the FTS Dashboard. We present the asynchronous stage-out strategy and the architecture of the solution we implemented to deal with these issues and challenges. The deployment model for the high availability and scalability of the service is discussed. The performance of the system during commissioning and the first phase of production is also shown, along with results from simulations designed to explore the limits of scalability.

Details

Title
AsyncStageOut: Distributed user data management for CMS Analysis
Author
Riahi, H 1; Wildish, T 2; Ciangottini, D 3; Hernández, J M 4; Andreeva, J 1; Balcas, J 5; Karavakis, E 1; Mascheroni, M 6; Tanasijczuk, A J 7; Vaandering, E W 8

1 European Organization for Nuclear Research, IT Department, CH-1211 Geneva 23, Switzerland
2 Princeton University, Princeton, NJ 08544, USA
3 Università and INFN Perugia, Via Alessandro Pascoli, 06123 Perugia, Italy
4 CIEMAT, Madrid 28040, Spain
5 DiSCC, Vilnius University, LT-01513 Vilnius, Lithuania
6 INFN Milano-Bicocca, Piazza della Scienza, 3 - I-20126 Milano, Italy
7 University of California, San Diego, La Jolla, CA 92093-0354, USA
8 Fermi National Laboratory, Batavia, IL 60510, USA
Publication year
2015
Publication date
Dec 2015
Publisher
IOP Publishing
ISSN
1742-6588
e-ISSN
1742-6596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2576533437
Copyright
© 2015. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.