Abstract

Distinct HEP workflows have distinct I/O needs. ROOT I/O excels at serializing the complex C++ objects common to reconstruction, whereas analysis workflows typically use simpler objects and can sustain higher event rates. To meet the needs of these workflows, we have developed a “bulk I/O” interface that allows data from multiple events to be returned per library call. This reduces ROOT-related overheads and increases event rates; microbenchmarks show orders-of-magnitude improvements.
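The difference between event-wise and bulk access can be illustrated with a toy model. The sketch below is not ROOT's API: `ToyBranch`, `get_entry`, `get_bulk`, and the basket size are hypothetical names and values chosen for illustration. It only counts how many "library calls" each access pattern needs to read the same column of simple values.

```python
from array import array


class ToyBranch:
    """Hypothetical stand-in for a column of simple per-event values (not ROOT's API)."""

    def __init__(self, values, basket_size):
        # Store the column in fixed-size chunks ("baskets") of doubles.
        self._baskets = [
            array("d", values[i:i + basket_size])
            for i in range(0, len(values), basket_size)
        ]
        self._basket_size = basket_size
        self.calls = 0  # number of "library calls" made so far

    def get_entry(self, index):
        """Event-wise interface: one call returns one event's value."""
        self.calls += 1
        basket, offset = divmod(index, self._basket_size)
        return self._baskets[basket][offset]

    def get_bulk(self, first_index):
        """Bulk interface: one call returns all values from first_index
        to the end of the chunk that contains it."""
        self.calls += 1
        basket, offset = divmod(first_index, self._basket_size)
        return self._baskets[basket][offset:]


def sum_event_wise(branch, n_events):
    """Read the column one event per call."""
    total = 0.0
    for i in range(n_events):
        total += branch.get_entry(i)
    return total


def sum_bulk(branch, n_events):
    """Read the column one chunk per call."""
    total = 0.0
    i = 0
    while i < n_events:
        chunk = branch.get_bulk(i)[: n_events - i]  # do not read past n_events
        total += sum(chunk)
        i += len(chunk)
    return total
```

In this toy, reading 1000 events stored in chunks of 100 takes 1000 calls through the event-wise path and 10 through the bulk path. The per-call overhead that the real interface avoids is not modeled here.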

Unfortunately, this bulk interface is difficult to use: users must identify when it is applicable, and they still “think” in terms of events, not arrays of data. We have therefore integrated the bulk I/O interface into RDataFrame, the new analysis framework inside ROOT. Because RDataFrame’s interface can provide improved type information, the framework itself can determine which data is readable via bulk I/O and automatically switch between interfaces. We demonstrate how this can improve event rates when reading analysis data formats such as CMS’s NanoAOD.
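The idea of choosing an interface from type information can be sketched as a simple dispatch. The abstract does not state the criteria RDataFrame actually applies, so everything here is an assumption made for illustration: the set of bulk-readable types, the column names, and the function names are all hypothetical.

```python
# Assumption: only fixed-size primitive types qualify for the bulk path.
BULK_READABLE_TYPES = {"float", "double", "int32", "uint32", "int64", "uint64"}


def choose_reader(column_type):
    """Return "bulk" for simple fixed-size types, otherwise "event-wise"."""
    return "bulk" if column_type in BULK_READABLE_TYPES else "event-wise"


def plan_readers(schema):
    """Map each column name to the interface that would be used to read it."""
    return {name: choose_reader(col_type) for name, col_type in schema.items()}


# Hypothetical schema: two simple columns and one complex C++ object column.
example_schema = {
    "n_muons": "uint32",
    "met_pt": "float",
    "tracks": "std::vector<Track>",
}
example_plan = plan_readers(example_schema)
```

With this schema, the two primitive columns are routed to the bulk path and the object column falls back to event-wise reading, so the user never has to make the choice by hand.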

Details

Title
Speeding HEP Analysis with ROOT Bulk I/O
Author
Bockelman, B 1; Zhang, Z 2; Shadura, O 2

1 Morgridge Institute for Research, Madison, WI 53715, USA
2 Holland Computer Center, University Nebraska - Lincoln, Lincoln, NE 68588, USA
Publication year
2020
Publication date
Apr 2020
Publisher
IOP Publishing
ISSN
17426588
e-ISSN
17426596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2557255430
Copyright
© 2020. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.