Abstract

The field of computational proteomics is approaching the big data age, driven both by a continuous growth in the number of samples analysed per experiment, as well as by the growing amount of data obtained in each analytical run. In order to process these large amounts of data, it is increasingly necessary to use elastic compute resources such as Linux-based cluster environments and cloud infrastructures. Unfortunately, the vast majority of cross-platform proteomics tools are not able to operate directly on the proprietary formats generated by the diverse mass spectrometers. Here, we presented ThermoRawFileParser, an open-source, cross-platform tool that converts Thermo RAW files into open file formats such as MGF, and the HUPO-PSI standard file format mzML. To ensure the broadest possible availability, and to increase integration capabilities with popular workflow systems such as Galaxy or Nextflow, we have also built Conda and BioContainers containers around ThermoRawFileParser. In addition, we implemented a user-friendly interface (ThermoRawFileParserGUI) for those users not familiar with command-line tools. Finally, we performed a benchmark of ThermoRawFileParser and msconvert to verify that the converted mzML files contain reliable quantitative results.

Details

Title
ThermoRawFileParser: modular, scalable and cross-platform RAW file conversion
Author
Hulstaert, Niels; Sachsenberg, Timo; Walzer, Mathias; Barsnes, Harald; Martens, Lennart; Perez-Riverol, Yasset
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2019
Publication date
Apr 30, 2019
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2217183083
Copyright
© 2019. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.