Abstract

Summary

Properly and effectively managing reference datasets is an important task for many bioinformatics analyses. Refgenie is a reference asset management system that allows users to easily organize, retrieve and share such datasets. Here, we describe the integration of refgenie into the Galaxy platform. Server administrators are able to configure Galaxy to make use of reference datasets made available on a refgenie instance. In addition, a Galaxy Data Manager tool has been developed to provide a graphical interface to refgenie’s remote reference retrieval functionality. A large collection of reference datasets has also been made available using the CVMFS (CernVM File System) repository from GalaxyProject.org, with mirrors across the USA, Canada, Europe and Australia, enabling easy use outside of Galaxy.

Availability and implementation

The ability of Galaxy to use refgenie assets was added to the core Galaxy framework in version 22.01, which is available from https://github.com/galaxyproject/galaxy under the Academic Free License version 3.0. The refgenie Data Manager tool can be installed via the Galaxy ToolShed, with source code managed at https://github.com/BlankenbergLab/galaxy-tools-blankenberg/tree/main/data_managers/data_manager_refgenie_pull and released using an MIT license. Access to existing data is also available through CVMFS, with instructions at https://galaxyproject.org/admin/reference-data-repo/. No new data were generated or analyzed in support of this research.
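As a rough illustration of how refgenie assets are addressed when they are retrieved, the sketch below splits an asset path of the form `genome/asset.seek_key:tag` into its parts. It assumes refgenie's registry-path convention. The helper `parse_asset_path`, the `AssetPath` type and the example values (`hg38`, `bowtie2_index`, `fasta.fai`) are hypothetical and are not part of refgenie, Galaxy or the Data Manager tool.

```python
import re
from typing import NamedTuple, Optional


class AssetPath(NamedTuple):
    """Components of a refgenie-style asset path (hypothetical helper type)."""
    genome: str
    asset: str
    seek_key: Optional[str]
    tag: Optional[str]


# Assumed layout: genome/asset, optionally followed by .seek_key and/or :tag
_PATTERN = re.compile(
    r"^(?P<genome>[^/:\s]+)/(?P<asset>[^/.:\s]+)"
    r"(?:\.(?P<seek_key>[^/.:\s]+))?"
    r"(?::(?P<tag>[^/:\s]+))?$"
)


def parse_asset_path(path: str) -> AssetPath:
    """Split 'genome/asset[.seek_key][:tag]' into named components."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a genome/asset[.seek_key][:tag] path: {path!r}")
    # Optional parts that are absent come back as None
    return AssetPath(**match.groupdict())
```

A call such as `parse_asset_path("hg38/bowtie2_index:default")` yields the genome, asset and tag as separate fields, with `seek_key` left as `None`. A wrapper script could then pass these fields to whichever retrieval mechanism is in use.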

Details

Title
Expanding the Galaxy’s reference data
Author
VijayKrishna, Nagampalli 1 ; Joshi, Jayadev 1 ; Coraor, Nate 2 ; Hillman-Jackson, Jennifer 2 ; Bouvier, Dave 2 ; van den Beek, Marius 2 ; Eguinoa, Ignacio 3 ; Coppens, Frederik 3 ; Davis, John 4 ; Stolarczyk, Michał 5 ; Sheffield, Nathan C 5 ; Gladman, Simon 6 ; Cuccuru, Gianmauro 7 ; Grüning, Björn 7 ; Soranzo, Nicola 8 ; Rasche, Helena 9 ; Langhorst, Bradley W 10 ; Bernt, Matthias 11 ; Fornika, Dan 12 ; David Anderson de Lima Morais 13 ; Barrette, Michel 13 ; Peter van Heusden 14 ; Petrillo, Mauro 15 ; Puertas-Gallardo, Antonio 15 ; Patak, Alex 15 ; Hotz, Hans-Rudolf 16 ; Blankenberg, Daniel 17 

1  Genomic Medicine Institute, Cleveland Clinic, Cleveland, OH 44195, USA
2  Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA 16802, USA
3  VIB Center for Plant Systems Biology, 9052 Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
4  Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
5  Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22903, USA
6  University of Melbourne, Melbourne, VIC, Australia
7  University of Freiburg, Freiburg im Breisgau, Germany
8  Earlham Institute, Norwich Research Park, Norwich, UK
9  Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN Rotterdam, The Netherlands
10  New England Biolabs, Ipswich, MA 01938, USA 
11  Department Computational Biology, Helmholtz Centre for Environmental Research, UFZ, 04318 Leipzig, Germany 
12  BC Centre for Disease Control Public Health Laboratory, Vancouver, BC, Canada 
13  Centre de Calcul Scientifique, Université de Sherbrooke, Sherbrooke, QC, Canada 
14  South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa 
15  European Commission, Joint Research Centre (JRC), Ispra, Italy 
16  Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland 
17  Genomic Medicine Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA 
Publication year
2022
Publication date
Jan 2022
Publisher
Oxford University Press
e-ISSN
2635-0041
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3192246654
Copyright
© The Author(s) 2022. Published by Oxford University Press. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.