Abstract

Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and showed that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.

Long-read metagenomics offers a valuable approach for profiling bacterial communities. This work presents a long-read assembler, metaFlye, that specifically addresses the challenges of assembling metagenomes.

Details

Title
metaFlye: scalable long-read metagenome assembly using repeat graphs
Author
Kolmogorov, Mikhail 1; Bickhart, Derek M 2; Behsaz, Bahar 3; Gurevich, Alexey 4; Rayko, Mikhail 4; Shin, Sung Bong 5; Kuhn, Kristen 5; Yuan, Jeffrey 3; Polevikov, Evgeny 6; Smith, Timothy P L 5; Pevzner, Pavel A 7

1 University of California, Department of Computer Science and Engineering, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
2 Dairy Forage Research Center, USDA, Cell Wall Biology and Utilization Laboratory, Madison, USA (GRID:grid.417548.b) (ISNI:0000 0004 0478 6311)
3 University of California, Graduate Program in Bioinformatics and System Biology, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
4 St. Petersburg State University, Center for Algorithmic Biotechnology, St. Petersburg, Russia (GRID:grid.15447.33) (ISNI:0000 0001 2289 6897)
5 USDA-ARS US Meat Animal Research Center, Clay Center, USA (GRID:grid.463419.d) (ISNI:0000 0001 0946 3608)
6 St. Petersburg State University, Center for Algorithmic Biotechnology, St. Petersburg, Russia (GRID:grid.15447.33) (ISNI:0000 0001 2289 6897); Bioinformatics Institute, St. Petersburg, Russia (GRID:grid.15447.33)
7 University of California, Department of Computer Science and Engineering, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California, Center for Microbiome Innovation, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
Pages
1103-1110
Publication year
2020
Publication date
Nov 2020
Publisher
Nature Publishing Group
ISSN
1548-7091
e-ISSN
1548-7105
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2471540821
Copyright
© The Author(s), under exclusive licence to Springer Nature America, Inc. 2020.