Abstract

Metagenomic binning is an essential technique for genome-resolved characterization of uncultured microorganisms in various ecosystems, but it is hampered by the low efficiency of binning tools in adequately recovering metagenome-assembled genomes (MAGs). Here, we introduce BASALT (Binning Across a Series of Assemblies Toolkit) for binning and refinement of short- and long-read sequencing data. BASALT employs multiple binners with multiple thresholds to produce initial bins, then uses neural networks to identify core sequences, remove redundant bins, and refine non-redundant bins. Using the same assemblies generated from Critical Assessment of Metagenome Interpretation (CAMI) datasets, BASALT produces up to twice as many MAGs as VAMB, DASTool, or metaWRAP. Processing assemblies from a lake sediment dataset, BASALT produces ~30% more MAGs than metaWRAP, including 21 unique class-level prokaryotic lineages. Functional annotations reveal that BASALT can retrieve 47.6% more non-redundant open reading frames than metaWRAP. These results highlight BASALT's robust handling of metagenomic sequencing data.

Binning is an essential step in genome-resolved metagenomic analysis in which assembled contigs originating from the same source population are clustered. However, it is challenging, especially for low-abundance microbial species. Here, the authors introduce a toolkit that integrates multiple prominent binning tools and AI for efficient and high-resolution recovery of non-redundant bins from short- and long-read metagenomic sequencing datasets.

Details

Title
BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis
Author
Qiu, Zhiguang 1; Yuan, Li 2; Lian, Chun-Ang 1; Lin, Bin 3; Chen, Jie 2; Mu, Rong 4; Qiao, Xuejiao 4; Zhang, Liyu 4; Xu, Zheng 5; Fan, Lu 6; Zhang, Yunzeng 7; Wang, Shanquan 8; Li, Junyi 9; Cao, Huiluo 10; Li, Bing 11; Chen, Baowei 12; Song, Chi 13; Liu, Yongxin 14; Shi, Lili 15; Tian, Yonghong 2; Ni, Jinren 16; Zhang, Tong 17; Zhou, Jizhong 18; Zhuang, Wei-Qin 19; Yu, Ke 1

1  Peking University, Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319); Peking University, AI for Science (AI4S)-Preferred Program, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319) 
2  Peking University, AI for Science (AI4S)-Preferred Program, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319); Peking University, School of Electronic and Computer Engineering, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319); Peng Cheng Laboratory, Shenzhen, China (GRID:grid.508161.b) (ISNI:0000 0005 0389 1328) 
3  Peking University, School of Electronic and Computer Engineering, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319) 
4  Peking University, Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319) 
5  Southern University of Sciences and Technology Yantian Hospital, Shenzhen, China (GRID:grid.263817.9) (ISNI:0000 0004 1773 1790); Chinese Academy of Sciences, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Shenzhen, China (GRID:grid.9227.e) (ISNI:0000 0001 1957 3309) 
6  Southern University of Science and Technology (SUSTech), Department of Ocean Science and Engineering, Shenzhen, China (GRID:grid.263817.9) (ISNI:0000 0004 1773 1790) 
7  Yangzhou University, Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou, China (GRID:grid.268415.c) 
8  Sun Yat-Sen University, Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Guangzhou, China (GRID:grid.12981.33) (ISNI:0000 0001 2360 039X) 
9  Harbin Institute of Technology (Shenzhen), School of Computer Science and Technology, Shenzhen, China (GRID:grid.19373.3f) (ISNI:0000 0001 0193 3564) 
10  University of Hong Kong, Department of Microbiology, Hong Kong, China (GRID:grid.194645.b) (ISNI:0000 0001 2174 2757) 
11  Tsinghua University, Shenzhen International Graduate School, Shenzhen, China (GRID:grid.12527.33) (ISNI:0000 0001 0662 3178) 
12  Sun Yat-sen University, Guangdong Provincial Key Laboratory of Marine Resources and Coastal Engineering, School of Marine Sciences, Zhuhai, China (GRID:grid.12981.33) (ISNI:0000 0001 2360 039X) 
13  Chengdu University of Traditional Chinese Medicine, Institute of Herbgenomics, Chengdu, China (GRID:grid.411304.3) (ISNI:0000 0001 0376 205X); Wuhan Benagen Technology Co., Ltd, Wuhan, China (GRID:grid.411304.3) 
14  Chinese Academy of Agricultural Sciences, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Shenzhen, China (GRID:grid.410727.7) (ISNI:0000 0001 0526 1937) 
15  Peking University, AI for Science (AI4S)-Preferred Program, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319); Peking University Shenzhen Graduate School, State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319) 
16  Peking University, Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Shenzhen, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319); Peking University, College of Environmental Sciences and Engineering, Key Laboratory of Water and Sediment Sciences, Ministry of Education, Beijing, China (GRID:grid.11135.37) (ISNI:0000 0001 2256 9319) 
17  University of Hong Kong, Department of Civil Engineering, Hong Kong, China (GRID:grid.194645.b) (ISNI:0000 0001 2174 2757) 
18  University of Oklahoma, Institute for Environmental Genomics, Norman, USA (GRID:grid.266900.b) (ISNI:0000 0004 0447 0018) 
19  University of Auckland, Department of Civil and Environmental Engineering, Faculty of Engineering, Auckland, New Zealand (GRID:grid.9654.e) (ISNI:0000 0004 0372 3343) 
Pages
2179
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2955124437
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.