© 2025. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Whole-genome shotgun (WGS) metagenomic sequencing of microbial communities enables the discovery of the functions, physiologies, and evolutionary histories of prokaryotic and eukaryotic microbes. However, metagenomic studies of microbial eukaryotes lag due to challenges in identifying and assembling high-quality genomes from WGS data. To address this problem, we developed Eukfinder, a bioinformatics pipeline that identifies potential eukaryotic sequences from WGS metagenomic data, with a complementary binning workflow for recovering nuclear and mitochondrial genomes. Eukfinder uses two specialized databases for read/contig classification, customizable to specific data sets or environments. We tested Eukfinder on simulated gut microbiome data sets which included varying numbers of reads from the protist Blastocystis, a human gut commensal. We also applied Eukfinder to previously published human gut microbiome WGS metagenomic data to recover new genomes of Blastocystis. Compared to other workflows, Eukfinder offers the potential to recover high-quality, near-complete genomes of diverse eukaryotes, including different Blastocystis subtypes, without relying on a reference genome. With sufficient sequencing depth, Eukfinder outperforms similar tools for recovering eukaryotic genomes from metagenomic data. Eukfinder is a valuable tool for reference-independent and cultivation-free studies of eukaryotic microbial genomes from environmental WGS metagenomic samples.

IMPORTANCE

Advancements in next-generation sequencing have made whole-genome shotgun (WGS) metagenomic sequencing an efficient method for de novo reconstruction of microbial genomes from various environments. Thousands of new prokaryotic genomes have been characterized; however, the large size and complexity of protistan genomes have hindered the use of WGS metagenomics to sample microbial eukaryotic diversity. Eukfinder enables the recovery of eukaryotic microbial genomes from environmental WGS metagenomic samples. Retrieval of high-quality protistan genomes from diverse metagenomic samples increases the number of reference genomes available. This aids future metagenomic investigations into the functions, physiologies, and evolutionary histories of eukaryotic microbes in the gut microbiome and other ecosystems.

Details

Title
Eukfinder: a pipeline to retrieve microbial eukaryote genome sequences from metagenomic data
Author
Zhao, Dandan 1; Salas-Leiva, Dayana E 2; Williams, Shelby K 1; Dunn, Katherine A 1; Shao, Jason D 1; Roger, Andrew J 1

1 Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada; Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
2 Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada; Department of Biochemistry, Cambridge University, Cambridge, England, United Kingdom
Section
Research Article
Publication year
2025
Publication date
May 2025
Publisher
American Society for Microbiology
ISSN
2161-2129
e-ISSN
2150-7511
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3260792920