Full Text

© 2023 Roux et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The extraordinary diversity of viruses infecting bacteria and archaea is now primarily studied through metagenomics. While metagenomes enable high-throughput exploration of the viral sequence space, metagenome-derived sequences lack key information compared to isolated viruses, in particular host association. Different computational approaches are available to predict the host(s) of uncultivated viruses based on their genome sequences, but thus far individual approaches are limited either in precision or in recall, i.e., for a number of viruses they yield erroneous predictions or no prediction at all. Here, we describe iPHoP, a two-step framework that integrates multiple methods to reliably predict host taxonomy at the genus rank for a broad range of viruses infecting bacteria and archaea, while retaining a low false discovery rate. Based on a large dataset of metagenome-derived virus genomes from the IMG/VR database, we illustrate how iPHoP can provide extensive host prediction and guide further characterization of uncultivated viruses.

Details

Title
iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria
Author
Roux, Simon (https://orcid.org/0000-0002-5831-5895); Camargo, Antonio Pedro; Coutinho, Felipe H; Dabdoub, Shareef M; Dutilh, Bas E; Nayfach, Stephen; Tritt, Andrew
First page
e3002083
Section
Methods and Resources
Publication year
2023
Publication date
Apr 2023
Publisher
Public Library of Science
ISSN
1544-9173
e-ISSN
1545-7885
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2814430395