Full Text

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The generation of different types of defective viral genomes (DVGs) is an unavoidable consequence of the error-prone replication of RNA viruses. In recent years, a particular class of DVGs, those containing long deletions or genome rearrangements, has gained interest due to their potential therapeutic and biotechnological applications. Identifying such DVGs in high-throughput sequencing (HTS) data has become an interesting computational problem. Several algorithms have been proposed to accomplish this goal, though all incur false positives, a problem of practical interest if such DVGs have to be synthesized and tested in the laboratory. We present a metasearch tool, DVGfinder, that wraps the two most commonly used DVG search algorithms in a single workflow for the identification of DVGs in HTS data. DVGfinder processes the results of ViReMa-a and DI-tector and uses a gradient boosting classifier to reduce the number of false-positive events. The program also generates output files in a user-friendly HTML format, which helps users explore the DVGs identified in the sample. We evaluated the performance of DVGfinder against the two search algorithms used separately and found that it slightly improves sensitivity for low-coverage synthetic HTS data and improves on DI-tector precision for high-coverage samples. The metasearch program also showed higher sensitivity on a real sample for which a set of copy-backs had previously been validated.
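
The metasearch idea and the two reported metrics can be illustrated with a minimal sketch. It is not DVGfinder's code: the `Dvg` record, its field names, the detector labels "A" and "B", and both helper functions are hypothetical, and the gradient boosting filter is left out. The sketch only assumes that each detector reports a DVG as a type plus two junction coordinates, merges the two call sets, and scores the merged set against a known truth set.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Dvg(NamedTuple):
    """Hypothetical DVG record; the field names are illustrative assumptions."""
    kind: str  # e.g. "deletion" or "copy-back"
    bp: int    # coordinate where the polymerase leaves the template
    ri: int    # coordinate where synthesis resumes


def merge_calls(calls_a: Iterable[Dvg], calls_b: Iterable[Dvg]) -> Dict[Dvg, Set[str]]:
    """Union of two detectors' calls, recording which detector(s) reported each DVG."""
    merged: Dict[Dvg, Set[str]] = {}
    for name, calls in (("A", calls_a), ("B", calls_b)):
        for dvg in calls:
            merged.setdefault(dvg, set()).add(name)
    return merged


def sensitivity_precision(predicted: Iterable[Dvg], truth: Iterable[Dvg]) -> Tuple[float, float]:
    """Sensitivity = TP / |truth|; precision = TP / |predicted|."""
    predicted_set, truth_set = set(predicted), set(truth)
    true_positives = len(predicted_set & truth_set)
    sensitivity = true_positives / len(truth_set) if truth_set else 0.0
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    d1, d2, d3 = Dvg("deletion", 100, 900), Dvg("copy-back", 500, 450), Dvg("deletion", 30, 700)
    false_call = Dvg("deletion", 10, 20)
    merged = merge_calls([d1, false_call], [d1, d2])
    print(sensitivity_precision(merged, [d1, d2, d3]))
```

Taking the union can only raise sensitivity relative to either detector alone, at the cost of also pooling their false positives, which is why a downstream filter such as the classifier described in the abstract is needed.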

Details

Title
DVGfinder: A Metasearch Tool for Identifying Defective Viral Genomes in RNA-Seq Data
Author
Olmo-Uceda, Maria J 1; Muñoz-Sánchez, Juan C 1; Lasso-Giraldo, Wilberth 1; Arnau, Vicente 1; Díaz-Villanueva, Wladimiro 1; Elena, Santiago F 2

1 Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, 46980 Valencia, Spain; [email protected] (M.J.O.-U.); [email protected] (J.C.M.-S.); [email protected] (W.L.-G.); [email protected] (V.A.); [email protected] (W.D.-V.)
2 Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, 46980 Valencia, Spain; Santa Fe Institute, Santa Fe, NM 87501, USA
First page
1114
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
1999-4915
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2670479265