Abstract

Application of assembly methods to personal genome analysis from next-generation sequencing data has been limited by the requirement for expensive supercomputer hardware or by long computation times on ordinary resources. We describe CompStor Novos, which uses a tiered-memory algorithm to achieve supercomputer-class de novo assembly computation times on standard server hardware. Run on commercial off-the-shelf servers, Novos assembly is more precise and 10-20 times faster than existing assembly algorithms. Furthermore, we integrated Novos into a variant calling pipeline and demonstrate that both its compute times and its precision in calling point variants and indels compare well with those of standard alignment-based pipelines. Additionally, assembly eliminates bias in the estimation of allele frequency for indels and naturally enables discovery of breakpoints for structural variants with base-pair resolution. Thus, Novos bridges the gap between alignment-based and assembly-based genome analyses. Extension and adaptation of its underlying algorithm will help quickly and fully harvest the information in sequencing reads for personal genome reconstruction.

Details

Title
CompStor Novos: a low cost yet fast assembly-based variant calling for personal genomes
Author
Oenning, Travis; Bae, Taejeong; Iyengar, Aravind; Brickner, Barrett; Soysa, Madushanka; Wright, Nicholas; Kumar, Prasanth; Indupuru, Suneel; Abyzov, Alexej; Coker, Jonathan
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2018
Publication date
Dec 4, 2018
Publisher
Cold Spring Harbor Laboratory Press
Source type
Working Paper
Language of publication
English
ProQuest document ID
2154257805
Copyright
© 2018. This article is published under http://creativecommons.org/licenses/by-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.