Abstract

Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, but failure to account for genetic variation leads to reference bias and confounding of results downstream. Other approaches replace the linear reference with structures like graphs that can include genetic variation, incurring major computational overhead. We propose the reference flow alignment method that uses multiple population reference genomes to improve alignment accuracy and reduce reference bias. Compared to the graph aligner vg, reference flow achieves a similar level of accuracy and bias avoidance but with 14% of the memory footprint and 5.5 times the speed.

Details

Title
Reference flow: reducing reference bias using multiple population genomes
Author
Chen, Nae-Chyun; Solomon, Brad; Mun, Taher; Iyer, Sheila; Langmead, Ben
Pages
1-17
Section
Software
Publication year
2021
Publication date
2021
Publisher
BioMed Central
ISSN
1474-7596
e-ISSN
1474-760X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2478809884
Copyright
© 2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.