Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres

Abstract

Nanopore long-read sequencing is an emerging approach for studying genomes, including long repetitive elements like telomeres. Here, we report extensive basecalling-induced errors at telomere repeats across nanopore datasets, sequencing platforms, basecallers, and basecalling models. We find that telomeres in many organisms are frequently miscalled. We demonstrate that tuning of nanopore basecalling models leads to improved recovery and analysis of telomeric regions, with minimal negative impact on other genomic regions. We highlight the importance of verifying nanopore basecalls in long, repetitive, and poorly defined regions, and showcase how artefacts can be resolved by improvements in nanopore basecalling models.
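As a rough illustration of what verifying basecalls in a telomeric region could look like, the sketch below measures how much of a read is covered by exact copies of an expected repeat motif and flags reads that fall below a threshold. It is not the authors' method. The function names, the 0.8 threshold, and the example reads are assumptions made for illustration; TTAGGG is the canonical vertebrate telomere repeat, and the off-motif sequence in the usage example is a made-up input, not data from the article.

```python
from typing import Dict, List


def repeat_fraction(seq: str, motif: str = "TTAGGG") -> float:
    """Return the fraction of bases in seq covered by exact,
    non-overlapping copies of motif (hypothetical helper)."""
    seq = seq.upper()
    if not seq or not motif:
        return 0.0
    # str.count counts non-overlapping occurrences of the motif.
    return seq.count(motif.upper()) * len(motif) / len(seq)


def flag_suspect_reads(reads: Dict[str, str],
                       motif: str = "TTAGGG",
                       min_fraction: float = 0.8) -> List[str]:
    """Return the names of reads whose motif coverage is below
    min_fraction. The threshold is an arbitrary assumption."""
    return [name for name, seq in reads.items()
            if repeat_fraction(seq, motif) < min_fraction]


if __name__ == "__main__":
    # Illustrative inputs only: one clean repeat array and one read in
    # which half of the array has been replaced by an off-motif repeat.
    reads = {
        "clean_read": "TTAGGG" * 10,
        "suspect_read": "TTAGGG" * 5 + "TTAAAA" * 5,
    }
    for name, seq in reads.items():
        print(name, repeat_fraction(seq))
    print(flag_suspect_reads(reads))
```

A low score does not by itself show a basecalling error, since genuine variant repeats and subtelomeric sequence also lower it; a check like this only points to reads worth inspecting against the raw signal or an orthogonal dataset.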

Details

Title

Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres

Author

Tan, Kar-Tong; Slevin, Michael K; Meyerson, Matthew; Li, Heng

Pages

1-16

Section

Short Report

Publication year

2022

Publication date

2022

Publisher

BioMed Central

ISSN

14747596

e-ISSN

1474760X

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1186/s13059-022-02751-6

ProQuest document ID

2715493962

© 2022. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
