Abstract

DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although multiple tools enable its detection from Nanopore sequencing, their accuracy remains largely unknown. Here, we present a systematic benchmarking of tools for the detection of CpG methylation from Nanopore sequencing using individual reads, control mixtures of methylated and unmethylated reads, and bisulfite sequencing. We found that the tools show a tradeoff between false positives and false negatives, and that their predictions are highly dispersed around the expected methylation frequency values. We describe various strategies to improve the accuracy of these tools, including a consensus approach, METEORE (https://github.com/comprna/METEORE), which combines the predictions from two or more tools and shows improved accuracy over individual tools. Snakemake pipelines are also provided for reproducibility and to enable the systematic application of our analyses to other datasets.

Several existing algorithms predict the methylation of DNA using Nanopore sequencing signals, but it is unclear how they compare in performance. Here, the authors benchmark the performance of several such tools, and propose METEORE, a consensus tool that improves prediction accuracy.

Details

Title
Systematic benchmarking of tools for CpG methylation detection from nanopore sequencing
Author
Yuen, Zaka Wing-Sze 1; Srivastava, Akanksha 1; Runa, Daniel 2; McNevin, Dennis 3; Cameron, Jack 4; Eyras, Eduardo 5

1 EMBL Australia Partner Laboratory Network, Australian National University, Canberra, Australia (GRID:grid.1001.0) (ISNI:0000 0001 2180 7477); Australian National University, The John Curtin School of Medical Research, Canberra, Australia (GRID:grid.1001.0) (ISNI:0000 0001 2180 7477)
2 Victoria Police Forensic Services Department, Office of the Chief Forensic Scientist, Macleod, Australia (GRID:grid.474235.3)
3 University of Technology Sydney, Centre for Forensic Science, School of Mathematical & Physical Sciences (MaPS), Faculty of Science, Sydney, Australia (GRID:grid.117476.2) (ISNI:0000 0004 1936 7611)
4 Australian National University, The John Curtin School of Medical Research, Canberra, Australia (GRID:grid.1001.0) (ISNI:0000 0001 2180 7477)
5 EMBL Australia Partner Laboratory Network, Australian National University, Canberra, Australia (GRID:grid.1001.0) (ISNI:0000 0001 2180 7477); Australian National University, The John Curtin School of Medical Research, Canberra, Australia (GRID:grid.1001.0) (ISNI:0000 0001 2180 7477); Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain (GRID:grid.425902.8) (ISNI:0000 0000 9601 989X); Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain (GRID:grid.20522.37) (ISNI:0000 0004 1767 9005)
Publication year
2021
Publication date
2021
Publisher
Nature Publishing Group
e-ISSN
2041-1723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2538877424
Copyright
© The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.