Abstract
Ancient DNA obtained from ancient samples, such as sediments, bones, and teeth, is an important genetic resource that can be used to reconstruct the evolutionary history of humans, animals, and plants. High-throughput sequencing allows ancient DNA research to be conducted at a whole-genome scale. However, post-mortem DNA damage, caused mainly by deamination of cytosine to uracil (or of methylated cytosine to thymine), may confound variant calling and downstream analysis. In this article, we develop a Python program implementing a new variant caller, “AntCaller”, which extracts information on nucleotide substitutions from sequencing data and calculates the probability of each genotype using Bayes' rule. In both simulation studies and real data analyses, our method reduced the false discovery rate caused by nucleotide misincorporations and outperformed two mainstream variant callers (GATK and SAMtools) in calling accuracy. In a real application with serious DNA damage, AntCaller still outperformed GATK and SAMtools combined with quality score recalling.
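The idea of a damage-aware Bayesian genotype call can be illustrated with a minimal sketch. This is not AntCaller's actual model or code: the function names (`emission_prob`, `genotype_posteriors`), the single flat `damage` rate, the fixed `error` rate, and the uniform genotype prior are all assumptions made here for illustration. Real ancient-DNA damage varies with position in the read and with strand, which this sketch ignores.

```python
from itertools import combinations_with_replacement

BASES = "ACGT"


def emission_prob(true_base, observed, error=0.01, damage=0.0):
    """P(observed | true_base) under a toy damage-plus-sequencing-error model.

    Assumption: deamination turns C into T (and, on the opposite strand,
    G into A) with one flat probability `damage`.
    """
    # Damage step: distribution over the base present after deamination.
    if true_base == "C":
        after_damage = {"C": 1.0 - damage, "T": damage}
    elif true_base == "G":
        after_damage = {"G": 1.0 - damage, "A": damage}
    else:
        after_damage = {true_base: 1.0}
    # Sequencing-error step: a base is misread as each other base with error/3.
    total = 0.0
    for base, p in after_damage.items():
        total += p * ((1.0 - error) if base == observed else error / 3.0)
    return total


def genotype_posteriors(observed_bases, error=0.01, damage=0.0, prior=None):
    """Posterior over the 10 unordered diploid genotypes via Bayes' rule."""
    genotypes = ["".join(g) for g in combinations_with_replacement(BASES, 2)]
    if prior is None:
        # Assumption: uniform prior over genotypes.
        prior = {g: 1.0 / len(genotypes) for g in genotypes}
    unnormalized = {}
    for g in genotypes:
        likelihood = 1.0
        for obs in observed_bases:
            # Each read samples one of the two alleles with equal probability.
            likelihood *= 0.5 * emission_prob(g[0], obs, error, damage) \
                + 0.5 * emission_prob(g[1], obs, error, damage)
        unnormalized[g] = prior[g] * likelihood
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


def best_genotype(posteriors):
    """Return the genotype with the highest posterior probability."""
    return max(posteriors, key=posteriors.get)
```

With six C reads and two T reads at a site, the sketch favors a heterozygous CT call when damage is ignored, but shifts toward homozygous CC once a substantial C-to-T damage rate is assumed. This is the kind of false heterozygous call that a damage-aware caller aims to avoid.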
Details
1 State Key Laboratory of Genetic Engineering and Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
2 State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai, People’s Republic of China





