Introduction
Germline mutations are a key driver of phenotypic differences between individuals and the main substrate upon which selection acts, so characterising the processes that generate them is vital for understanding what drives traits, diseases and species evolution. Owing to the importance of DNA fidelity in mammals, there are several hundred genes1 involved in DNA repair, spanning at least seven different repair pathways2, from mismatch repair to non-homologous end joining. Each of these pathways is preferentially associated with the repair of a different spectrum of DNA changes. Because of the large number of genes and pathways involved, natural genetic variation across them is expected to lead to differences between individuals in the efficiency with which different mutation types are repaired. Although genetic variants with a large effect on DNA repair have been observed, most notably within families with high incidences of cancer3, these are generally rare in any given population, and the majority of genetic polymorphisms affecting DNA repair are likely to have a small effect4. Nevertheless, the accumulation over time of multiple mutations across hundreds of repair genes could lead to noticeable differences between individuals in the spectra of mutations they carry. Supporting this idea, Harris and Pritchard5 found that different human populations preferentially carry different numbers of DNA mutations in different k-mer sequence contexts. For example, European populations have been observed to carry a relatively greater proportion of TCC > TTC changes, suggesting that such changes have been corrected less efficiently in this population over recent human history.
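To make the idea of a context-dependent mutation spectrum concrete, the following is a minimal sketch of how the proportion of each 3-mer change (such as TCC > TTC) could be tallied from a list of variants. It is an illustration only, not the method of Harris and Pritchard: the function name `mutation_spectrum`, the input format of (reference 3-mer, alternate base) pairs, and the toy variants are all assumptions, and reverse-complement contexts are not collapsed here.

```python
from collections import Counter


def mutation_spectrum(variants):
    """Return the proportion of each 3-mer change among the given variants.

    `variants` is an iterable of (ref_context, alt_base) pairs, where
    ref_context is the reference 3-mer centred on the mutated base.
    This input format is a hypothetical convention for illustration.
    """
    counts = Counter()
    for context, alt in variants:
        context, alt = context.upper(), alt.upper()
        # Skip malformed records and records where nothing changes.
        if len(context) != 3 or len(alt) != 1 or alt == context[1]:
            continue
        mutated = context[0] + alt + context[2]
        counts[f"{context}>{mutated}"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalise counts so the spectrum sums to 1.
    return {change: n / total for change, n in counts.items()}


# Toy, made-up variants for one hypothetical population.
toy_variants = [("TCC", "T"), ("TCC", "T"), ("ACG", "T"), ("GAT", "A")]
for change, proportion in sorted(mutation_spectrum(toy_variants).items()):
    print(change, round(proportion, 3))
```

Comparing such spectra between populations, one change type at a time, is the kind of contrast in which an excess of TCC > TTC changes would show up.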
In addition to rates varying by sequence context, the distribution of single-nucleotide variants (SNVs) has been shown to be uneven across the genome, with directly adjacent SNVs occurring more often than expected and most often on the same DNA strand6. Although many of these neighbouring changes appear to be the result of a single mutational event in a single generation, and have been termed multi-nucleotide polymorphisms (MNPs)7, adjacent SNVs are more commonly found at different frequencies in the population. This suggests that they result from two mutation events at different times, producing what have been termed sequential dinucleotide mutations (SDMs)6. Consistent with this, flanking heterozygosity has been shown...




