ARTICLE
Received 20 Sep 2012 | Accepted 27 Feb 2013 | Published 23 Apr 2013
Paul Brotherton1,2,*, Wolfgang Haak1,*, Jennifer Templeton1, Guido Brandt3, Julien Soubrier1, Christina Jane Adler1,†, Stephen M. Richards1, Clio Der Sarkissian1,†, Robert Ganslmeier4, Susanne Friederich4, Veit Dresely4, Mannis van Oven5, Rosalie Kenyon6, Mark B. Van der Hoek6, Jonas Korlach7, Khai Luong7, Simon Y.W. Ho8, Lluis Quintana-Murci9, Doron M. Behar10, Harald Meller4, Kurt W. Alt3, Alan Cooper1 & The Genographic Consortium‡
Haplogroup H dominates present-day Western European mitochondrial DNA variability (>40%), yet was less common (~19%) among Early Neolithic farmers (~5450 BC) and virtually absent in Mesolithic hunter-gatherers. Here we investigate this major component of the maternal population history of modern Europeans and sequence 39 complete haplogroup H mitochondrial genomes from ancient human remains. We then compare this 'real-time' genetic data with cultural changes taking place between the Early Neolithic (~5450 BC) and Bronze Age (~2200 BC) in Central Europe. Our results reveal that the current diversity and distribution of haplogroup H were largely established by the Mid Neolithic (~4000 BC), but with substantial genetic contributions from subsequent pan-European cultures such as the Bell Beakers expanding out of Iberia in the Late Neolithic (~2800 BC). Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria.
1 The Australian Centre for Ancient DNA, School of Earth and Environmental Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia. 2 Archaeogenetics Research Group, School of Applied Sciences, University of Huddersfield, Huddersfield HD1 3DH, UK. 3 Institute of Anthropology, Colonel-Kleinmann Weg 2, Johannes Gutenberg University Mainz, D-55128 Mainz, Germany. 4 State Office for Heritage Management and Archaeology Saxony-Anhalt/State Museum for Prehistory Halle, Richard-Wagner-Straße 9, D-06114 Halle/Saale, Germany. 5 Department of Forensic Molecular Biology, Erasmus MC, University Medical Centre Rotterdam, 3000 CA Rotterdam, The Netherlands. 6 SA Pathology, SA Health, Adelaide, South Australia 5000, Australia. 7 Pacific Biosciences, Menlo Park, California 94025, USA. 8 School of Biological Sciences, The University of Sydney, Sydney, New South Wales 2006, Australia. 9 Institut Pasteur, Unit of Evolutionary Genetics, 75015 Paris, France. 10 Rambam Medical Centre, 31096 Haifa, Israel.
* These authors contributed equally to this work.
† Present address: Institute of Dental Research, Westmead Centre for Oral Health, The University of Sydney, Sydney, New South Wales 2145, Australia (C.J.A.); Centre for Geogenetics, Natural History Museum of Denmark, 1350 Copenhagen, Denmark (C.D.S.).
Correspondence and requests for materials should be addressed to P.B. (email: [email protected]) or to W.H. (email: [email protected]).
‡ A full list of authors for the Genographic Consortium and their affiliations appears at the end of the paper.
NATURE COMMUNICATIONS | 4:1764 | DOI: 10.1038/ncomms2656 | http://www.nature.com/naturecommunications
© 2013 Macmillan Publishers Limited. All rights reserved.
Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans
A key unanswered issue in human prehistory is the extent to which cultural change identifiable in the archaeological record can be ascribed to the movements of people, as opposed to the movements of just their ideas and artefacts. The Central European archaeological record identifies a succession of profound cultural and economic changes between the last hunter-gatherers of the Mesolithic and the first farmers of the Early Neolithic (ENE), through to the socially stratified chiefdoms of the Early Bronze Age1-3. The exact nature and genetic context of the transformative changes that took place over these four millennia remain unclear4,5, although current genetic patterns of mitochondrial DNA (mtDNA) haplogroup (hg) distribution suggest a complex series of events in European prehistory4-9 and hint at multiple inputs from outside Central Europe4,10,11.
Phylogeographic studies suggest that mt hg H arrived in Europe from the Near East before the Last Glacial Maximum (22,000 BP), and survived in glacial refugia in Southwest Europe before undergoing a post-glacial re-expansion4,12. Haplogroup H now accounts for over 40% of mtDNA variation in anatomically modern humans across much of Western Eurasia, with declining frequencies south and east to ~10-30% in the Near East and Caucasus10. However, it remains uncertain when and how H became the dominant European hg. Traditional approaches (including ancient DNA studies) have been unable to resolve either the phylogeny or phylogeographic distribution of H sub-haplogroups (sub-hgs)6; however, they have generally relied on sequencing only 300-400 bp of the mt D-loop or control region10,13. A number of studies based on complete 16.6 kb human mt genomes have revealed a complex evolutionary history for hg H (for example, refs 12,14-18, with phylogenetic analyses recognizing 87 H sub-hgs19). These complete mt genomes revealed that 71% of hg H polymorphic diversity is located outside the D-loop, in the coding region20 and, as a result, this diversity has not yet been exploited at the population genetics level.
To investigate the relationship between the European genetic and archaeological records, we sequenced whole hg H mt genomes from skeletal remains directly assigned to distinct Central European archaeological cultures. Owing to its excellently preserved human skeletal remains, forming a continuous record across a series of archaeological cultures since Palaeolithic times, the Mittelelbe-Saale region of Saxony-Anhalt (Germany) provided a unique opportunity to address this issue. We analysed a time transect spanning the >3,500 years of the Central European Neolithic period (Table 1, Supplementary Table S1), from the first farmers of the ENE linear pottery culture (LBK, 5450-4775 BC), through the subsequent Rössen (4625-4250 BC), Schöningen (4100-3950 BC), Baalberge (3950-3400 BC) and Salzmünde (3400-3025 BC) cultures. These were followed by two of the first pan-European Late Neolithic (LNE) cultural complexes, the Corded Ware (CWC, 2800-2050 BC) and Bell Beaker (BBC, 2500-2050 BC) cultures, before the emergence of the Early Bronze Age with the Unetice culture (2200-1575 BC). We chose to focus on hg H because of its recent dramatic rise in frequency to become the dominant hg in Europe, because of its presence in all Neolithic cultures in the Mittelelbe-Saale region, and the potential it provided to explore detailed genetic structure on a sub-hg level. Overall, our results suggest that the broad foundations of the Central European mtDNA pool, here approximated via hg H, were formed during the Neolithic rather than the post-glacial period.
Results
Sequence and network analyses. From a collection of over 400 European prehistoric human archaeological remains we selected 37 Mittelelbe-Saale individuals, as well as two samples from Italy (Supplementary Table S1), previously assigned to hg H by simplex and multiplex PCR7. Work was independently replicated for two samples per individual (Supplementary Methods).

Table 1 | Summary of genotyping data against the Reconstructed Sapiens Reference Sequence (RSRS).
Culture/age | Individual | Hg* | Hg H sequence variants compared with RSRS
LBK (5450-4775 BC) | HAL36 | H23 | C10211T
LBK (5450-4775 BC) | HAL11 | H | T16093C, G16129A!
LBK (5450-4775 BC) | HAL32 | H26 | T11152C
LBK (5450-4775 BC) | HAL39 | H1e | G3010A, G5460A
LBK (5450-4775 BC) | DEB9 | H88 | A8596G
LBK (5450-4775 BC) | DEB21 | H1j | G3010A, T4733C
LBK (5450-4775 BC) | KAR6a | H1bz | G1719A, G3010A, C14380T
LBK (5450-4775 BC) | KAR11b | H | T152C!
LBK (5450-4775 BC) | KAR16a | H46b | C2772T, A11893G
Rössen (4625-4475/4250 BC) | OSH2 | H89 | A6932G, C8068T, T12696C
Rössen (4625-4475/4250 BC) | OSH3 | H1 | G3010A
Rössen (4625-4475/4250 BC) | OSH1 | H16 | T152C!, C10394T
Rössen (4625-4475/4250 BC) | OSH7 | H5b | C456T, G5471A, T16304C, C16519T
Schöningen (4100-3950 BC) | SALZ18a | H10i | C13503T, T14470a, T16093C
Schöningen (4100-3950 BC) | SALZ21b | H1e7 | T1766C, G3010A, G5460A
Baalberge (3950-3400 BC) | ESP30 | H1e1a5 | G3010A, G5460A, (C5960T), A8512G, G8865A, C14902T
Baalberge (3950-3400 BC) | HQU4 | H7d5 | A4793G, C15409T, G16388A
Salzmünde (3400-3100/3025 BC) | SALZ57a | H3 | T152C!, T6776C
Salzmünde (3400-3100/3025 BC) | SALZ77a | H3 | T6776C
Corded Ware (2800-2200/2050 BC) | ESP15 | H6a1a | T239C, G3915A, A4727G, G9380A, T11253C, T16362C, A16482G, C16519T
Corded Ware (2800-2200/2050 BC) | BZH6 | H1_TBD | G3010A, A8149G, A9377G, T9467C, A13671G, T14319C, T16189C!
Bell Beaker (2500-2200/2050 BC) | BZH4 | H1e7 | G3010A, G5460A, A15220G, A15401G, A16293G
Bell Beaker (2500-2200/2050 BC) | ROT6 | H5a3 | C456T, G513A, T4336C, G15884A, T16304C, C16519T
Bell Beaker (2500-2200/2050 BC) | ALB1 | H3b | A2581G, T6776C
Bell Beaker (2500-2200/2050 BC) | ROT1 | H3ao2 | C4577T, T6776C, C16256T
Bell Beaker (2500-2200/2050 BC) | ROT2 | H5a3 | C456T, G513A, T4336C, G15884A, T16304C, C16519T
Bell Beaker (2500-2200/2050 BC) | QUEXII1 | H4a1 | C3992T, A4024G, T5004C, G9123A, C14365T, A14582G, C16519T
Bell Beaker (2500-2200/2050 BC) | QUEXII2 | H4a1 | C3992T, A4024G, T5004C, G9123A, C14365T, A14582G, C16519T
Bell Beaker (2500-2200/2050 BC) | QLB26a | H1 | G3010A
Bell Beaker (2500-2200/2050 BC) | QUEXII3 | H13a1a2c | C2259T, A4745G, G9025A, A13542G, C13680T, C14872T, C16519T
Bell Beaker (2500-2200/2050 BC) | QLB28b | H1 | G3010A
Unetice (2200-1575 BC) | BZH1 | H11a | T195C!, T961g, T8448C, (G13759A), A16293G, T16311C!, C16519T
Unetice (2200-1575 BC) | BZH8 | H2a1a3 | G951A, G1438A, G4769A, C6173T, T13095C, A16240t, C16354T, C16519T
Unetice (2200-1575 BC) | BZH14 | H82a | T195C!, A16220G
Unetice (2200-1575 BC) | EUL41a | H4a1a1a5 | A73G!, C3992T, A4024G, T5004C, G8269A, G9123A, A10044G, C13545T, C14365T, A14582G, C16519T
Unetice (2200-1575 BC) | EUL57B | H3 | T152C!, T6776C
Unetice (2200-1575 BC) | QUEVIII4 | H7h | A4793G, G16213A
Nuragic Bronze Age (1624 BC) | | H1aw1 | G3010A, A8701G!, C15912T
Iron Age (500 BC) | | H90 | C5435T, T8911C, T10237C, T15109C
Abbreviation: SNP, single-nucleotide polymorphism.
Sub-haplogroup defining diagnostic SNPs are shown in bold and private/as-yet-unknown sequence variants in regular print.
*Haplogroup H designations based on the http://www.phylotree.org mtDNA tree Build 14 (5 April 2012)19,20.

We designed and optimised a hybridisation-based DNA-capture system to sequence complete mt genomes on the Affymetrix MitoChip v2.0 (ref. 21) (Supplementary Methods, Supplementary Figs S1 and S2, and Supplementary Tables S2-S3) via immortalised libraries prepared from the highly damaged and degraded endogenous DNA recovered from archaeological remains22,23. Six of the 39 target-enriched libraries were also analysed via a single-molecule, real-time (SMRT24) Pacific Biosciences RS sequencing platform (Supplementary Table S4, Supplementary Dataset). In addition, 35/391 (9%) of all SNPs identified via the MitoChip were independently confirmed by direct PCR and Sanger sequencing (Supplementary Methods, Supplementary Tables S5 and S6). Mt genomes from all 39 individuals were unambiguously assignable to individual sub-hgs of hg H (ref. 20), confirming that a single human was typed in each case (Table 1). The mt hypervariable region I sequences matched those previously determined for each individual. The ancient hg H mt genomes were highly diverse, with 34 distinct haplotypes attributed to 20 major sub-hgs (gene diversity H = 0.997 ± 0.0071; nucleotide diversity = 0.000421 ± 0.000225), including three novel lineages (provisionally named H88-H90).
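As an illustration of the gene diversity statistic quoted above, the following minimal sketch computes Nei's unbiased haplotype diversity, H = n/(n-1) x (1 - sum of squared haplotype frequencies). The haplotype labels are an assumption made for this example: they simply treat identical variant lists in Table 1 as one haplotype, which gives 34 distinct haplotypes among 39 genomes. The result of this simple tally is not expected to reproduce the published estimate, which was presumably derived from the full sequence data and dedicated software.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene (haplotype) diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical labels: identical variant lists in Table 1 are counted as one haplotype.
sample = (
    ["G3010A"] * 3                # OSH3, QLB26a, QLB28b
    + ["T152C!,T6776C"] * 2       # SALZ57a, EUL57B
    + ["H5a3_ROT"] * 2            # ROT6, ROT2
    + ["H4a1_QUEXII"] * 2         # QUEXII1, QUEXII2
    + [f"unique_{i}" for i in range(30)]  # remaining genomes, each seen once
)

h = gene_diversity(sample)
print(len(sample), len(set(sample)), round(h, 4))
```

The statistic approaches 1 when almost every sampled genome carries its own haplotype, which is the situation the text describes.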
Phylogenetic network analysis of these ancient mt genomes reveals evidence of dynamic changes in the composition of H sub-hgs over the ~3,500-year time transect (Fig. 1).

[Figure 1. Panel a: phylogenetic network of hg H sub-haplogroups with nodes coloured by culture (LBK 5450-4775 BC; Rössen 4625-4475/4250 BC; Schöningen 4100-3950 BC; Baalberge 3950-3400 BC; Salzmünde 3400-3100/3025 BC; Corded Ware 2800-2200/2050 BC; Bell Beaker 2500-2200/2050 BC; Unetice 2200-1575 BC; Nuragic 1624 BC; Iron Age 500 BC). Panel b: skyride plot; x axis: Time (years), 12,000 to 0; left y axis: Effective population size / census size estimate (logarithmic); right y axis: Population density (p per km2). Legend: archaeological/historical estimates (European population census size; population density Central Europe) and genetic estimates (effective population size haplogroup H; dates of ancient samples).]
Figure 1 | Mitochondrial haplogroup H sequence evolution. (a) Phylogenetic network of 39 prehistoric mitochondrial genomes sorted into two temporal groupings: Early Neolithic (left) and Mid-to-Late Neolithic (right). Node colours represent archaeological cultures. (b) A Bayesian skyride plot of 200 representative present-day and 39 ancient hg H mt genomes (the thick red line denotes the posterior median, thinner flanking lines denote the 95% credibility interval; note the logarithmic scale of the y axis). Prehistoric samples (18 radiocarbon and 21 mean archaeological dates) served as internal calibration points (black bars). For comparison, census size estimates for the European population are shown as orange dots. Population density estimates from the archaeological record for key periods in Central Europe are plotted as blue squares in chronological order: LBK, Iron Age, Roman period, Merovingian and Pre-industrial modern times (y axis on the right)28.

Table 2 | Population pairwise and linearised Slatkin's FSTs and NP-MANOVA tests.

Importantly, sequences from older samples (and cultures) tend to represent basal lineages, only one to three mutations away from the ancestral root of hg H, while younger samples (after ~4000 BC) largely comprise more derived haplotypes appearing on longer branches. This temporal relationship provides further support for the authenticity of the ancient mt genomes.
Network analysis (Fig. 1) reveals pronounced differences in the composition of sub-hgs between the ENE cultures (LBK, Rössen, Schöningen), and those of the Mid Neolithic (MNE)/LNE to Early Bronze Age (Baalberge, Salzmünde, Corded Ware, Bell Beaker, Unetice). ENE (and in particular LBK) mt genomes are either rare today (H16, H23 and H26), extinct or have not yet been observed in present-day populations (H46b, H88 and H89). In sharp contrast, most of the later H sub-hgs are more common in present-day European populations (for example, hg H3, H4, H6, H7, H11 and H13)12,14-16. Of the 39 haplotypes detected, only three (within the common, basal, sub-hg H1) were shared between ENE and MNE/LNE cultures. As the observed gene diversity is high, we might expect the number of shared haplotypes within and between cultures to be low. However, as the MNE/LNE haplotypes are on different sub-hg branches from the ENE haplotypes, these patterns combined show minimal local genetic continuity over this time period (Table 1).
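The comparison of haplotypes between periods described above can be sketched as a set intersection. This is a minimal illustration only: the rows are a hand-picked subset of Table 1, the two period labels (`ENE` and `LATER`) are hypothetical names introduced here, and a haplotype is assumed to be fully described by its list of variants.

```python
# Subset of Table 1 as (individual, period, variants).
# ENE = LBK, Rössen, Schöningen; LATER = Baalberge through Unetice (labels assumed for this sketch).
ROWS = [
    ("OSH3", "ENE", "G3010A"),
    ("HAL39", "ENE", "G3010A, G5460A"),
    ("KAR11b", "ENE", "T152C!"),
    ("SALZ21b", "ENE", "T1766C, G3010A, G5460A"),
    ("QLB26a", "LATER", "G3010A"),
    ("QLB28b", "LATER", "G3010A"),
    ("SALZ77a", "LATER", "T6776C"),
    ("SALZ57a", "LATER", "T152C!, T6776C"),
]


def haplotype(variants):
    """Represent a haplotype as an order-independent set of variant labels."""
    return frozenset(v.strip() for v in variants.split(","))


def shared_haplotypes(rows, period_a, period_b):
    """Return the haplotypes observed in both periods."""
    by_period = {}
    for _, period, variants in rows:
        by_period.setdefault(period, set()).add(haplotype(variants))
    return by_period.get(period_a, set()) & by_period.get(period_b, set())


shared = shared_haplotypes(ROWS, "ENE", "LATER")
print([sorted(h) for h in shared])
```

In this subset the only haplotype present in both periods is the basal H1 type defined by G3010A alone, in line with the observation that sharing is confined to sub-hg H1.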
Genetic distances. To further examine these apparent temporal shifts in sub-hg distribution, we tested whether hg H individuals represent different meta-populations by pooling them into different cultural and/or temporal groups of ENE versus LNE (Table 2, Supplementary Table S7). When pooled in four groups (ENE, MNE, LNE and Bronze Age), pairwise population comparisons via FST values based on sequence data showed that genetic distances increased with time over the duration of the Neolithic, reaching a significant value (FST = 0.08722; P = 0.00386 ± 0.0006) between the ENE and the early Bronze Age (Table 2). This suggests a transformation of hg H diversity during the Neolithic period. This effect was less apparent (non-significant FST values) when samples from various sites were pooled in larger temporal groups (Table 2). However, non-parametric multivariate analysis of variance (NP-MANOVA, P = 0.0072) also confirmed a significant difference between pooled groups of ENE and LNE individuals when comparisons were based on the presence or absence of sub-hgs (Table 2).
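Two simple transformations underlie the values reported in Table 2, and a short sketch may help when reading it. The code below assumes the standard definitions: Slatkin's linearised distance FST/(1 - FST), and a Bonferroni correction that multiplies each P value by the number of pairwise comparisons (capped at 1). The function names are hypothetical; the input numbers are taken from the text and Table 2.

```python
def linearise_fst(fst):
    """Slatkin's linearised FST: FST / (1 - FST)."""
    if fst >= 1.0:
        raise ValueError("FST must be below 1")
    return fst / (1.0 - fst)


def bonferroni(p_value, n_comparisons):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# ENE versus Bronze Age, FST as quoted in the text.
ene_vs_bronze = linearise_fst(0.08722)

# Four pooled groups give 4 * 3 / 2 = 6 pairwise NP-MANOVA comparisons.
n_pairs = 4 * 3 // 2
ene_vs_mne_corrected = bonferroni(0.1262, n_pairs)

print(round(ene_vs_bronze, 5), round(ene_vs_mne_corrected, 4))
```

Under these assumptions the linearised value for ENE versus Bronze Age agrees with the 0.09555 listed in Table 2, and the corrected ENE versus MNE value agrees with the 0.7572 listed there.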
Genetic affinities. To examine potential geographic origins for Neolithic cultures (Supplementary Table S1) and to assess their contribution to present-day Central European mtDNA diversity, we used principal component analysis (PCA) to investigate genetic affinities between three ancient culturally/temporally pooled groups (LBK, MNE and BBC) and 37 present-day Western Eurasian populations (Supplementary Table S8). PCA of the frequencies of the 15 most common H sub-hgs showed that the present-day populations form three significantly supported geographic clusters (a grouping which was also supported using NP-MANOVA, P < 0.0001; Table 2): (i) Iberia in the west; (ii) the Caucasus, the Near East and Anatolia; and (iii) Central and Eastern Europe from the Urals to France (Fig. 2a). This particular number of clusters was also the best supported in a model-based test on sub-hg H frequencies followed by Ward clustering (Fig. 2c,d). We also used Procrustes analysis to quantify the relationship between hg H substructure and the geographic locations of both the present-day Western Eurasian and the Mittelelbe-Saale ancient populations. For this analysis, we superimposed the PCA coordinates on the geographic map of the present-day and ancient sampling locations. We found a striking resemblance between the genetic and geographic maps, with a highly significant Procrustes similarity score (t0 = 0.733) obtained for the comparison (P < 10^-6; 100,000 permutations). The analysis supported a clustering of the transformed genetic data from present-day populations into the three major groups described above (Fig. 2b). In contrast, Procrustes analysis clearly showed that the genetic data for LBK and BBC samples were not related to their geographic location. Although all three ancient groups were sampled from the same Central European location, only the MNE group genetically resembles present-day populations from this region.
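The idea behind the Procrustes comparison and its permutation test can be sketched in a few lines. This is a simplified illustration rather than the procedure used in the study: it handles two-dimensional coordinates only, allows translation, scaling and rotation but not reflection, and assumes the similarity score corresponds to sqrt(1 - D), where D is the residual after optimal superposition. All coordinates below are hypothetical.

```python
import math
import random


def _centre_and_scale(points):
    """Centre 2-D points on their centroid and scale them to unit norm (as complex numbers)."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [v - mean for v in z]
    norm = math.sqrt(sum(abs(v) ** 2 for v in z))
    if norm == 0:
        raise ValueError("all points coincide")
    return [v / norm for v in z]


def procrustes_similarity(a, b):
    """Similarity in [0, 1]; 1 means b is a shifted, scaled and rotated copy of a."""
    za, zb = _centre_and_scale(a), _centre_and_scale(b)
    return abs(sum(p.conjugate() * q for p, q in zip(za, zb)))


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Share of label permutations with a similarity at least as high as observed."""
    rng = random.Random(seed)
    observed = procrustes_similarity(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_similarity(a, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical "geographic" coordinates and a rotated, rescaled, shifted "genetic" copy.
geo = [(0.0, 0.0), (1.0, 0.2), (2.5, 1.0), (3.0, 3.5),
       (0.5, 2.0), (4.0, 0.5), (1.5, 4.0), (3.5, 2.2)]
theta = math.radians(30)
genetic = [(0.01 * (x * math.cos(theta) - y * math.sin(theta)) + 0.3,
            0.01 * (x * math.sin(theta) + y * math.cos(theta)) - 0.1) for x, y in geo]

t0, p = permutation_p_value(geo, genetic)
print(round(t0, 6), p)
```

With real data the genetic coordinates would be the first two principal components, and a score well below 1 (such as the 0.733 reported) indicates a strong but imperfect match between the genetic and geographic maps.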
The combined set of analyses (PCA, Procrustes and Ward clustering) revealed that Mittelelbe-Saale's earliest farmers (LBK; n = 9) cluster with present-day Caucasus, Near Eastern and Anatolian populations, as previously noted7. In contrast, individuals from the successor series of regional post-LBK (and MNE) Rössen, Schöningen, Baalberge and Salzmünde cultures (ca. 4625-3025 BC, MNE; n = 10) cluster with present-day Central European populations (Fig. 2). Mitochondrial genomes from BBC individuals in Mittelelbe-Saale (BBC; n = 7) display close genetic affinities to present-day Iberian populations (Fig. 2). The component loadings of the PCA biplot indicate that this is largely based on high frequencies of sub-hgs H1 and H3, which are thought to have spread from a glacial Iberian refugium13 and which have also been reported from ancient Neolithic sites from France and Spain8,25. Other LNE samples add further to the genetic complexity. Individuals from the CWC (2800-2200 BC), which has archaeological associations towards North-Eastern Europe, produced two distinct mt genomes (H1_TBD and H6a1a), which have not been found in their contemporaneous Bell Beaker neighbours, nor in preceding Central European cultures.

(a) NP-MANOVA, four time periods (P = 0.0696)
 | Early Neolithic | Middle Neolithic | Late Neolithic | Bronze Age
Early Neolithic (13) | 0 | 0.1262 | 0.024 | 0.0574
Middle Neolithic (6) | 0.7572 | 0 | 0.8575 | 0.7782
Late Neolithic (9) | 0.144 | 1 | 0 | 0.742
Bronze Age (6) | 0.3444 | 1 | 1 | 0
FST, four time periods
 | Early Neolithic | Middle Neolithic | Late Neolithic | Bronze Age
Early Neolithic (13) | 0 | 0 | 0.03179 | 0.09555
Middle Neolithic (6) | 0.01135 | 0 | 0 | 0.02299
Late Neolithic (9) | 0.02247 | 0.01165 | 0 | 0.01148
Bronze Age (6) | 0.08722 | 0.03081 | 0.02250 | 0
(b) NP-MANOVA, LBK, BBC and pooled intermediate Neolithic (MNE) as used in PCA (P = 0.2355)
 | LBK | MNE | BBC
LBK (9) | 0 | 0.2084 | 0.0916
MNE (10) | 0.6252 | 0 | 0.8025
BBC (7) | 0.2748 | 1 | 0
FST, LBK, BBC and pooled intermediate Neolithic (MNE) as used in PCA
 | LBK | MNE | BBC
LBK (9) | 0 | 0 | 0.03369
MNE (10) | 0.02587 | 0 | 0
BBC (7) | 0.03260 | 0.00704 | 0
(c) NP-MANOVA, two time periods (P = 0.0072)
 | Early Neolithic | Late Neolithic
Early Neolithic (13) | 0 | 0.0109
Late Neolithic (16) | 0.0109 | 0
FST, two time periods
 | Early Neolithic | Late Neolithic
Early Neolithic (13) | 0 | 0.01459
Late Neolithic (16) | 0.01438 | 0
(d) NP-MANOVA, cultures grouped with geographic regions as in Fig. 2 (P < 0.0001)
 | Iberia | Near East | Mainland Europe
Iberia | 0 | 0 | 0.0001
Near East | 0 | 0 | 0.0004
Mainland Europe | 0.0003 | 0.0012 | 0
Abbreviations: BBC, Bell Beaker culture; MNE, Mid Neolithic; NP-MANOVA, non-parametric multivariate analysis of variance; PCA, principal component analysis.
Neolithic samples pooled in different time periods: (a) four time periods; (b) three time periods; (c) two time periods, and (d) from cultures grouped with geographic regions. For NP-MANOVA, Bonferroni corrected values are given in bold print and areas shaded grey indicate significant values (P < 0.05). Slatkin's linearised FSTs are italicised (upper diagonal) and significant pairwise distances are given in bold print (lower diagonal).

[Figure 2. Panel a: PCA biplot; x axis: PC 1 - 23.7 % variance; y axis: PC 2 - 11.8 % variance. Panel b: Procrustes plot; axes: Dimension 1, Dimension 2. Panel c: Ward clustering dendrogram; y axis: Height. Panel d: BIC against number of components for the mixture models EII, VII, EEI, VEI, EVI, VVI, EEE, EEV, VEV and VVV.]
Figure 2 | Population affinities of select Neolithic cultures. (a) PCA biplot based on the frequencies of 15 hg H sub-haplogroups (component loadings) from 37 present-day Western Eurasian and three ancient populations (light blue: Western Europe; dark blue: Central and Eastern Europe; orange: Near East, Caucasus and Anatolia; and pink: ancient samples). Populations are abbreviated as follows: GAL, Galicia; CNT, Cantabria; CAT, Catalonia; GAS, Galicia/Asturia; CAN, Cantabria2; POT, Potes; PAS, Pasiegos; VIZ, Vizcaya; GUI, Guipuzcoa; BMI, Basques; IPNE, Iberian Peninsula Northeast; TUR, Turkey; ARM, Armenia; GEO, Georgia; NWC, Northwest Caucasus; DAG, Dagestan; OSS, Ossetia; SYR, Syria; LBN, Lebanon; JOR, Jordan; ARB, Arabian Peninsula; ARE, Arabian Peninsula2; KBK, Karachay-Balkaria; MKD, Macedonia; VUR, Volga-Ural region; FIN, Finland; EST, Estonia; ESV, Eastern Slavs; SVK, Slovakia; FRA, France; BLK, Balkans; DEU, Germany; AUT, Austria; ROU, Romania; FRM, France Normandy; WIS, Western Isles; CZE, Czech Republic; LBK, Linear pottery culture; BBC, Bell Beaker culture; MNE, Middle Neolithic. (b) Procrustes analyses of geographic coordinates and PCA scores of the same data set (similarity score t0 = 0.733, P < 10^-6, 100,000 permutations). (c) Ward clustering dendrogram of the three ancient groups and present-day populations (colour code as above and p values in % of approximately unbiased bootstrapping for the following three main clusters). (d) Results of the model-based test to identify the number of clusters by the model with the highest support (highest Bayesian information criterion (BIC); VEV multivariate mixture model (ellipsoidal, equal shape)).

Similarly, data from the subsequent Early Bronze Age Unetice culture revealed haplotypes with genetic affinities to both the East (sub-hg H2a, H7 and H11) and the West (sub-hg H3 and H4), based on frequency distributions of these sub-hgs in present-day populations13. We also included two individuals from outside Central Europe (Sardinia and South Tyrol) and from different time periods (Nuragic Bronze Age and Iron Age, respectively) to further investigate genetic diversity within hg H and to test the power of resolution of complete mt genomes. Both individuals had mt genomes that are not found in samples from the Mittelelbe-Saale region. The Iron Age sample from South Tyrol produced another new sub-hg (provisional H90) and the Bronze Age individual from Sardinia a new H1 haplotype (H1aw1).
Reconstructing the demographic history of mtDNA hg H. It has previously proved difficult to use present-day data alone to determine when hg H became the predominant hg in Europe, as archaeogenetic and palaeodemographic reconstructions have very large uncertainties4,26. However, as our 39 dated ancient mt genome sequences provide precise temporal calibration points, we performed a Bayesian skyride analysis with 200 random present-day mt genome sequences to reconstruct the lineage history of hg H through time (with the caveat of assuming a continuous and panmictic population). The resulting skyride plot (Fig. 1b) is the first real-time estimation of the European hg H population size (and consequently its contribution to Europe's effective population size and demographic history) with a broad temporal coverage over ~3,500 years of the Neolithic period in Central Europe (5500-2000 BC). Hg H shows a consistent and strong exponential growth over the entire course of the Neolithic. The estimated population size tracks the European census size27 and population density estimates from archaeological sites28 in the Late Holocene, but also provides detailed estimates for prehistoric times for which data points remain very scarce (Fig. 1b).
Another major advantage of the temporal calibration points provided by ancient hg H mt genomes is that the data allow a relatively precise estimate of the evolutionary substitution rate for human mtDNA. The temporal dependency of evolutionary rates predicts that rate estimates measured over short timespans will be considerably higher than those using deep fossil calibrations, such as the human/chimpanzee split at ~6 million years29. The rate calibrated by the Neolithic and Bronze Age sequences is 2.4 × 10^-8 substitutions per site per year (1.7-3.2 × 10^-8; 95% high posterior density) for the entire mt genome, which is 1.45× (44.5%) higher than current estimates based on the traditional human/chimp split (for example, 1.66 × 10^-8 for the entire mt genome30 and 1.26 × 10^-8 for the coding region31). Consequently, the calibrated 'Neolithic' rate infers a considerably younger coalescence date for hg H (10.9-19.1 kya) than those previously reported (19.2-21.4 kya for HVSI (ref. 10), 15.7-22.5 kya for the mt coding region31 or 14.7-22.6 kya when corrected for purifying selection30).
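The relationship between the two whole-genome rates quoted above, and the direction in which a faster rate shifts age estimates, can be checked with simple arithmetic. The sketch below is a first-order illustration only: it assumes that, for a fixed amount of sequence divergence, an age estimate scales inversely with the substitution rate, which ignores the full Bayesian treatment used for the published dates. The 20 kya input is a hypothetical value chosen for the example.

```python
NEOLITHIC_RATE = 2.4e-8     # substitutions per site per year, ancient-DNA calibrated (from the text)
SPLIT_BASED_RATE = 1.66e-8  # whole mt genome, human/chimp calibrated (from the text)


def fold_increase(new_rate, old_rate):
    """How many times higher the new rate is than the old one."""
    return new_rate / old_rate


def rescale_age(age, rate_used, new_rate):
    """First-order rescaling: divergence = rate * time, so time scales with 1 / rate."""
    return age * rate_used / new_rate


ratio = fold_increase(NEOLITHIC_RATE, SPLIT_BASED_RATE)
# Hypothetical node dated to 20 kya under the split-based rate.
rescaled = rescale_age(20.0, SPLIT_BASED_RATE, NEOLITHIC_RATE)

print(f"fold increase: {ratio:.2f}; rescaled age: {rescaled:.1f} kya")
```

The ratio of the rounded rates comes out close to the 1.45-fold figure in the text, and the rescaling shows why a higher rate implies younger coalescence dates.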
Discussion
Despite recent successes in sequencing portions of nuclear genomes from Meso- and Neolithic samples11,32,33, mtDNA remains the most widely studied and best described marker in population genetics. Although its interpretation is limited to the matrilineal genetic history4,13,19, this can be an important socio-cultural and demographic signal additional to that gained from autosomal loci34,35. Our results clearly demonstrate that high-resolution full mt genome-typing, combined with the ability to analyse large numbers of individuals from multiple cultural layers, can provide highly resolved temporal views that are not yet practical with nuclear DNA studies.
The phylogenetic network analysis of our chronological hg H mt genome data set (Fig. 1a) provides the first detailed real-time view of mutations in human mtDNA. It has enabled the direct observation of the mutation rate over thousands of years and revealed a distinct temporal distribution pattern of hg H diversity. Although a temporal pattern could be expected in an expanding population with stable/increasing hg H frequencies (Fig. 1b, Supplementary Fig. S3), ENE and MNE/LNE/Bronze Age samples clearly show a mutually exclusive sub-hg distribution with the exception of sub-hg H1, which is the most common and basal sub-hg within H14-16. Under an assumption of genetic continuity, we would expect MNE/LNE and Bronze Age individuals to be on the same sub-hg branches as ENE individuals. Instead, ENE mt genomes are generally either rare today19 or have not yet been observed in present-day populations, possibly owing to subsequent extinction of these lineages. In contrast, most MNE/LNE and Bronze Age sub-hgs are still common today. This suggests that individuals from the ENE made a marginal contribution to LNE and present-day hg H diversity. Although the relatively small sample numbers from each time period limit detailed analyses of the causes of the distribution shifts, we interpret this phylogenetic pattern as a genetic discontinuity between Early and subsequent Neolithic cultures in Europe, potentially mirroring genetic structure in Neolithic European populations. Genetic drift could also have played a role in generating discrepant hg distributions over time and space. However, if drift was the sole cause we would expect a random distribution across all sub-hgs rather than a clear distinction between ENE and MNE/LNE/Bronze Age mt genomes.
Our genetic distance data also indicate minimal local genetic continuity between the ENE and the MNE/LNE in Central Europe (Fig. 1; Table 1), again suggesting that ENE lineages were largely superseded during the MNE/LNE (~4100-2200 BC) in a previously unrecognised major genetic transition. This pronounced genetic changeover between ENE and MNE/LNE cultures is comparable to the other major genetic transition known thus far, revealed by ancient DNA and coalescent simulations (between indigenous European hunter-gatherers and incoming early farmers from the Near East during the initial Meso-Neolithic transition from ~7500 BC in Central Europe)6,7. When compared with hg H diversity in present-day Central Europe14,15,18,36, the network in Fig. 1 suggests that much of the present-day diversity can be attributed to the incorporation of new lineages in the MNE/LNE and emerging Bronze Age (from 2200 BC). The LNE in particular is known to have been a period of profound cultural and economic change37, with newly emerging pan-European cultures such as the Bell Beaker phenomenon in Western Europe and the Corded Ware culture in north-eastern Europe. It therefore seems likely that these pan-European cultures were associated with the introduction of lineages from outside Central Europe. Fortunately, the ranges of both these groups overlapped in the Mittelelbe-Saale sample area (Supplementary Methods), allowing this possibility to be further investigated.
Our data on genetic affinities (PCA, Procrustes and Ward clustering) revealed that Mittelelbe-Saale's earliest farmers (LBK; n = 9) cluster with present-day Caucasus, Near Eastern, and Anatolian populations. These findings are consistent with a highly detailed archaeological record tracing the temporal and spatial spread of agriculture into Central Europe, beginning initially in Anatolia and the Near East, where farming originated ~12,000 years ago7.
Our observation that individuals from the successor series of regional post-LBK and MNE cultures (Rössen, Schöningen, Baalberge and Salzmünde) cluster with present-day Central European populations could be explained by a loss of lineages from the ENE LBK period during a short phase of population decline in the centuries after 5000 BC (as proposed in some archaeological models)38. However, our results suggest that mtDNA H sub-hg diversity established during the MNE is still present in Central European populations today. This is consistent with independent archaeological evidence of a phase of more localised cultural development during the MNE period, potentially involving influences from contemporaneous MNE cultures outside Mittelelbe-Saale, which (perhaps in concert with LBK population decline) could have resulted in a replacement of most ENE H sub-hgs. Together, the genetic and archaeological evidence highlight the complexities of both the formative and consolidation phases in Central Europe.
From around 2800 BC, the LNE Bell Beaker culture emerged from the Iberian Peninsula to form one of the first pan-European archaeological complexes. This cultural phenomenon is recognised by a distinctive package of rich grave goods including the eponymous bell-shaped ceramic beakers. The genetic affinities between Central Europe's Bell Beakers and present-day Iberian populations (Fig. 2) are striking and throw fresh light on long-disputed archaeological models3. We suggest these data indicate a considerable genetic influx from the West during the LNE. These far-Western genetic affinities of Mittelelbe-Saale's Bell Beaker folk may also have intriguing linguistic implications, as the archaeologically identified eastward movement of the Bell Beaker culture has recently been linked to the initial spread of the Celtic language family across Western Europe39. This hypothesis suggests that early members of the Celtic language family (for example, Tartessian)40 initially developed from Indo-European precursors in Iberia and subsequently spread throughout the Atlantic Zone, before a period of rapid mobility, reflected by the Beaker phenomenon, carried Celtic languages across much of Western Europe. This idea not only challenges traditional views of a linguistic spread of Celtic westwards from Central Europe during the Iron Age, but also implies that Indo-European languages arrived in Western Europe substantially earlier, presumably with the arrival of farming from the Near East41.
Other LNE population movements appear to have added further genetic complexity, as exemplified by the CWC (2800–2200 BC), which preceded the Bell Beaker culture in Mittelelbe-Saale and has archaeological associations with North-Eastern Europe. A genetic affinity to eastern populations is consistent with two distinct CWC mt genomes (H1_TBD and H6a1a) not identified in either their contemporaneous Bell Beaker neighbours or in preceding Central European cultures. The subsequent Early Bronze Age Unetice culture, associated with emerging metallurgy and increasingly stratified societies37,42, marks a consolidation of social and cultural systems in Mittelelbe-Saale that were established during the LNE by the two pan-European Bell Beaker and Corded Ware cultures. The Unetice culture appears contemporaneously with the last Neolithic horizon (~2200 BC) in areas where elements of both the Bell Beaker and Corded Ware cultures are present, sometimes overlapping at the same sites. It is therefore not surprising that individuals ascribed to the newly emerging Unetice culture carry mt genomes with both Western (sub-hgs H3 and H4) and Eastern (sub-hgs H2a, H7 and H11) associations.
The demographic reconstruction, which is based on direct calibration points, has major implications for understanding post-glacial human history in Europe. Our new estimate is incompatible with traditional views that the majority of present-day hg H lineages were carried into Central, Northern and Eastern Europe via a post-glacial human population expansion before the Holocene (12 kya)13. Our data complement a recent study, based on present-day mt genomes, which describes a pronounced population increase at ~7000 BC (interpreted as a Neolithic expansion into Europe), followed by slow population growth until the present day26. By including ancient DNA data from across the critical time points in question, our skyride plot corrects for missing temporal data and suggests substantial growth of hg H from the beginning of the Neolithic and continuing throughout the entire Neolithic period. This emphasizes the role of farming practices and cultural developments in the demographic expansions inferred in subsequent time periods, which have not yet been explored genetically.
Although an expansion of hg H could in principle be compatible with a post-glacial resettling of Northern and Central Europe from southwestern refugia12,16 (as indicated by our population skyride and PCA plots), we instead propose that the rise of hg H to become the predominant mtDNA branch in Europe was mediated by subsequent demographic events during the Neolithic, as shown by a general increase in hg H frequency and strong population growth during this period (Fig. 1b).
Support for this position comes from data suggesting that hg H was virtually absent among Central and Northern European hunter-gatherers6,43 and formed only 19% in LBK individuals, most likely introduced from Southeast Europe and/or the Near East7. In our updated data set from Mittelelbe-Saale, hg H appears to have been established by the LBK period and increased in frequency after 4000 BC (Supplementary Fig. S3). Interestingly, MNE/LNE cultures with cultural associations to the North and Northeast, such as the Bernburg and CWCs, show reduced hg H frequencies, and hg H only moved northwards into southern Scandinavia during the Neolithisation of Northern Europe around the Middle Neolithic, as exemplified by individuals from the Funnel Beaker Culture11,43. However, hg H appears to have been generally more frequent in prehistoric Western Europe: at 20% from a Middle Neolithic (3030–2890 cal BC) site in France9; at ~25% from Iberian (Epi-)Cardial Neolithic samples8,25; at 36% from a Neolithic site in Catalonia44; and at 44% from Neolithic sites from the Basque Country and Navarre45. Importantly, a recent study on Iberian hunter-gatherers revealed the presence of hg H there in Mesolithic times45. In Mittelelbe-Saale, the Bell Beaker samples signpost a significant increase in hg H frequency (the 95% confidence intervals do not overlap with earlier LBK and Schöningen Neolithic cultures; Supplementary Fig. S3). In conclusion, the Western European Neolithic and the widespread pan-European Bell Beaker phenomenon appear to be important factors in driving the spread of H sub-hgs throughout large parts of Western Europe. In particular, high proportions of sub-hgs H1 and H3 seem to have made substantial contributions to the hg H diversity that exists in Western and Central Europe today16.
Having reached significant levels, and assuming a generally higher rate of population growth in southern and western Europe in post-Neolithic times27, these Neolithic processes appear to have been the major factor in hg H becoming the predominant European mtDNA hg.
Overall, our results suggest that the broad foundations of the Central European mtDNA pool, here approximated via hg H, were formed during the Neolithic rather than the post-glacial period. ENE hg H mt lineages brought in from the Near East by Central Europe's first farmers do not appear to have contributed significantly to present-day Central Europe's hg H diversity, instead being largely superseded during the MNE and LNE (with the process starting around 4000 BC), after which there appears to have been substantial genetic continuity to the present day in Central Europe. These developments have been revealed by comparative full mt genome sequencing and would have remained obscure using standard HVS I data.
In conclusion, demographic changes across the MNE, followed by the widespread Bell Beaker cultural phenomenon, are likely to have been the key factors in the expansion of hg H across Western Europe and the eventual rise of hg H to become the predominant mtDNA hg. However, LNE Corded Ware and Early Bronze Age data suggest a complex series of additional genetic contributions, which require further investigation.
Methods
Ancient DNA analyses. DNA was extracted from two independent samples for each individual (Supplementary Methods). HVS I was amplified using a minimum of four short overlapping primer pairs, following established protocols and authentication criteria as described previously7,46. Multiplex SNP typing of 22 hg informative SNPs (GenoCoRe22) was carried out using a SNaPshot-based protocol as described previously7.
Ancient DNA library preparation. Ancient DNA extract polishing, phosphorylation, adaptor ligation and polymerase fill-in reactions were used sequentially to create fully double-stranded adaptor-tagged aDNA libraries (Fig. 3). Following every step, DNA was purified using MinElute spin columns (Qiagen) as per the manufacturer's instructions. PCR amplification reactions were then performed to create primary DNA libraries, ready for DNA-capture hybridisation steps, and amplification products were sized and quantified (Supplementary Methods).
[Figure 3 diagram. Panels: Preparation of probes (long-range PCR; sonication/fragmentation; biotinylation; mtDNA baits/probes). aDNA library preparation (aDNA extract; end polish/repair; phosphorylation; adapter ligation/fill-in; DNA library/immortalisation). Target enrichment and complexity reduction (1. immortalised aDNA library; 2. hybridisation; 3. capture, wash and elution; 4. strand displacement/fill-in; 5. enrichment via low-cycle PCR; QC; sequencing platform; repeat hybridisation/enrichment cycle).]
Figure 3 | Schematic representation of experimental steps. (a) Probe DNA was prepared by amplifying a complete mitochondrial genome in two overlapping fragments by long-range PCR, followed by DNA fragmentation and biotinylation to form mtDNA baits for targeted hybridisation. (b) Ancient DNA was enzymatically blunt-ended and phosphorylated, ligated to custom library adaptors, followed by polymerase fill-in to create immortalised double-stranded DNA libraries. (c) Hybridisation-based DNA-capture using biotinylated probe bound to Streptavidin magnetic beads; following stringency washes, captured library constructs enriched in mtDNA sequences are eluted from the beads/probe via a novel polymerase strand-displacement reaction followed by PCR library reamplification. These steps can be carried out iteratively to maximise mtDNA content in enriched libraries (see Supplementary Methods for full details).
Hybridisation-based enrichment of human mtDNA. The basic conceptual design for the hybridisation of tracer DNA (aDNA library) to biotinylated driver DNA sequences (human mt probe) was previously described47 and the overall scheme is outlined in Fig. 3. The two library-specific PCR primers were included as part of the hybridisation mix as blocking oligonucleotides to minimise unwanted hybridisation between the adaptor-tagged flanking regions of otherwise unrelated single-stranded library DNA molecules48. A key innovation of this methodology was the use of a DNA polymerase with strand-displacing activity after post-hybridisation stringency washes. This allowed primer extension from the bound library (blocking) primers to disrupt the double-stranded region of stable hybridisation between human mt probe DNA sequences and single-stranded library DNA molecules that had inserts with complementary sequences. These mtDNA-enriched library DNA molecules captured in the hybridisation step could thereby be cleanly separated from biotinylated probe molecules, which remained bound to magnetic Streptavidin beads. PCR reamplification reactions from the mt-enriched library DNA molecules comprised the first enrichment DNA libraries. In general, we used three cycles of hybridisation/enrichment/reamplification to produce DNA libraries highly enriched for short endogenous mtDNA sequence fragments ready for genotyping (Supplementary Methods).
Affymetrix Mitochip v2.0 array typing and Pacific Biosciences SMRT sequencing. MtDNA-enriched libraries underwent biotin labelling using terminal deoxynucleotidyl transferase (TdT) as per the Affymetrix GeneChip Whole-Transcript Sense Target Labelling Assay Manual (P/N 701880, rev. 4). Biotin-labelled DNA libraries were hybridised to Affymetrix GeneChip Human mt Resequencing 2.0 Arrays for 17 h at 49 °C. Arrays were washed, stained and scanned as per the GeneChip CustomSeq Resequencing Array Protocol (P/N 701231, rev. 5). Affymetrix GeneChip Command Console software (v3.2) was used to generate CEL files, which were then analysed using GeneChip Sequence Analysis Software (GSEQ v4.1, Affymetrix) and validated using the software Geneious49 (Supplementary Figs S1 and S2, Supplementary Tables S2 and S3). Six of the mt-enriched DNA libraries were also converted to SMRTbell template libraries for sequencing on a Pacific Biosciences RS platform (Supplementary Methods).
Network analyses. A median joining network of all ancient hg H mt genomes (Fig. 1a) was constructed manually using the most up-to-date version of the mt phylogenetic tree (PhyloTree.org, mtDNA tree Build 14) as a scaffold on which to place the observed hg H lineages19,20. This version included a revised version of the hg H sub-tree comprising 1203 sequences in total. As per convention, insertions at np 309.1C(C), 315.1C, 523-524d (aka 522-523d), 16182C, 16183C, 16193.1C(C) and mutation 16519 were not considered for phylogenetic reconstruction20.
Procrustes-based PCA and Ward clustering. PCA was used to describe and visualise the maternal genetic relationships among the Neolithic cultures investigated, as well as their relationships to 37 present-day European and Near Eastern populations (Fig. 2a). PCA was performed on the frequency of H sub-hgs taken from the literature (Supplementary Table S8). To minimise statistical noise caused by rare sub-hgs and to allow for data compatibility across published studies, we considered only the following 15 most common H sub-hgs in Europe and the Near East: H*, H1, H1a, H1b, H2, H2a1, H3, H4, H5, H5a, H6, H6a, H7, H8 and H11. PCAs were performed and visualised in R version 2.11.1 (ref. 50) using a customised script based on the function prcomp.
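As a rough illustration of the PCA step described above, the following Python sketch builds frequency rows over the 15 sub-hgs, centres the columns without scaling (which, as far as we are aware, matches the default behaviour of prcomp) and extracts the first principal axis by power iteration. It is a stand-in for the customised R script, which is not reproduced in this article; the population names and counts are hypothetical and are not data from the study.

```python
import math

# The 15 sub-haplogroups named in the text, used as a fixed column order.
SUB_HGS = ["H*", "H1", "H1a", "H1b", "H2", "H2a1", "H3", "H4",
           "H5", "H5a", "H6", "H6a", "H7", "H8", "H11"]

def to_frequencies(counts):
    """Turn {sub-hg: count} for one population into a frequency row."""
    total = sum(counts.get(hg, 0) for hg in SUB_HGS)
    return [counts.get(hg, 0) / total for hg in SUB_HGS]

def centre_columns(rows):
    """Subtract each column mean (centring without scaling)."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def first_axis(centred, iterations=200):
    """Leading eigenvector of the covariance matrix by power iteration."""
    n, p = len(centred), len(centred[0])
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Non-uniform start vector: a uniform one would be annihilated,
    # because centred frequency rows always sum to zero.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def scores(centred, axis):
    """Project each centred row onto the given axis."""
    return [sum(x * a for x, a in zip(row, axis)) for row in centred]

# Hypothetical counts for illustration only, NOT data from the study.
populations = {
    "pop_A": {"H*": 1, "H1": 6, "H3": 3},
    "pop_B": {"H*": 1, "H1": 5, "H3": 4},
    "pop_C": {"H*": 3, "H2": 3, "H5": 4},
}
rows = [to_frequencies(c) for c in populations.values()]
centred = centre_columns(rows)
pc1 = first_axis(centred)
pc1_scores = dict(zip(populations, scores(centred, pc1)))
```

In this toy input, pop_A and pop_B share similar sub-hg profiles and so fall on the same side of the first axis, opposite pop_C. The sign of a principal axis is arbitrary, so only relative positions are meaningful.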
Ancient hg H individuals were pooled into three different groups based on the numbers of samples available: two for pan-European archaeological phenomena/cultures alongside hypothesised geographic origins (LBK, n = 9 and BBC, n = 7); and a temporally transitional group pooling regional (mostly MNE) cultures (MNE, n = 10). Small sample sets such as the Corded Ware (n = 2) and later Bronze Age Unetice (n = 5) were excluded. To test whether the clustering pattern observed in the PCA was significantly supported, we performed a number of statistical tests including Ward clustering, Procrustes analysis and NP-MANOVA (as described below). First, we performed model-based cluster tests to identify the number of clusters via the model with the best support (highest Bayesian Information Criterion) followed by Ward hierarchical clustering of sub-hg H frequencies using the packages mclust, pvclust (for bootstrap values) and hclust in R, respectively. Procrustes analysis was also performed in R using the package vegan based on PCA scores and geographic coordinates (Supplementary Table S8) and the function protest to calculate the similarity score (100,000 permutations).
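The logic of a Procrustes comparison with a permutation test can be sketched as follows. This is a simplified Python stand-in for the vegan analysis, not the code used in the study, and it assumes two-dimensional configurations only (for example PC1/PC2 scores against longitude/latitude). Both configurations are centred and scaled to unit sum of squares, the best rotation or reflection is found in closed form using complex numbers, and the resulting correlation-like statistic is compared with values obtained after shuffling the rows of one configuration. The coordinates below are hypothetical, and 999 permutations are used for speed rather than the 100,000 reported above.

```python
import math
import random

def _standardise(points):
    """Centre (x, y) pairs and scale them to unit sum of squares."""
    zs = [complex(x, y) for x, y in points]
    mean = sum(zs) / len(zs)
    zs = [z - mean for z in zs]
    norm = math.sqrt(sum(abs(z) ** 2 for z in zs))
    return [z / norm for z in zs]

def procrustes_correlation(config_a, config_b):
    """Best achievable fit (0..1) over rotations and reflections in 2D."""
    a, b = _standardise(config_a), _standardise(config_b)
    rotation = abs(sum(x.conjugate() * y for x, y in zip(a, b)))
    reflection = abs(sum(x * y for x, y in zip(a, b)))
    return max(rotation, reflection)

def permutation_p_value(config_a, config_b, permutations=999, seed=1):
    """Shuffle row labels of config_b and count fits at least as good."""
    rng = random.Random(seed)
    observed = procrustes_correlation(config_a, config_b)
    shuffled = list(config_b)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if procrustes_correlation(config_a, shuffled) >= observed - 1e-12:
            hits += 1
    # One common convention: count the observed arrangement itself.
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical (longitude, latitude) pairs, NOT the study's coordinates.
geo = [(-3.7, 40.4), (2.3, 48.9), (11.6, 51.5),
       (28.0, 41.0), (44.8, 41.7), (35.2, 31.8)]
# A rotated, rescaled and shifted copy, standing in for PCA scores.
pca_like = [(0.01 * lat + 0.5, -0.01 * lon - 0.2) for lon, lat in geo]

r_obs, p_val = permutation_p_value(geo, pca_like)
```

Because the second configuration here is an exact rotated and rescaled copy of the first, the statistic is 1 up to rounding error; real PCA scores would give a lower value.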
Summary statistics. Population pairwise FST, Slatkin's linearised FST and haplotype diversity were calculated in Arlequin version 3.5 (ref. 51). We used jModelTest 0.1.1 (ref. 52) to find the best-fitting evolutionary model and, if required, to estimate a discrete gamma shape parameter for our 39 non-partitioned mt genomes. Based on the resulting scores for each model (AIC and Bayesian Information Criterion), we subsequently used the Tamura and Nei model and a gamma value of 0.049 for our calculations of population distances in Arlequin. The ancient hg H individuals were pooled into different temporal/cultural groups in order to calculate genetic diversity indices and to test for genetic differentiation (Table 2, Supplementary Table S7).
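Two of the summary statistics named above have simple closed forms, sketched here in Python under the standard textbook definitions: Nei's haplotype diversity, H = n/(n − 1) × (1 − Σ p_i²), and Slatkin's linearised FST, FST/(1 − FST). The calculations in the study were run in Arlequin; the function names and the example haplotype labels below are illustrative assumptions only.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

def slatkin_linearised_fst(fst):
    """Slatkin's linearisation: FST / (1 - FST)."""
    if fst >= 1:
        raise ValueError("FST must be below 1")
    return fst / (1 - fst)

# Hypothetical sample of four mt genomes, two of which share a haplotype.
example_h = haplotype_diversity(["hapA", "hapA", "hapB", "hapC"])
example_d = slatkin_linearised_fst(0.2)
```

With four sampled genomes of which two are identical, the diversity is 5/6; a sample in which every genome is distinct gives 1.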
Multivariate analysis of variance. We performed a NP-MANOVA to test whether the temporal grouping of ancient individuals according to archaeological time periods is statistically supported. The NP-MANOVA was performed on a Raup-Crick distance matrix, which was produced from the presence/absence of the 15 hg H sub-hgs used in the PCA. Calculations were performed in PAST version 2.09 with 10,000 permutations per test and post hoc Bonferroni correction to account for multiple comparisons and small sample sizes53. We also tested whether the clustering pattern between the ancient and present-day populations observed in the PCA was significantly supported.
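The permutation logic behind a distance-based NP-MANOVA can be sketched in Python as below. This is an illustration under stated assumptions, not the PAST implementation: it takes an already computed distance matrix (the Raup-Crick step is not reproduced), uses the usual pseudo-F ratio of among-group to within-group sums of squared distances, and applies a plain Bonferroni correction by multiplying each p-value by the number of comparisons. The positions and group labels are hypothetical.

```python
import random
from itertools import combinations

def pseudo_f(dist, labels):
    """Pseudo-F from a square distance matrix and one label per row."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2
                   for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def np_manova(dist, labels, permutations=999, seed=1):
    """Permutation p-value: shuffle group labels, recompute pseudo-F."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical one-dimensional positions forming two separated groups.
positions = [0, 1, 2, 3, 4, 20, 21, 22, 23, 24]
labels = ["early"] * 5 + ["late"] * 5
dist = [[abs(x - y) for y in positions] for x in positions]

f_stat, p_val = np_manova(dist, labels)
adjusted = bonferroni([0.01, 0.04, 0.5])
```

The function requires some within-group variation; if every within-group distance were zero, the pseudo-F denominator would vanish.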
Bayesian skyride analyses and mutation rate calculation. The data set comprised 37 newly sequenced, non-related, ancient mt genomes, five sets of randomly chosen, distinct, present-day hg H mt genomes from Phylotree (http://www.phylotree.org, mtDNA tree Build 12 (20th July 2011)) and 420 newly available hg H sequences17. The sequences were manually aligned to the revised Cambridge Reference Sequence (rCRS: AC_000021)54 using the program SeaView55. The alignment was partitioned into four subsets, representing the D-loop, the protein-coding regions (1st + 2nd codon positions and 3rd codon position) and a concatenation of tRNA and rRNA genes. Insertions at nps 309.1C(C), 315.1C, 523-524d (aka 522-523d), 16182C, 16183C, 16193.1C(C) were not considered for phylogenetic reconstruction and position 16519 was removed from the D-loop subset20. The best substitution models were selected using ModelGenerator 0.85 (ref. 56) by comparison of Bayesian Information Criterion scores: HKY + G for D-loop, TN + G for protein-coding regions and HKY for RNA genes. Considering the short evolutionary timescale being studied (intra-hg), models including a proportion of invariant sites were excluded. A Bayesian skyride analysis57 was performed using the phylogenetic software BEAST 1.6.1 (ref. 58) and calibrated using radiocarbon dates from 18 of the ancient individuals and mean archaeological dates for the remaining individuals. This allowed us to achieve a broad temporal coverage for ~3500 years of the Neolithic period in Central Europe (5500–2000 BC) and to generate the most precise demographic reconstruction of hg H. Results were replicated using independent sets of 100 (1×), 200 (3×) and 300 (1×) mt genomes. A strict molecular clock was used, allowing for a distinct rate in each subset of the alignment. Additional analysis using an uncorrelated log-normal relaxed clock to account for potential rate variations could not reject the strict clock assumption. Convergence was checked by sampling from two independent Markov chains. Each MCMC analysis was run for 100,000,000 steps and samples from the two chains were combined, after discarding the first 10% of samples as burn-in. All parameters showed sufficient sampling, indicated by effective sample sizes above 200. Tracer 1.5 was used to produce the skyride plot (Fig. 1b)59.
We carried out a date randomisation test to assess whether the signal from the radiocarbon dates associated with the ancient sequences was sufficient to calibrate the hg H phylogeny60. This test randomises all dates associated with the sequences (including present-day ones) and replicates the phylogenetic analysis as described above. If the structure and spread of the ancient sequences in the tree were sufficient to calibrate the analysis, the inferred mean rate of the randomised analysis should be significantly different from the rate calculated using the correct date/sequence association. In other words, the 95% HPD of the randomised analysis should not overlap with the mean rate estimated without randomisation. The comparison of estimated rates from the main analysis and from 10 replicates with randomised dates presented in Supplementary Fig. S4 confirms the presence of sufficient signal to calibrate the tree provided by dates from the 37 ancient samples.
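The decision rule of the date randomisation test can be written down compactly. The Python sketch below is illustrative only: the phylogenetic analysis itself is not reproduced, the helper names are assumptions, and the rates and HPD intervals are hypothetical values rather than results from the study. One helper shuffles the sampling dates among sequences; the other checks that the mean rate from the correctly dated analysis falls outside the 95% HPD of every randomised replicate.

```python
import random

def randomise_dates(sample_dates, seed=None):
    """Return {sequence_id: date} with the same dates shuffled among ids."""
    rng = random.Random(seed)
    ids = list(sample_dates)
    dates = [sample_dates[i] for i in ids]
    rng.shuffle(dates)
    return dict(zip(ids, dates))

def passes_date_randomisation(real_mean_rate, randomised_hpds):
    """True if the real mean rate lies outside every randomised 95% HPD.

    randomised_hpds is a list of (lower, upper) bounds, one per replicate.
    """
    return all(not (lower <= real_mean_rate <= upper)
               for lower, upper in randomised_hpds)

# Hypothetical sampling dates (years before present); 0 = present-day.
dates = {"seq1": 7400, "seq2": 6000, "seq3": 4500, "seq4": 0, "seq5": 0}
shuffled = randomise_dates(dates, seed=42)

# Hypothetical rate (substitutions per site per year) and HPD intervals.
real_rate = 2.0e-8
replicate_hpds = [(1.0e-9, 9.0e-9), (2.0e-9, 1.1e-8), (5.0e-10, 7.0e-9)]
signal_ok = passes_date_randomisation(real_rate, replicate_hpds)
```

Each shuffled date set would then be passed through the same phylogenetic analysis as the real data to obtain one replicate HPD interval.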
References
1. Whittle, A. W. R. & Cummings, V. Going Over: The Mesolithic-Neolithic Transition in North-West Europe 632 (Oxford University Press, Oxford, 2007).
2. Sherratt, A. Plough and pastoralism: aspects of the secondary products revolution. In Patterns of the Past: Studies in Honour of David Clarke (eds Hodder, I., Isaac, G. & Hammond, N.) 261–305 (Cambridge University Press, Cambridge, 1981).
3. Bogucki, P. I. & Crabtree, P. J. Ancient Europe 8000 B.C.-A.D. 1000: Encyclopedia of the Barbarian World 1221 (Charles Scribner's Sons, 2004).
4. Soares, P. et al. The archaeogenetics of Europe. Curr. Biol. 20, R174–R183 (2010).
5. Pinhasi, R., Thomas, M. G., Hofreiter, M., Currat, M. & Burger, J. The genetic history of Europeans. Trends Genet. 28, 496–505 (2012).
6. Bramanti, B. et al. Genetic discontinuity between local hunter-gatherers and central Europe's first farmers. Science 326, 137–140 (2009).
7. Haak, W. et al. Ancient DNA from European Early Neolithic farmers reveals their Near Eastern affinities. PLoS Biol. 8, e1000536 (2010).
8. Gamba, C. et al. Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers. Mol. Ecol. 21, 45–56 (2012).
9. Lacan, M. et al. Ancient DNA reveals male diffusion through the Neolithic Mediterranean route. Proc. Natl Acad. Sci. USA 108, 9788–9791 (2011).
10. Richards, M. et al. Tracing European founder lineages in the Near Eastern mtDNA pool. Am. J. Hum. Genet. 67, 1251–1276 (2000).
11. Skoglund, P. et al. Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science 336, 466–469 (2012).
12. Pereira, L. et al. High-resolution mtDNA evidence for the late-glacial resettlement of Europe from an Iberian refugium. Genome Res. 15, 19–24 (2005).
13. Torroni, A., Achilli, A., Macaulay, V., Richards, M. & Bandelt, H. J. Harvesting the fruit of the human mtDNA tree. Trends Genet. 22, 339–345 (2006).
14. Roostalu, U. et al. Origin and expansion of haplogroup H, the dominant human mitochondrial DNA lineage in West Eurasia: the Near Eastern and Caucasian perspective. Mol. Biol. Evol. 24, 436–448 (2007).
15. Loogväli, E. L. et al. Disuniting uniformity: a pied cladistic canvas of mtDNA haplogroup H in Eurasia. Mol. Biol. Evol. 21, 2012–2021 (2004).
16. Achilli, A. et al. The molecular dissection of mtDNA haplogroup H confirms that the Franco-Cantabrian glacial refuge was a major source for the European gene pool. Am. J. Hum. Genet. 75, 910–918 (2004).
17. Behar, D. M. et al. The Basque paradigm: genetic evidence of a maternal continuity in the Franco-Cantabrian region since pre-Neolithic times. Am. J. Hum. Genet. 90, 486–493 (2012).
18. Alvarez-Iglesias, V. et al. New population and phylogenetic features of the internal variation within mitochondrial DNA macro-haplogroup R0. PLoS ONE 4, e5112 (2009).
19. Behar, D. M. et al. A "Copernican" reassessment of the human mitochondrial DNA tree from its root. Am. J. Hum. Genet. 90, 675–684 (2012).
20. van Oven, M. & Kayser, M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 30, E386–E394 (2009).
21. Hartmann, A. et al. Validation of microarray-based resequencing of 93 worldwide mitochondrial genomes. Hum. Mutat. 30, 115–122 (2009).
22. Maricic, T., Whitten, M. & Pääbo, S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).
23. Margulies, M. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005).
24. Korlach, J. et al. Real-time DNA sequencing from single polymerase molecules. Methods Enzymol. 472, 431–455 (2010).
25. Lacan, M. et al. Ancient DNA suggests the leading role played by men in the Neolithic dissemination. Proc. Natl Acad. Sci. USA 108, 18255–18259 (2011).
26. Fu, Q., Rudan, P., Pääbo, S. & Krause, J. Complete mitochondrial genomes reveal Neolithic expansion into Europe. PLoS ONE 7, e32473 (2012).
27. Livi-Bacci, M. A Concise History of World Population 279 (Blackwell Publishing, Malden, Oxford, Carlton, 2007).
28. Zimmermann, A., Hilpert, J. & Wendt, K. P. Estimations of population density for selected periods between the Neolithic and AD 1800. Hum. Biol. 81, 357–380 (2009).
29. Ho, S. Y., Shapiro, B., Phillips, M. J., Cooper, A. & Drummond, A. J. Evidence for time dependency of molecular rate estimates. Syst. Biol. 56, 515–522 (2007).
30. Soares, P. et al. Correcting for purifying selection: an improved human mitochondrial molecular clock. Am. J. Hum. Genet. 84, 740–759 (2009).
31. Mishmar, D. et al. Natural selection shaped regional mtDNA variation in humans. Proc. Natl Acad. Sci. USA 100, 171–176 (2003).
32. Keller, A. et al. New insights into the Tyrolean Iceman's origin and phenotype as inferred by whole-genome sequencing. Nat. Commun. 3, 698 (2012).
33. Sánchez-Quinto, F. et al. Genomic affinities of two 7,000-year-old Iberian hunter-gatherers. Curr. Biol. 22, 1494–1499 (2012).
34. Behar, D. M. et al. The Basque paradigm: genetic evidence of a maternal continuity in the Franco-Cantabrian region since pre-Neolithic times. Am. J. Hum. Genet. 90, 486–493 (2012).
35. Gunnarsdottir, E. D. et al. Larger mitochondrial DNA than Y-chromosome differences between matrilocal and patrilocal groups from Sumatra. Nat. Commun. 2, 228 (2011).
36. Brandstätter, A. et al. Dissection of mitochondrial superhaplogroup H using coding region SNPs. Electrophoresis 27, 2541–2550 (2006).
37. Heyd, V. Families, prestige goods, warriors & complex societies: Beaker groups of the 3rd millennium cal BC along the upper & middle Danube. Proc. Prehist. Soc. 73, 327–379 (2007).
38. Shennan, S. & Edinborough, K. Prehistoric population history: from the late glacial to the late Neolithic in central and northern Europe. J. Archaeol. Sci. 34, 1339–1345 (2007).
39. Cunliffe, B. & Koch, J. T. (eds) Celtic from the West: Alternative Perspectives from Archaeology, Genetics, Language and Literature 384 (Oxbow Books, Oxford, 2010).
40. Koch, J. T. Tartessian. Celtic in the South-west at the Dawn of History (Aberystwyth, 2009).
41. Renfrew, C. Archaeology and Language: The Puzzle of Indo-European Origins 346 (Cape, London, 1988).
42. Nowak, M. Transformations in East-Central Europe from 6000 to 3000 BC: local vs. foreign patterns. Documenta Praehistorica XXXIII, 143–158 (2006).
43. Malmström, H. et al. Ancient DNA reveals lack of continuity between Neolithic hunter-gatherers and contemporary Scandinavians. Curr. Biol. 19, 1758–1762 (2009).
44. Sampietro, M. L. et al. Palaeogenetic evidence supports a dual model of Neolithic spreading into Europe. Proc. Biol. Sci. 274, 2161–2167 (2007).
45. Hervella, M. et al. Ancient DNA from hunter-gatherer and farmer groups from Northern Spain supports a random dispersion model for the Neolithic expansion into Europe. PLoS ONE 7, e34417 (2012).
46. Haak, W. et al. Ancient DNA from the first European farmers in 7500-year-old Neolithic sites. Science 310, 1016–1018 (2005).
47. Patel, M. & Sive, H. PCR-based subtractive cDNA cloning. Curr. Protocols Mol. Biol. Chapter 25, Unit 25B, 2 (2001).
48. Tao, S. C., Gao, H. F., Cao, F., Ma, X. M. & Cheng, J. Blocking oligo: a novel approach for improving chip-based DNA hybridization efficiency. Mol. Cell. Probes 17, 197–202 (2003).
49. Drummond, A. J. et al. Geneious v5.4, Available from http://www.geneious.com/ (2011).
50. R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2010).
51. Excoffier, L. & Lischer, H. E. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resources 10, 564–567 (2010).
52. Posada, D. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25, 1253–1256 (2008).
53. Hammer, Ø., Harper, D. A. T. & Ryan, P. D. PAST: paleontological statistics software package for education and data analysis. Palaeontologia Electronica 4 (2001).
54. Andrews, R. M. et al. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat. Genet. 23, 147 (1999).
55. Gouy, M., Guindon, S. & Gascuel, O. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol. Biol. Evol. 27, 221–224 (2010).
56. Keane, T. M., Creevey, C. J., Pentony, M. M., Naughton, T. J. & McInerney, J. O. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol. Biol. 6, 29 (2006).
57. Minin, V. N., Bloomquist, E. W. & Suchard, M. A. Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics. Mol. Biol. Evol. 25, 1459–1471 (2008).
58. Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).
59. Rambaut, A. & Drummond, A. Tracer v1.4, Available from http://beast.bio.ed.ac.uk/Tracer (2007).
60. Ho, S. Y. et al. Bayesian estimation of substitution rates from ancient DNA sequences with low information content. Syst. Biol. 60, 366–375 (2011).
Acknowledgements
We are indebted to Matt Kaplan and Ryan Spriggs at Arizona Research Laboratories, Division of Biotechnology, University of Arizona Genetics Core Facility (http://uagc.arl.arizona.edu/), Tyson Clark, Michael Brown, Kristi Spittle and Matthew Boitano (Pacific Biosciences) for sequencing work, Jeremy Timmis for help with DNA sonication protocols, and Robin Skeates and Hubert Steiner for additional samples and contextual information. We thank the Australian Research Council (grant LP0882622), the Deutsche Forschungsgemeinschaft (Al 287/7-1 and Me 3245/1-1) and National Geographic's Genographic Project for funding. M.v.O. was supported in part by the Netherlands Forensic Institute (NFI) and a grant from the Netherlands Genomics Initiative (NGI)/Netherlands Organization for Scientific Research (NWO) within the framework of the Forensic Genomics Consortium Netherlands (FGCN).
Author contributions
P.B., W.H. and A.C. conceived and designed the project. P.B. designed and developed the DNA extraction, DNA library construction and hybridisation-based DNA-capture protocols (with assistance from J.T.). P.B., J.T. and W.H. generated and analysed the data. S.M.R., C.D., R.K. and M.B.v.d.H. contributed experimental steps and C.J.A., J.S., S.Y.W.H., J.K. and K.L. contributed analytical steps. G.B., R.G., S.F., V.D., M.v.O., L.Q., D.M.B., H.M. and K.W.A. provided ancient samples, contextual information, radiocarbon dating and access to critical population data. P.B., W.H. and A.C. wrote the manuscript with input from C.J.A., J.S., S.Y.W.H., S.M.R., J.K. and members of the Genographic Consortium. All authors discussed the paper and gave comments.
Additional information
Accession codes: The complete consensus mt genome sequences have been deposited to NCBI GenBank under accession numbers KC553980 to KC554018.
Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications
Competing financial interests: The authors claim no competing financial interests associated with this paper.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
How to cite this article: Brotherton, P. et al. Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans. Nat. Commun. 4:1764 doi: 10.1038/ncomms2656 (2013).
Syama Adhikarla11, Arun Kumar Ganesh Prasad11, Ramasamy Pitchappan11, Arun Varatharajan Santhakumari11, Elena Balanovska12, Oleg Balanovsky12, Jaume Bertranpetit13, David Comas13, Begoña Martínez-Cruz13, Marta Melé13, Andrew C. Clarke14, Elizabeth A. Matisoo-Smith14, Matthew C. Dulik15, Jill B. Gaieski15, Amanda C. Owings15, Theodore G. Schurr15, Miguel G. Vilar15, Angela Hobbs16, Himla Soodyall16, Asif Javed17, Laxmi Parida17, Daniel E. Platt17, Ajay K. Royyuru17, Li Jin18, Shilin Li18, Matthew E. Kaplan19, Nirav C. Merchant19, R. John Mitchell20, Colin Renfrew21, Daniela R. Lacerda22, Fabrício R. Santos22, David F. Soria Hernanz23, R. Spencer Wells23, Pandikumar Swamikrishnan24, Chris Tyler-Smith25, Pedro Paulo Vieira26 & Janet S. Ziegle27
11 The Genographic Laboratory, School of Biological Sciences, Madurai Kamaraj University, Madurai 625 021, Tamil Nadu, India. 12 Research Centre for Medical Genetics, Russian Academy of Medical Sciences, 115478 Moscow, Russia. 13 Institut de Biologia Evolutiva (CSIC-UPF), Departament de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, 08003 Barcelona, Spain. 14 Department of Anatomy, University of Otago, Dunedin 9054, New Zealand.
15 Department of Anthropology, University of Pennsylvania, Philadelphia, Pennsylvania, 19104-6398, USA. 16 National Health Laboratory Service, Sandringham 2131, Johannesburg, South Africa. 17 IBM, Yorktown Heights, New York 10598, USA. 18 School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai 200433, People's Republic of China. 19 Arizona Research Laboratories, University of Arizona, Tucson, Arizona 85721, USA.
20 Department of Genetics, School of Molecular Sciences, La Trobe University, Melbourne, Victoria 3086, Australia. 21 McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, CB2 3ER, UK. 22 Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, CEP 31270-901, Brazil. 23 National Geographic Society, Washington, District of Columbia 20036-4688, USA. 24 IBM, Somers, New York 10589, USA. 25 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. 26 Universidade Federal do Rio de Janeiro, Rio de Janeiro, CEP 21941-901, Brazil. 27 Applied Biosystems, Foster City, California 94494, USA.
Copyright Nature Publishing Group Apr 2013