Content area

Abstract

Aim

Paleohydrological dynamics are well-documented for European river systems, promoting shifting phases of isolation and connectivity of their aquatic fauna. These conditions coupled with high rates of hybridisation found in freshwater fishes may introduce considerable complexity and potential mito-nuclear discordance of phylogenetic patterns. We evaluate this hypothesis using the first large-scale analysis of nuclear SNPs in European species of grayling (Thymallus) compared to mtDNA data with the aim of reassessing the evolutionary history of this group of rheophilic fishes.

Location

Freshwater systems in Europe.

Methods

Based on mitochondrial (mitogenomes, control region) and nuclear (ddRADseq) data, we applied population-genetic, phylogenetic, and biogeographic tools to evaluate lineage diversity in the context of paleohydrological alterations.

Results

The results corroborated previously recognised high levels of lineage diversity, but revealed several cases of mito-nuclear discordance and signals of both historical (natural) and human-mediated introgression among major inter- and intraspecific lineages of Thymallus in Europe. A time-calibrated phylogeny and ancestral area estimation, based on nuclear SNP data, supported a late Pliocene diversification of the genus in Europe and suggested an early colonisation of the Black Sea basin with subsequent dispersal into Central and Western Europe.

Main Conclusions

The genetic structure of Thymallus in Europe recovered by nuclear SNPs contrasts considerably with that supported by mtDNA. Several instances of mito-nuclear discordance underscore frequent contact of allopatric lineages in a dynamic paleohydrological landscape and reveal the weakness of basing both taxonomic and conservation decisions on inferences based on mtDNA alone. The Danube and Rhine drainages were inferred as important zones of contact between divergent phylogeographic lineages. Additionally, our data cast doubt on the genetic integrity of the endangered T. aeliani. Its divergence from T. thymallus, using nuclear SNPs, appears minimal as samples of T. aeliani group within Danubian lineages, despite carrying highly divergent reciprocally monophyletic mtDNA.

Full text

Turn on search term navigation
INTRODUCTION

Climatic oscillations in the Pleistocene were major drivers of the evolution and distribution of biota in the temperate and arctic regions of Europe (Hewitt, 1999; Weiss & Ferrand, 2002). The evolutionary history of most organism groups, however, cannot be explained by Pleistocene climatic changes alone. Fossil records support the presence of many present-day taxonomic groups in Europe since the late Miocene and Pliocene (e.g. Kovács et al., 2020; NOW Community, 2022), when both the landscape topography and major drainage systems were largely established (Popov et al., 2004). Pleistocene dynamics led, however, to frequent and significant regression or expansion of range boundaries, following climatic conditions (Hewitt, 2004). These periodic changes introduced considerable complexity in distribution ranges and refugial areas (e.g. Horreo & Fitze, 2018; Sommer & Zachos, 2009; van Rensburg et al., 2021; Weiss & Ferrand, 2002), but also lead to a noticeable parallelism in re-colonisation patterns and zones of secondary contact (Hewitt, 2004).

The (re-)colonisation routes of freshwater fishes are of particular interest because they are dependent on waterway connectivity and thus may reflect drainage system evolution at the catchment scale (Van Steenberge et al., 2020). Changes in river size and flow direction are linked to long-term landscape processes involving tectonic events, hydrogeomorphology, and climate (Coulthard & Van De Wiel, 2012), triggering temporary connectivity between drainage systems and allowing uni- or bidirectional exchange of biota (Bishop, 1995). Allopatric differentiation among drainage systems is well documented for the European ichthyofauna (e.g. Bănărescu, 1992) and described for widely distributed genera such as Phoxinus (Palandačić et al., 2022), Squalius (Seifertová et al., 2012), Telestes (Gilles et al., 2010) and Thymallus (Marić et al., 2014). Regions with dynamically shifting phases of drainage isolation and connectivity promoted high inter- and intraspecific lineage diversity. Such palaeohydrological dynamics are evident in the headwaters of the Danube, Elbe, Rhine, and Rhône rivers (Berger et al., 2005) with evidence for cross-drainage distribution of mitochondrial DNA (mtDNA) haplotypes in Cottus gobio (Englbrecht et al., 2000; Šlechtova et al., 2004), Phoxinus csikii (Palandačić et al., 2017), Thymallus thymallus (Weiss et al., 2002) and Salmo trutta (Lerceteau-Köhler et al., 2013).

Current knowledge on the biogeographic patterns of European freshwater fishes is limited in that many inferences rely on data produced with uni-parentally inherited mtDNA. Despite the advantages of mtDNA, examples of mito-nuclear discordance are common in freshwater fishes (e.g. Campbell et al., 2022; Guo et al., 2019; Secci-Petretto et al., 2023) and often lead to incorrect or incomplete inferences on phylogenetic relationships (Toews & Brelsford, 2012; Wallis et al., 2017). Cyclic periods of retraction, isolation, expansion and contact of distinct species or populations only increase the likelihood of spatially discordant mitochondrial and nuclear patterns (Currat et al., 2008; Dufresnes et al., 2019; Phuong et al., 2017). Such discordance may ultimately have important consequences for evolutionary understanding as well as conservation strategies. Thus, mtDNA-based inferences should be re-examined.

The salmonid genus Thymallus (graylings) is a suitable model to study the interplay between paleohydrology and lineage differentiation on a macro-biogeographic scale (e.g. Secci-Petretto et al., 2023; Weiss et al., 2021). The strong phylogeographic structure of Thymallus in Europe has been repeatedly demonstrated (Gum et al., 2009; Marić et al., 2012, 2014; Weiss et al., 2002). The deep divergence shown across various genetic lineages, with pairwise differences up to 3.8% (based on mtDNA control region (CR)), and the presence of drainage-specific haplogroups were initially suggested to reflect expansions from Pleistocene refugia (Weiss et al., 2002). However, subsequent divergence time estimates (Weiss et al., 2021) and fossil data (Rückert-Ülkümen et al., 2006; Rückert-Ülkümen & Kaya, 1993) indicated that the evolutionary history of Thymallus in Europe may be considerably older than the Pleistocene, potentially predating the Messinian Salinity Crisis at the Miocene/Pliocene border.

Collectively, these observations lead us to hypothesise that the evolutionary history of Thymallus in Europe is more complex than previously suggested from, primarily, mtDNA-based studies. Here, we aim to (1) re-evaluate the inter- and intraspecific genetic structure of the three extant species of grayling in Europe (T. thymallus, T. aeliani, T. ligericus); (2) unravel their colonisation history in the context of major paleohydrological scenarios; and (3) discuss the potential implications that a revised historical or evolutionary understanding may have on conservation and management perspectives. We address these questions by comparing mtDNA data, based on complete mitochondrial genomes (hereafter “mitogenomes”) and CR sequences, with nuclear-genomic markers, obtained from double digest restriction-site associated DNA sequencing (ddRADseq).

MATERIALS AND METHODS Sampling

Thymallus samples were included from all major European drainage systems (Figure 1a). Specific information on sampling localities and number of samples used for each dataset is given in Tables S1–S3. Tissue samples (fin clips) were preserved in 95% ethanol and genomic DNA was extracted using a high salt (ammonium acetate) protocol (Sambrook et al., 1989).

View Image - FIGURE 1. (a) Sampling localities of Thymallus specimens across Europe included in mitochondrial and/or nuclear analyses (sample information in Tables S1–S3). Major lineages are colour coded by 11 pre-defined geographic regions; the Northern-Adriatic (yellow) and Loire (purple) groups correspond to T. aeliani (Adriatic grayling) and T. ligericus (Loire grayling) respectively, and all others to T. thymallus (European grayling). (b) Statistical parsimony haplotype network (95% connection limit) based on 2400 mtDNA CR sequences (959 bp). Circle size reflects haplotype frequency, and colours correspond to geographic regions from which samples were taken. The labelling of geographic regions onto clades in the haplotype network refers to our understanding of the clade's native geographic origin. Mixed colours within these clades thus refer to admixture of haplotypes originating in different geographic regions. Information on species, source, accession numbers and additional information related to published haplotype names are provided in Tables S2 and S4.

FIGURE 1. (a) Sampling localities of Thymallus specimens across Europe included in mitochondrial and/or nuclear analyses (sample information in Tables S1–S3). Major lineages are colour coded by 11 pre-defined geographic regions; the Northern-Adriatic (yellow) and Loire (purple) groups correspond to T. aeliani (Adriatic grayling) and T. ligericus (Loire grayling) respectively, and all others to T. thymallus (European grayling). (b) Statistical parsimony haplotype network (95% connection limit) based on 2400 mtDNA CR sequences (959 bp). Circle size reflects haplotype frequency, and colours correspond to geographic regions from which samples were taken. The labelling of geographic regions onto clades in the haplotype network refers to our understanding of the clade's native geographic origin. Mixed colours within these clades thus refer to admixture of haplotypes originating in different geographic regions. Information on species, source, accession numbers and additional information related to published haplotype names are provided in Tables S2 and S4.

Mitochondrial DNA Mitogenomes

Ten mitogenomes of T. thymallus were sequenced on an Illumina Platform at Novogene (UK, Europe) using a paired-end read mode (150 bp) with approx. 2 Gb per individual. Data were assembled using the custom-made bioinformatic pipeline MitoComp (https://github.com/SamLMG/MitoComp) (see Appendix S1). These 10 new mitogenomes were combined with 17 published sequences, covering all previously known lineages of T. thymallus as well as T. ligericus and T. aeliani together with T. flavomaculatus and T. tugarinae as outgroups (Table S1). Phylogenetic reconstruction, excluding the CR due to several repeat motifs (Weiss et al., 2021), was done in PhyloSuite v.1.2.2 (Zhang et al., 2020) using maximum likelihood (ML) and Bayesian Inference (BI) approaches as described in Englmaier et al. (2022). MEGA v.7.0 (Kumar et al., 2016) was used to calculate uncorrected p-distances between groups.

Mitochondrial CR

The complete dataset comprised 1221 published sequences and 1179 unpublished sequences stemming from contract reports of analyses of all three grayling species in Europe (Table S2). For the latter, the complete mtDNA CR was amplified using the primers LRBT-25 and LRBT-1195 (see Weiss et al., 2013; Appendix S1). Specimens originating from hatcheries and stocked rivers (outside the native range) were not considered, except for samples from a Finnish hatchery representing stocks of T. thymallus from the Oura River. Three haplotypes (AY841359.1, JX961600.1 and MT062993.1) of T. thymallus containing an 82 bp repeat found in various salmonids (Meraner et al., 2013; Sušnik et al., 2006) were discarded.

Sequences were aligned with ClustalW, implemented in BioEdit v.7.2.6 (Hall, 1999), with a final length of 959 bp, excluding the highly variable poly-T region due to its questionable value (Sušnik et al., 2006). Haplotypes were sorted with DnaSP v.6.12.01 (Rozas et al., 2017) and classified into 12 distinct groups, based on previously published data (Table S2). The haplotype network was constructed with TCS v.1.21 (Clement et al., 2000) and visualised in tcsBU (TCS Beautifier; dos Santos et al., 2016) and Inkscape v.0.92.3 (Inkscape Project, https://inkscape.org). MEGA v.7.0 (Kumar et al., 2016) was used to calculate pairwise distances, as well as uncorrected p-distances within and between groups.

Nuclear ddRADseq Library preparation and sequencing

In total, 217 samples, representing all three European species and previously known intraspecific mitochondrial lineages, were included. Preparation of ddRADseq libraries followed Peterson et al. (2012) with modifications described in Secci-Petretto et al. (2023). In brief, 400–500 ng of genomic DNA was used for enzyme digestion using the restriction enzymes SphI and ApoI (New England Biolabs). Each ddRADseq library contained 48 individual samples and a 300 bp size selection (tight range) on a BluePippin (Sage Science). Phusion PCR amplification was done with custom primers following Peterson et al. (2012). Pooled samples were sequenced at the Vienna BioCenter Core Facilities (VBCF, Austria) on Illumina HiSeq 2500 or Illumina NovaSeq 6000 platforms using a 100 bp single-end read mode.

Data processing and assembly

Quality control of raw reads was performed using FastQC v.0.11.8 (Andrews, 2010). Raw reads were demultiplexed, quality-filtered (with default parameters), and assembled using ipyrad v.0.9.81 (Eaton & Overcast, 2020), trimming the first 5 bp of each read. Quality-filtered reads were mapped to chromosomes of the latest reference genome of T. thymallus (ASM434828v1; Sävilammi et al., 2019) to determine homology and identify orthologous sites, ignoring unplaced scaffolds. The percentage of reads that mapped to the reference genome ranged from 95.8% to 98.4% per individual with a mean locus per individual coverage between 1.8× and 79.8×. Samples with a mean coverage <5.0× were excluded (N = 39), retaining 178 individuals for analyses (Table S3). The minimum number of samples per locus was set to 85% (N = 151) to create a stringent dataset, resulting in a total of 19,574 loci and 103,943 single nucleotide polymorphisms (SNPs).

Genetic structure

To determine genetic clusters and visualise inter-cluster differentiation without prior information on the underlying genetic structure, a principal component analysis (PCA) was carried out using the R packages adegenet v.2.1.8 (Jombart, 2008; Jombart & Ahmed, 2011) and ade4 v.1.7-19 (Chessel et al., 2004; Dray & Dufour, 2007) in R v.4.1.3 (R Core Team, 2022). An alternative, genetic model-based evaluation of individual admixture and genetic structure, was provided using fastSTRUCTURE v.1.0 (Raj et al., 2014). Values of K ranging from 1 to 15 were tested by applying a simple prior and 10 replicates each. StructureSelector (Li & Liu, 2018) was used to evaluate the optimal number of clusters. Both analyses were run using a single random SNP per locus.

Phylogenetic analysis

Evolutionary relationships were inferred using a ML approach implemented in IQtree v.2.1.4 (Minh et al., 2020) and based on a concatenated alignment of all 19,574 ddRADseq loci, using T. flavomaculatus and T. tugarinae as outgroups. The best-fit nucleotide substitution model (TVM + F + R2) was identified using ModelFinder (Kalyaanamoorthy et al., 2017). ML phylogenies were estimated with 1000 ultrafast bootstrap replicates, applying the -bnni flag to reduce the impact of overestimating branch support (Hoang et al., 2018).

Historical introgression

Past gene flow between distinct phylogenetic lineages was evaluated based on Patterson's D and related statistics using Dsuite v.0.4 (Malinsky et al., 2021), excluding individuals visually identified as admixed in fastSTRUCTURE analysis (N = 32). Samples from the Sesia River (>94% membership at K = 6 in fastSTRUCTURE) and the Drava and Mur rivers (>96% membership at K = 6 in fastSTRUCTURE) were kept, representing the purest samples from the Northern-Adriatic and the Danubian Southern-Alps lineages in the dataset. Assignment of groups (i.e. lineages) followed phylogenetic and fastSTRUCTURE analyses.

Divergence time estimation and historical biogeography

A time-calibrated phylogeny was inferred under the multi-species-coalescent model and a strict molecular clock implemented in SNAPP v.1.0 (Bryant et al., 2012) an add-on package for BEAST2 v.2.6.6 (Bouckaert et al., 2014). Due to the high computational demands in SNAPP analyses (Stange et al., 2018), the number of individuals was limited to two per group. SNAPP analyses were subsequently performed on 10 random subsets to account for within-lineage variability (see Appendix S1 and Table S6).

Divergence time estimates based on mitogenomes (Weiss et al., 2021) and the oldest Thymallus fossil in Europe (Rückert-Ülkümen & Kaya, 1993) both predate the Pleistocene, with the earliest suggested evidence of the genus in Europe dating back to the Miocene/Pliocene transition (Weiss et al., 2021, Appendix S1). For the time-calibrated analysis of nuclear SNP data, we used two geomorphological calibration points. First, the last direct connection between the Rhône and Danube catchments (Berger et al., 2005: map 19, age classification: 4.2–2.9 Ma) was used to date the split between the ancestral Rhône/Loire and Danube lineages (applying a uniform distribution). Second, based on presumed Pleistocene connections between the Rhône and Loire drainages (Straffin et al., 1999: Figure 3b; see also geological maps of France 1:50,000, e.g. Bonvalot et al., 1984; BRGM, 1999; Donzeau et al., 2001), a calibration point was placed at 1.8–0.5 Ma (uniform distribution) to date the split between Thymallus in the Rhône and Loire rivers.

Ancestral ranges were estimated with the R package BioGeoBEARS (Matzke, 2013) under all available models (DEC and DEC + J, DIVA-like and DIVA-like+J, BAYAREA-like and BAYAREA-like+J). The best-fitting model was selected using the Akaike information criterion (AIC). Eight biogeographic ranges were defined based on present and past hydrographic networks and the current distribution of Thymallus in Europe: (a) Northern-Adriatic basin, (b) Black Sea basin (Danube River), (c) Rhône River, (d) Loire River, (e) North Sea and Southern Baltic Sea basins, (f) Scandinavia (most of the Northern Baltic Sea basin), (g) Barents Sea basin (Pechora River) and (h) Caspian Sea basin (Volga and Ural rivers). Extant Thymallus lineages (across the three species) inhabit a maximum of two of these biogeographic ranges; dispersal between nonadjacent areas was restricted.

Species tree analysis

To account for gene tree/species tree discordance, we performed a coalescent-based tree estimation in ASTRAL v.5.7.1 (Rabiee et al., 2019), using assemblies of the same 10 random subsets as for SNAPP analyses as input. Individual gene/locus trees were obtained using IQtree (Minh et al., 2020) with model selection on each individual alignment using ModelFinder (Kalyaanamoorthy et al., 2017). Resulting gene/locus trees were used as input for ASTRAL to generate a species tree. Gene concordance factors were calculated in IQtree using the same gene/locus trees as for ASTRAL.

RESULTS Mitochondrial DNA Mitogenome phylogeny

Both the ML and BI analyses recovered the same, well-supported topology (Figure S1) as inferred by Weiss et al. (2021). In this new analysis, samples from the Tisza River (Danube tributary in the Romanian Carpathians), analysed here for the first time, were recovered as a sister clade to the Danubian Southern-Alps; 0.9%–2.0% divergent from other Danubian samples.

Mitochondrial CR

Among the 2400 sequences analysed, 125 distinct haplotypes were found (Figure 1b; Table S4), with 35 (28%) haplotypes shared among the major pre-defined geographic regions. The Northern-Adriatic cluster (T. aeliani) was the most variable with a mean within-group divergence of 1.9%. Low within-group variability was found in the North-Eastern European (0.3%) and Scandinavian (0.3%) haplogroups.

Despite the obvious divergence of Loire drainage haplotypes (T. ligericus), the Northern-Adriatic cluster was most divergent overall; mean uncorrelated p-distances ranged from 1.9% to 2.5% (Table S5). Thymallus aeliani haplotypes were restricted to the Northern-Adriatic basin but Danubian haplotypes (T. thymallus) were also found in most Northern-Adriatic localities. Most samples from the Po River drainage formed a subcluster (haplotypes 3–7) within the T. aeliani cluster and only haplotype 8 was shared between the Po and Adige rivers, representing the T. aeliani neotype from Lake Maggiore (MT762347; Bravničar et al., 2020). Danubian haplotypes were strongly structured following major drainage systems (Figure 1b). Samples from the Tisza River revealed a previously unrecognised haplotype (116), the most divergent within all Danubian haplogroups. North Sea basin haplotypes were distinct from Scandinavia and North-Eastern Europe, but grouped with haplotypes from the Rhône River. Scandinavian samples and those originating from the Barents Sea and Caspian Sea drainages were well-differentiated from other T. thymallus in Europe. North-Eastern European haplotypes were exclusively private (haplotypes 44–48, 114, 124); Scandinavian haplotypes were mostly private (22–24, 26, 27, 39), but some were also found in the Rhine (101) and Danube (21, 113).

Nuclear ddRADseq Phylogenetic relationships

ML analysis resolved three major geographic clades although not corresponding to the three European species (Figure 2) (1) the two French rivers, Rhône (T. thymallus) and Loire (T. ligericus); (2) the Danube drainage (T. thymallus) including the Northern-Adriatic basin (T. aeliani); and (3) the North Sea basin and river systems in Scandinavia and North-Eastern Europe (T. thymallus). Substructure within these clades largely followed distinct drainage systems. Within clade 2, five well-supported lineages were inferred. Samples from the Northern-Alps were monophyletic with those from the Tisza River, and samples from the Southern-Alps were grouped with those from the Western-Balkans. Samples from the Northern-Adriatic basin (T. aeliani) were recovered as a sister clade to the Danubian Southern-Alps and Western-Balkans. Within clade 3, samples from the Upper and Lower Rhine were distinct, the latter including samples from the Maas and Elbe rivers, and drainages from England. Samples from Central-Sweden (5399, 5400, 8824, 7303) formed a monophyletic assemblage distinct from Northern-Sweden (LAP009, LAP 010, LAP011), Finland (Ou3-GR65-2), and the western headwaters of the Volga (TIWG550, TIWG552).

View Image - FIGURE 2. Phylogenetic inference and genetic structure of Thymallus in Europe. (a) Maximum Likelihood phylogeny with 178 individuals (sample information in Table S3), based on a concatenated alignment of 19,574 ddRADseq loci with 1000 ultrafast bootstrap replicates; black circles represent bootstrap values >95. (b) fastSTRUCTURE analysis with the optimal number of clusters K = 5 (marginal likelihood) or K = 6 (both median and maximum). Small insert-map showing membership probabilities of fastSTRUCTURE analysis (K = 6) for specific geographic regions.

FIGURE 2. Phylogenetic inference and genetic structure of Thymallus in Europe. (a) Maximum Likelihood phylogeny with 178 individuals (sample information in Table S3), based on a concatenated alignment of 19,574 ddRADseq loci with 1000 ultrafast bootstrap replicates; black circles represent bootstrap values >95. (b) fastSTRUCTURE analysis with the optimal number of clusters K = 5 (marginal likelihood) or K = 6 (both median and maximum). Small insert-map showing membership probabilities of fastSTRUCTURE analysis (K = 6) for specific geographic regions.

Genetic structure and admixture

Both the PCA (Figure S2) and fastSTRUCTURE results (Figure 2b) revealed similar results that align well with the ML phylogeny. For fastSTRUCTURE, the optimal number of clusters was K = 5 (marginal likelihood) or K = 6 (both median and maximum), revealing admixture between populations of T. thymallus, such as between the North Sea basin and the Danubian Northern-Alps, and between the North Sea basin and the Rhône River. Lower levels of admixture were found between the North Sea basin and Scandinavia. At K = 5 samples from the Tisza River revealed low levels of introgression with a consistent signal of admixture between the Danubian Northern-Alps and Western-Balkans/Southern-Alps.

Scenarios of historical introgression

The Dsuite analysis estimated admixture proportions (f4-ratio statistic) above 5% in 53 trios, and the visualisation of the f-branch metric highlighted 27 potential introgression events (Figure 3). These signals of gene flow include not just extant but also ancestral lineages, implying dynamics more ancient than those reflected in the fastSTRUCTURE analysis. The strongest signals of introgression were evident between (ancestral) lineages of the Danube and Rhine rivers, but excess allele sharing was also found between Thymallus in the Rhine, Rhône, and Loire rivers.

View Image - FIGURE 3. Scenarios of historical introgression based on nuclear SNP data, showing results of the f-branch metric. Along the y-axis (branch b) each branch (dotted lines representing ancestral branches) points to a row with corresponding proportions of excess allele sharing between the specific branch and the population/species (x-axis, top); red arrows highlight admixture proportions >5%.

FIGURE 3. Scenarios of historical introgression based on nuclear SNP data, showing results of the f-branch metric. Along the y-axis (branch b) each branch (dotted lines representing ancestral branches) points to a row with corresponding proportions of excess allele sharing between the specific branch and the population/species (x-axis, top); red arrows highlight admixture proportions >5%.

Divergence-time and ancestral area estimation

Time-constrained SNAPP phylogenies identified the same three phylogeographic clades as the ML analysis (Figure 4a). In contrast to the ML inference, however, the SNAPP analyses showed that the placement of T. thymallus clades from the Danubian Northern-Alps and the Tisza River drainage is uncertain and their sister relationship is not strongly supported. An alternative topology inferred from one out of 10 datasets suggested an early split of the Tisza from all other Danubian lineages (Figure S3) questioning the reliability of time estimates for that particular node.

View Image - FIGURE 4. Biogeographic reconstruction for Thymallus lineages/species in Europe. (a) Time-calibrated SNAPP phylogeny (maximum-clade-credibility tree) based on two geomorphological calibration points (A, B), including ancestral area estimation (best-fit DIVALIKE model) from BioGeoBEARS. Divergence times in million years (median values) refer to one (subset1 in Table S6) of nine topologically identical analyses. Grey bars indicate the 95% highest posterior density intervals. (b) Biogeographical ranges and hypothesised dispersal events and major migration routes of Thymallus in Europe: 1 – early colonisation of the Black Sea basin (late Miocene/early Pliocene, suggested from mitogenomes) with subsequent dispersal to the Northern-Adriatic basin (as suggested from mitogenomes), pathways between the Danube and the Northern-Adriatic basin were also possible in later Pleistocene periods; 2 – colonisation of the Rhône drainage via the Danube River (Pliocene); 3 – migration corridors from the Danube to the North Sea basin were possible in both the upper and lower reaches of the Danube drainage (late Pliocene), followed by divergence in the North Sea and the Southern Baltic Sea basins, and the colonisation of North-Eastern Europe (Pliocene/Pleistocene transition); 4 – colonisation of the Loire drainage via the Rhône drainage (Pleistocene); 5 – dispersal from the North Sea and Southern Baltic Sea basins (Pleistocene); 6 – dispersal in North-Eastern Europe (Pleistocene) with secondary contact between the Caspian and Black Sea regions (as suggested from mitogenomes).

FIGURE 4. Biogeographic reconstruction for Thymallus lineages/species in Europe. (a) Time-calibrated SNAPP phylogeny (maximum-clade-credibility tree) based on two geomorphological calibration points (A, B), including ancestral area estimation (best-fit DIVALIKE model) from BioGeoBEARS. Divergence times in million years (median values) refer to one (subset1 in Table S6) of nine topologically identical analyses. Grey bars indicate the 95% highest posterior density intervals. (b) Biogeographical ranges and hypothesised dispersal events and major migration routes of Thymallus in Europe: 1 – early colonisation of the Black Sea basin (late Miocene/early Pliocene, suggested from mitogenomes) with subsequent dispersal to the Northern-Adriatic basin (as suggested from mitogenomes), pathways between the Danube and the Northern-Adriatic basin were also possible in later Pleistocene periods; 2 – colonisation of the Rhône drainage via the Danube River (Pliocene); 3 – migration corridors from the Danube to the North Sea basin were possible in both the upper and lower reaches of the Danube drainage (late Pliocene), followed by divergence in the North Sea and the Southern Baltic Sea basins, and the colonisation of North-Eastern Europe (Pliocene/Pleistocene transition); 4 – colonisation of the Loire drainage via the Rhône drainage (Pleistocene); 5 – dispersal from the North Sea and Southern Baltic Sea basins (Pleistocene); 6 – dispersal in North-Eastern Europe (Pleistocene) with secondary contact between the Caspian and Black Sea regions (as suggested from mitogenomes).

Biogeographic reconstruction suggested that the oldest split among European Thymallus likely occurred between the Black Sea (upper Danubian tributaries) and Rhône biogeographic regions (Figure 4a), followed by the colonisation of the Loire River from the Rhône River at 1.4 Ma (95% highest probability density interval: 1.2–1.7 Ma). The MRCA of extant Danubian lineages was dated at 2.2 Ma (1.8–2.6 Ma), and the MRCA of the North Sea basin, Scandinavian and North-Eastern European lineages was at 2.6 Ma (2.2–3.1 Ma), both falling within the Pliocene/Pleistocene transition. The colonisation of Central and Northern Europe probably originated from the Danube drainage with subsequent divergence into two geographically distinct lineages. Differentiation between the Scandinavian and North-Eastern European lineage was estimated at 0.5 Ma (0.4–0.6 Ma), predating the last glacial maximum at the Pleistocene/Holocene transition.

ASTRAL species tree

The topologies inferred with ASTRAL analyses (Figure S4) were largely congruent with the ML phylogeny in IQtree (based on a concatenated alignment of all ddRADseq loci) and the species-tree inference of SNAPP. One exception to the overall congruence was the variable placement of one sample (Ses3) of T. aeliani. The final quartet scores of the trees ranged from 8,443,449 to 22,373,162 with normalised quartet scores between 0.43 and 0.44, indicating high gene/loci tree discordance. This is also reflected by generally low gene concordance factors (between 0 and 29), which is to be expected due to the extreme shortness of the loci in our datasets.

DISCUSSION

Both mtDNA and nuclear SNP data support high lineage diversity in European Thymallus. This diversity, however, is marked by significant mito-nuclear discordance related to the placement of genetic lineages within and among the three European species (Figure 5), as well as their hypothesised colonisation history. Phylogenetic discordance can result from multiple mechanisms, including incomplete lineage sorting, paralogy, hybridisation and introgression, with or without natural selection (e.g. Antunes & Ramos, 2005; Bonnet et al., 2017; Guo et al., 2019; Wallis et al., 2017). Hybridisation and introgression are particularly common among freshwater fishes (Wallis et al., 2017), and mito-nuclear discordance is being increasingly reported in fishes (Campbell et al., 2022; Guo et al., 2019), including salmonids (Secci-Petretto et al., 2023; Sušnik et al., 2007). The responsible processes, ranging from historical range expansion and colonisation of some of Europe's major river systems to recent anthropogenic activities, involve very different time scales that need to be put into perspective before drawing inferences on both signals of introgression and mito-nuclear discordances.

View Image - FIGURE 5. Mito-nuclear discordance in phylogenetic relationships of European species of Thymallus based on mitochondrial genomes (left, details in Figure S1) and nuclear ddRADseq data (right, details in Figure 2). Species assignment in the mitochondrial tree referring to the current taxonomy as given in Weiss et al. (2021).

FIGURE 5. Mito-nuclear discordance in phylogenetic relationships of European species of Thymallus based on mitochondrial genomes (left, details in Figure S1) and nuclear ddRADseq data (right, details in Figure 2). Species assignment in the mitochondrial tree referring to the current taxonomy as given in Weiss et al. (2021).

Mito-nuclear discordance, colonisation history and paleohydrological scenarios European colonisation and diversification of Thymallus

Fossil evidence (Rückert-Ülkümen & Kaya, 1993) and divergence times estimated from mitogenomes (Weiss et al., 2021; but see Appendix S1 and Figure S5) suggest a stem age of the lineage leading to European species of Thymallus as early as the late Miocene. This rather tentative early colonisation suggested by Weiss et al. (2021) does not necessarily relate directly to any extant lineage of the genus in Europe, but the split between Asian and European species (7.8–10.8 Ma) largely overlaps with major cladogenesis in Pungitius spp. (split between sinensis-clade and pungitius-clade, Guo et al., 2019), and coincides with the oldest fossil record of Hucho in Europe (Kovalchuk, 2015). The European colonisation of Thymallus likely occurred through rivers along the northern Paratethys region, which provided freshwater passage from Central Asia and Siberia to the paleo-Black Sea basin over long periods (Artamonova et al., 2021; Krijgsman et al., 2010; Popov et al., 2004). This is supported by the genus-level phylogeny (Secci-Petretto et al., 2023), showing a sister clade relationship between Central Asian and European taxa. Diversification of Thymallus in Europe probably dates back to the late Pliocene and Pliocene/Pleistocene transition overlapping with insights on the S. trutta complex (Veličković et al., 2023). Our mtDNA and nuclear SNP data support an early colonisation of the Black Sea and Northern-Adriatic basins (Figure 4b) with subsequent dispersal into Western and Central Europe. This is congruent with paleohydrological data (e.g. Berger et al., 2005; Popov et al., 2004) and distribution patterns of other fish species (Bănărescu, 1992).

Danube drainage and the Northern-Adriatic basin

Based on mtDNA data, the earliest split among Thymallus in Europe coincides with the hypothesised colonisation of the Northern-Adriatic basin via the Danube drainage (Figure 4b), dated at 4.0–6.0 Ma from mitogenomes (Figure S5). Connections between the Danube and the Northern-Adriatic basin are supported by phylogeographic patterns in other species (C. gobio, Šlechtova et al., 2004; Squalius cephalus, Seifertová et al., 2012), but commonly refer to the late Pleistocene. However, the strong mito-nuclear discordance observed for T. aeliani (Northern-Adriatic clade) creates significant challenges for understanding the evolutionary history of this species, which is further complicated by human-mediated stocking. Secci-Petretto et al. (2023) showed that Thymallus samples from the Adige and Piave rivers of the Northern-Adriatic basin are nested within Danubian clades and concluded an anthropogenic cause for the observed discordance. While this is plausible for the heavily managed eastern part of the Northern-Adriatic basin (Meraner et al., 2014), a natural influence cannot be completely excluded as no signal of recent admixture with other European lineages was found in the present data, and the elevated D and f4-ratio results rather indicate historical introgression over recent human-mediated admixture. Such high levels of historical introgression and nuclear gene replacement of the original genetic structure of Thymallus in the Northern-Adriatic basin by peri-alpine lineages would suggest previously undetected colonisation/contact during the Quaternary. While our Northern-Adriatic basin samples were selected from the putatively non-managed Sesia River, only two individuals were obtained, undermining the power of our inferences. Nonetheless, the phylogenetic placement of these samples is discordant with the mitogenome analysis and contrasts with the results in Secci-Petretto et al. (2023). Interestingly, even in rivers where most individuals appear to be of admixed origin, divergent Northern-Adriatic mtDNA haplotypes are predominant and not outcompeted by foreign mtDNA strains (Meraner & Gandolfi, 2012; Sušnik et al., 2004). Such patterns may result from genotype-by-environment interactions as recently shown for S. trutta (Folio et al., 2021), supporting a scenario with selection for specific (local) mitochondrial variants, despite nuclear introgression (Toews & Brelsford, 2012, and references therein). While this picture of mito-nuclear discordance in T. aeliani seems to involve both natural and human-mediated introgression, overall, the results cast doubt on the existence of non-introgressed populations of T. aeliani.

The Danube drainage itself is biogeographically complex (Bănărescu, 1992) and the sequence of multiple basins (e.g. Vienna, Pannonian, and Dacian basins), separated by reaches of steeper gradients (e.g. Hainburg Gate, Dunakanyar, Iron Gates), promotes the differentiation of large drainage sub-systems (Domokos et al., 2000; Krézsek & Olariu, 2021: figure 1), possibly shaping strong phylogeographic structure. The newly described clade from the Tisza River tributary of the Danube is concordant with this sub-drainage being isolated from the mainstem Danube throughout the Pleistocene (Borsy, 1992), and mirrors the recent report of clade divergence and secondary contact revealed through genomic analysis of the smooth newt (Lissotriton vulgaris) (Herczeg et al., 2023).

At a broader scale, both mtDNA and nuclear SNPs support complex phylogeographic structure in Danubian T. thymallus and reveal several cases of mito-nuclear discordance. Most striking is the discordant placement of samples from the Western-Balkans and the Austrian Lafnitz River carrying mtDNA closely related to the Scandinavian/North-Eastern European lineage (Figure 5). Speculating on the possible evolutionary mechanisms for this discordance presents difficulties because (1) the lineages in question are closely related to each other, (2) the paleohydrological development of distinct Danubian sub-drainages is complex and far from resolved, and (3) frequent sharing of mtDNA CR haplotypes, especially in the Alpine region, can be indeed explained by decades of human-mediated stocking (Weiss et al., 2013), where source populations originated from within the Danube drainage, but also from the North Sea basin and Scandinavia. The latter explanation, however, appears to be unlikely as the mtDNA split between Thymallus from the Western-Balkans/Lafnitz River and the Scandinavian/North-Eastern European lineage coincides with Pleistocene cold periods, dated at 0.9–1.4 Ma from mitogenomes (Figure S5) and 0.6–1.5 Ma from mtDNA CR (Marić et al., 2014), suggesting natural dispersal between the Caspian and Black Sea regions (Figure 4b).

Loire and Rhône rivers

The sister clade structure of the Rhône (T. thymallus) and Loire (T. ligericus) lineages, inferred from the nuclear phylogeny, was not anticipated due to their distant mtDNA relationship (Figure 5). While the T. ligericus mtDNA lineage is private to the Loire River, many specific CR haplotypes (17 of 40) are shared between the Rhône and Rhine drainages (Figure 1b). Despite stocking activities in this region (Cattanéo et al., 2011; Persat, 1996; Weiss et al., 2002), we reject an anthropogenic cause for the observed discordance because (1) stock management of grayling in the Rhône has been extremely limited compared to other salmonid species such as S. trutta (Cucherousset et al., 2020), or other regions, such as the Austrian Alps (Weiss et al., 2013), and (2) the shared distribution of many tip haplotypes as well as two more ancestral haplotypes (33, 34, Figure 1b) is better explained by ancestral sharing. The temporal resolution of CR haplotypes in salmonids is roughly 100,000 years (based on a 2% per Ma divergence rate), and thus, the revealed sharing does not reflect identical haplotypes at the mitogenome level (e.g. mean p-distance for mitogenomes between the Rhône/Rhine samples was 0.7%); divergence between the Rhône and Rhine lineages was estimated at 0.7–1.2 Ma from mitogenomes (Figure S5). Recent introgression between the Rhine and Rhône populations is also not supported by nuclear SNP data (except for two samples from the Ain and Ognon, Figure 2b) but a relatively strong signal of historical introgression (up to 11% excess allele sharing) from an ancestral Rhine lineage into the Rhône and Loire lineages would suggest a natural scenario of secondary contact. The complete absence of a Loire-related mtDNA lineage in the Rhône drainage can only be explained by its complete replacement (or mitochondrial capture) with Rhine-lineage mtDNA.

The paleohydrology in the Western and Eastern Alps, where the Danube and Rhône rivers and the later established Rhine River originate, involved repeated phases of drainage rearrangements (Berger et al., 2005; Giamboni et al., 2004). This likely provided conditions for the lineage diversification and observed mito-nuclear discordance. The last direct connection between the Danube and the Rhône rivers, via the Aare River, is geologically documented (4.2–2.9 Ma, Berger et al., 2005) and possibly enabled the initial colonisation of the Rhône River (Figure 4b). Faunal exchange in later (Pleistocene) periods could have only occurred indirectly via the Rhine system. Potential Pleistocene corridors between the Rhône and Loire rivers are suggested from the geologically poorly documented “Formations fluvio-lacustres de la Guye” (BRGM, 1999). These sediments are widespread north/north-east of the Massif Central and indicate dynamic fluvial systems with shallow watersheds in the vicinity of backwater lakes in the Saône River valley, which were regularly formed by the advances of the Rhône glacier during Pleistocene cold phases (Persat et al., 2020). Alternatively, a passage via the Rhine/Maas and Seine rivers into the Lower Loire (Weiss et al., 2002) may be hypothesised as faunal similarities between the Lower Rhine and Loire rivers are evident for several freshwater fishes (Bănărescu, 1992). The Rhône/Loire split was presumably followed by a secondary colonisation of Thymallus from the Rhine into the Rhône River reflecting palaeohydrological dynamics in their headwaters throughout the Pleistocene (Persat et al., 2020: figure 2.7).

North Sea basin and Scandinavia/North-Eastern Europe

Thymallus thymallus lineages in Northern Europe are characterised by wider geographical ranges than in Central Europe, less drainage-specific substructure, but a deep divergence between populations from the North Sea basin and Scandinavia/North-Eastern Europe, implying disjunct Pleistocene glacial refugia and postglacial re-colonisation routes (Gum et al., 2009). In Southern and Central Sweden these lineages meet in an area generally known as a hybrid zone for many European taxa (Hewitt, 2004). Early range expansion of Thymallus toward the north/north-east likely originated from the Danube drainage and seems possible in both the upper and lower reaches of the Danube (Figure 4b). In the upper reaches, dispersal to the North and Baltic Sea basins is plausible via river capture events (Domokos et al., 2000). Ichthyofaunal similarities between these regions are documented for several species, but are largely associated with the late Pleistocene (Englbrecht et al., 2000; Lerceteau-Köhler et al., 2013). An alternative route along the Lower Danube and the shallow Dnieper Depression via the “Baltic Gate” may have allowed repeated dispersal opportunities in the paleo-Pripyet region over longer geological periods (Meulenkamp & Sissingh, 2003; Popov et al., 2004). This area denotes the lowest point of the main European watershed and would have provided hydrological connections to both the west (paleo-Weichsel system) and the east/north-east (paleo-Volga system), possibly explaining the early split of Thymallus in the North Sea basin and Scandinavia/North-Eastern Europe.

Significance of introgression and paleohydrology in the evolutionary history of Thymallus in Europe

A general model of European phylogeography was built on the notion that Pleistocene range expansions and contractions promoted strong phylogeographic structure, especially for cold-tolerant aquatic species, due to repeated periods of allopatric isolation (Hewitt, 1999; Weiss & Ferrand, 2002). Although the earliest models hypothesised a few large, southern or eastern glacial refugia, current knowledge supports numerous refugia through large areas of continental Europe, reflecting complex genetic structures (Lerceteau-Köhler et al., 2013; Ruiz-González et al., 2013; Weiss et al., 2002). While evidence of introgression among lineages within species is common (e.g. Gouskov & Vorbuerger, 2016; Perdices et al., 2016), there still may be an underappreciation of the level and complexity of exchange, due to paleohydrological dynamics (especially related to the proglacial hydrological network and meltwater pulses at the end of the glacial stages), pre-Pleistocene colonisation (i.e. involving longer periods of time) and gene–environment interactions, leading to, for example, cases of mito-nuclear discordance. Our dataset on Thymallus conforms well with the overall phylogeographic trends of the European ichthyofauna, supporting numerous glacial refugia but also emphasising serial connectivity and isolation. The generality of our inferences has not yet been explored in depth. What can be emphasised is that gene flow among lineages has occurred repeatedly across the last 2–3 Ma, better reflecting intraspecific population dynamics rather than secondary contact following speciation events. Nonetheless, it is not clear whether the high levels of mito-nuclear discordance shown for Thymallus are common across European freshwater fishes, or more specifically rheophilic, cold-tolerant species (e.g. Cottus spp., Salmo spp.) inhabiting the headwaters, which are more affected by river capture events (Riffle et al., 1995; Riffle & Schreiber, 1995). Comparative genomic studies on European freshwater fishes are needed to explore the extent of mito-nuclear discordance, with the expectation that an increasing number of complex biological mechanisms must be invoked to these discordances and genetic structure, such as found in this study, but also suggested in Lerceteau-Köhler et al. (2013) for S. trutta, and Lucek et al. (2018) for the genus Cottus.

Implications for conservation and stock management

Mito-nuclear discordances can have profound effects on conservation at both the inter- and intraspecific level due to changes or uncertainties in taxonomic assignment, which may confer some form of protection and/or management strategies. Our nuclear SNP data underscore the distinctiveness of the recently described T. ligericus, but no more so than several phylogeographic lineages within T. thymallus. Additionally, the genetic integrity of the critically endangered T. aeliani is open for questioning and in need of further research.

Conservation within the European Union (EU) is strongly driven by the European Habitats Directive (Council Directive 92/43/EEC), which only relates to species listed in Annex II, IV or V of the legislation when it was first enacted in the relevant member state. When a newly described species is an evolutionary derivative of a listed species, then the new species is afforded the same protection status as the original species (Freyhof & Brooks, 2011). Thymallus thymallus is listed in Annex V of the Habitats Directive, which means that it is a species of community interest for which member states are obliged to work toward or maintain a favourable population status. Annex V species differ from other protection categories primarily because commercial use of the species (e.g. sport fishing) is permitted, but nonetheless such activity should be compatible with population status obligations. Thus, in our view, both T. ligericus and T. aeliani should be explicitly listed as species of community interest (Annex V) in the respective countries where they are found, and both harvest regulations and other forms of potential degradation should be considered and potentially adjusted to conform to conservation goals, down to the local level (see Article 14 & 15, Council Directive 92/43/EEC), if necessary.

Overall, our data also draw attention to the paraphyly of T. thymallus, which may lead to taxonomic revision of the genus in Europe. Unfortunately, the high lineage diversity revealed across phylogeographic lineages of T. thymallus, especially in the Danube drainage, does not elicit direct conservation obligations among EU member states. This is because the Council Directive 92/43/EEC, in contrast to similar legislation such as the Endangered Species Act of 1973 in the U.S.A, or Species at Risk Act in Canada, does not allow the designation of intraspecific units to a protection category. Nonetheless, best conservation practices should encourage fisheries managers to recognise the high levels of intraspecific diversity and discourage cross-basin transfers of material for stock management programs. Such transfers, almost always involve hatchery rearing that can undermine population or lineage-specific adaptation and confer a maladaptive response in the target populations (Besnier et al., 2022; Bolstad et al., 2017, 2021).

ACKNOWLEDGEMENTS

We thank Stefania Trasforini and Cesare Puzzi (Graia, Italy), Ulrich Schliewen (ZSM, Germany), Martin Schletterer (TIWAG, Austria), Omar Bašić (Montenegro), Rifat Tahirić and Vlado Kerkez (Bosnia and Herzegovina) and Ramo Berović and Milan Lugić (Serbia) for assistance with samples. Tamara Schenekar (University of Graz) provided valuable comments and support to the study. Fieldwork in Serbia was supported by the Ministry of Natural Resources, Mining and Spatial Planning (permit number 324-04-154/2013-02). Slovenian samples were collected based on permit 3420-30/2017/5 issued by the Ministry of Agriculture, Forestry and Food. Samples from Switzerland were collected within the project “Genetics of grayling populations in Switzerland” (permit number 00.0214.PZ/I284-1246), a cooperation between the Federal Office for the Environment (OFEV, represented by D. Hefti) and Haute école du paysage, d'ingénierie et d'architecture de Genève (Hepia, represented by F. Cattanéo). The authors acknowledge the financial support by the University of Graz.

FUNDING INFORMATION

This work was funded by a grant from the Doctoral Academy Graz, Ecology and Evolution in Changing Environments (EECE), at the University of Graz, to GKE.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT

The newly obtained mitochondrial genomes as well as the new mitochondrial control region haplotypes are available on GenBank under accession numbers OR537173–OR537182 and OR393374–OR393398, respectively (Tables S1 and S2). Raw ddRADseq reads have been deposited at NCBI's Sequence Read Archive (BioProject: PRJNA1007909) under accession numbers SAMN37105456–SAMN37105620 (Table S3).

BIOSKETCH

Gernot K. Englmaier is an ichthyologist with a broad interest in phylogeography and taxonomy of freshwater fishes in Eurasia, East and Central Africa. His current post-doctoral research mainly focusses on species diversity of Synodontis catfishes in Lake Tanganyika as part of a project at the Institute of Vertebrate Biology (Czech Academy of Sciences).

Author contributions: Conceived and coordinated the study: GKE, SJW. Laboratory work: GKE. Data curation: GKE, NVR, NB. Data analyses and methodology: GKE, NVR, NB, LZ, DVG, GS-P. Investigation: GKE, NB, HP, SM, CR, BD, EF, SJW. Wrote the first draft of the paper: GKE. Writing – review & editing: All authors contributed to the improvement of the manuscript.

© 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.