Rapid monitoring of biodiversity for conservation, management, or assessing the impact of anthropogenic pressures is frequently difficult to achieve using established methods. This is particularly relevant for fish in lake ecosystems, as no established method is suitable across all lake sizes and depths: electrofishing is unsuitable for large, deep lakes; gillnetting under‐records species restricted to very shallow water and is destructive; and hydroacoustics has low efficacy in shallow lakes and is unable to identify species. Environmental DNA (“eDNA”), which is released by organisms into their environment in the form of shed cells, excreta, gametes, or decaying matter (Taberlet, Coissac, Hajibabaei, & Rieseberg, ), is promising as a complementary or alternative method for monitoring fish in lakes (Civade et al., ; Evans & Lamberti, ; Evans et al., ; Hänfling et al., ; Hering et al., ; Jerde, Mahon, Chadderton, & Lodge, ; Lacoursière‐Roussel, Côté, Leclerc, & Bernatchez, ; Valentini et al., ) and PCR‐based metabarcoding of eDNA has tremendous potential for monitoring entire ecological communities (see, e.g., Bohmann et al., ; Lawson Handley, ; Valentini et al., ; Deiner et al., for reviews).
Although eDNA metabarcoding is still in its infancy, a great deal of progress has been made very recently and a number of studies have demonstrated that it can effectively describe fish communities in lentic (Civade et al., ; Evans et al., ; Hänfling et al., ; Klymus, Marshall, & Stepien, ; Valentini et al., ), lotic (Civade et al., ; Shaw et al., ; Valentini et al., ), and marine environments (e.g., Thomsen et al., ,; Miya et al., ; Port et al., ; Andruszkiewicz et al., ; O'Donnell et al., ; Yamamoto et al., ). eDNA metabarcoding consistently outperforms established methods for detection of fish species (e.g., Thomsen et al., ,; Miya et al., ; Civade et al., ; Hänfling et al., ; Port et al., ; Shaw et al., ; Valentini et al., ; Andruszkiewicz et al., ; Yamamoto et al., ) and is at least semiquantitative, correlating with data from established surveys and providing estimates of (at least relative) abundance (Evans et al., ; Hänfling et al., ; Port et al., ; Andruszkiewicz et al., ; O'Donnell et al., ).
In our previous work, we demonstrated that eDNA metabarcoding has huge potential for describing fish community structure in lakes (Hänfling et al., ). Water samples were collected in January 2015 from Windermere, the largest lake in England, assayed by eDNA metabarcoding of mitochondrial 12S and cytochrome b (CytB) and compared to data from gillnetting and hydroacoustic surveys. Windermere is arguably the most intensively studied lake in the UK, with data on fish populations, physicochemical, and other biological properties collected over many years and regular monitoring of fish populations since the 1940s (Maberly et al., ; Winfield, Fletcher, & James, ). First, 14 of the 16 species ever recorded in Windermere were detected using eDNA compared to only four species detected in an extensive gill‐net survey carried out 4 months prior to eDNA sampling. Interestingly, more species were detected in shallower water, and 12 of the 16 species were detected in just six spatially close shoreline samples. This suggests that eDNA could accumulate at the shoreline and that shoreline sampling could be adequate for detection of most species but more rigorous sampling along the shoreline is needed to investigate this further. Second, depth transects revealed that most species’ eDNA was distributed throughout the water column, but eDNA of the deep‐water species Arctic charr (Salvelinus alpinus) was only detected at the deepest sampling points, indicating that surface water sampling may be ineffective for some species in deep lakes. Third, we found a strong spatial signal in the distribution of eDNA from species that prefer the more mesotrophic conditions of the lake's North Basin, compared to those that are associated with the more eutrophic conditions of the South Basin. This indicates that eDNA provides a contemporary signal, at least to some extent, of the fish distribution, and that eDNA is promising for ecological assessment of water bodies. 
Moreover, eDNA abundance data consistently correlated with rank abundance estimates from established surveys, demonstrating, together with other studies (e.g., Evans et al., 2015; Thomsen et al., ) that at least semiquantitative estimates could potentially be obtained from eDNA data. Critical questions remain about the spatial and temporal distribution of eDNA in order to better understand the ecology of eDNA and design the most effective strategy for future monitoring programs. For example, (1) how does eDNA distribution vary between seasons, (2) is shoreline sampling more effective than offshore sampling for species detection, and (3) how do abundance estimates from eDNA compare to those from established methods carried out at the same time? We explore each of these questions in this study, by adding data from summer and winter sampling campaigns on Windermere.
There are several reasons why eDNA distribution might vary at different times of the year, including patterns of water mixing, fish behavior and distribution, and different rates of DNA degradation. In our previous study, water samples were collected in winter, when lakes are unstratified and water is extensively mixed in the vertical dimension (Hänfling et al., ). During summer, deeper lakes are stratified and show strong vertical gradients in temperature, while most fish species are also likely to be present and active. Assessing temporal variability is crucial for determining the repeatability of eDNA based methods, but seasonality of eDNA signal has so far been little explored (but see e.g., Sigsgaard et al., ; Tillotson et al., ). Here, we test the hypothesis that there will be a stronger spatial structure of eDNA in the summer compared to winter.
Shoreline sampling is an attractive option for biodiversity monitoring as it avoids the costs, specialist training, access to equipment, and health and safety considerations associated with boat‐based work. To investigate whether shoreline sampling is adequate for detection of most species, we collected samples from the entire perimeter of Windermere and compared shoreline samples to those from offshore transects. We hypothesized that more species would be detected in the shoreline samples, and that fewer samples would be needed for species detection, relative to offshore samples. We also predicted that this effect would be greatest in the summer, due to greater spatial structure as discussed above.
Obtaining accurate estimates of species abundance and biomass remains arguably the greatest challenge for eDNA applications due to the large number of factors that influence DNA dynamics (Barnes & Turner, ; Barnes et al., ) and the many opportunities for bias during sampling, laboratory, and bioinformatics workflows (Valentini et al., ). In our previous study, we tested the efficacy of both sequence read count and site occupancy (i.e., the proportion of samples in which a species was detected) for assessing relative abundance (Hänfling et al., ). Encouragingly, both measures were significantly correlated with rank abundance, but comparative established survey data were based on historical datasets (up to September 2014) and expert opinion, and further work is needed to determine how robust eDNA is for estimating abundance. To explore this further, and ensure that comparisons between methodologies are as robust as possible, we performed eDNA sampling at the same time as the annual gill‐net survey in September 2015.
In this study, eDNA samples were collected from Windermere along three offshore transects and the entire shoreline in September 2015 and January 2016, and along depth profiles (September only), then data combined with that from January 2015 (Hänfling et al., ) in order to: (1) examine the temporal repeatability of eDNA metabarcoding for lake fish communities across seasons (summer and winter) and years (2015–2016), (2) compare eDNA results with data from gill‐net and hydroacoustic surveys carried out at the same time of sampling, (3) test the hypothesis of greater spatial structure of eDNA in summer compared to winter due to water stratification in summer and breakdown in winter, and (4) robustly compare the effectiveness of shore and offshore sampling locations for species detection.
Windermere is 16.9 km in length, with a surface area of 1,480 ha. The lake is divided into two separate basins: North and South Basin, by a shallow area with islands. North Basin is classed as mesotrophic and has a maximum depth of 64 m. South Basin is more eutrophic and has a maximum depth of 44 m. Lake stratification typically begins in April and persists to November, during which period, the thermocline usually occurs at a depth of between 10 and 20 m.
Gill‐netting surveys were carried out between 1 and 3 September 2015 at five sites (including a surface site directly above a deep‐water bottom site) in each of the two Windermere basins, as described in detail by Winfield et al. (). We previously summarized the fish species presence and abundance for Windermere based on a literature review and IJW's expert opinion (Hänfling et al., ). Each of the 16 previously recorded species was assigned a relative long‐term abundance score ranking from 1 (most common) to 16 (least common, Table S1). The same rank classification is adopted in the present study.
Two sampling events were carried out in Windermere during summer (9–13 September 2015) and winter (26–28 January 2016). Two‐liter water samples, comprised of 5 × 400 ml pooled subsamples, were collected as described in (Hänfling et al., ). Offshore samples were collected from a boat using a Friedinger sampler, along three transects with approximately 1‐km sampling interval between sites, at the same locations sampled in (Hänfling et al., ) (Figure , Table S2). Transect 1 follows the 5‐m depth contour (green dots in Figure , N = 16), transect 2 follows the 20‐m depth contour (red circles, N = 14), and transect 3 follows the lake midline (blue and purple circles, N = 15). This sampling scheme covered seven of the 10 sites that are used for annual gill‐net surveys and samples were also collected at the remaining three gill‐net sites (black triangles, Figure ). Water samples were collected at approximately half the water depth (i.e., nominally in the metalimnion) at each of the offshore sites. In our previous study, depth profiles were collected at 10 m intervals at the deepest points in the North and South Basins in January 2015 (Hänfling et al., ). During the September 2015 sampling, samples were collected at 11 sites (purple circles, which include the two sites sampled in January, Figure ) from the midline transect from the surface (epilimnion), mid depth (metalimnion), and approximately 2 m above the lake bottom (hypolimnion) in order to investigate the effects of stratification on eDNA distribution. The Friedinger sampler was sterilized between samples by washing in a 10% commercial bleach solution (containing ~3% sodium hypochlorite) followed by 10% microsol detergent (Anachem, UK) and rinsed with purified water. Sampling blanks were collected after approximately every eight samples by running 2 L of purified water through the Friedinger sampler after sterilization (N = 9 for September 2015 and 7 for January 2016). 
Shore samples were collected directly into sterile 2‐L plastic bottles. The 40 shoreline sample sites were approximately 1 km from each other and aligned with the offshore transects as far as this was possible based on accessibility. All samples were stored on ice in a cooler prior to filtration. The total number of samples excluding blanks was 109 (N = 69 offshore and 40 shore) in September 2015 and 87 (N = 47 offshore and 40 shore) in January 2016 in addition to the 78 (72 offshore and 6 shore) samples collected by (Hänfling et al., ).
Distribution of shore and offshore sampling sites in Windermere. Colored dots correspond to the following sample types: red, 20‐m offshore transect; green, 5‐m offshore transect; orange, gill‐net survey sites; blue, midline offshore transect sites; and purple, sites on midline transect where depth profiles were taken in September 2015; yellow, shoreline sites; black triangles, additional sites adjacent to gill‐net survey sites that were separate from the main transects. The gray dashed line on the inset map roughly corresponds to the division between North and South Basins. Sample coordinates are provided in Table S2
Water filtration was carried out at the Freshwater Biological Association laboratories at Windermere, within 8 hr of collection, in a laboratory that does not handle fish. Samples were filtered through 0.45‐μm cellulose nitrate filters and pads (47 mm diameter, Whatman, GE Healthcare, UK) using Nalgene filtration units in combination with a vacuum pump. In a previous study, we demonstrated that 0.45‐μm cellulose nitrate filters are suitable for fish metabarcoding, with low variation and high repeatability between filtration replicates (Li, Lawson Handley, Read, & Hänfling, ). All filtration equipment was sterilized in 10% commercial bleach solution for 10 min, followed by rinsing in 10% microsol and purified water after each filtration. Filtration blanks were run before the first filtration and then after every 10 samples (N = 35), in order to test for possible contamination at the filtration stage. Filters were stored in sterile 50‐mm petri dishes, sealed with parafilm, at −20°C until DNA extraction. DNA was extracted from filters using the PowerWater DNA Isolation Kit (MoBio Laboratories, Carlsbad) following the manufacturer's protocol, including a final elution step in 100 μl.
DNA samples were amplified at two mitochondrial regions: 12S rRNA (12S, 106 bp, Kelly, Port, Yamahara, & Crowder, ; Riaz et al., ) and cytochrome b (CytB, 414 bp, Kocher et al., ) using 16 individually tagged forward primers and 24 individually tagged reverse primers, with one‐step library preparation as described in (Hänfling et al., ) but with minor modifications. PCR reactions contained 0.5 μM each primer, 200 mM dNTPs, 12.5 µl Q5® High‐Fidelity 2X Master Mix (New England Biolabs), and 2.5 μl template DNA. PCR profiles were as follows: 98°C for 30 s followed by 35 cycles of 98°C for 10 s, 58°C (12S)/50°C (CytB) for 15 s, and 72°C for 20 s, and a final extension step of 72°C for 5 min. PCR negative controls included each primer at least once (N = 40). Each PCR reaction was carried out in triplicate and pooled in order to reduce potential bias through stochastic variation during the PCR step. PCR products were checked on ethidium bromide‐stained agarose gels. Each set of samples was normalized for concentration across the samples using the Life Technologies SequalPrep Normalization Plate Kit and subsequently pooled to make a single sequencing library for each assay (12S and CytB). Samples were split across two libraries per locus (hence four libraries in total). Each library was quantified using the Qubit HS DNA Quantification Kit (ThermoFisher) and sequenced on an Illumina MiSeq using V3 2 × 300 bp chemistry at 8 pM concentration including a 10% addition of PhiX.
Raw read data for all four libraries have been submitted to NCBI (BioProject: PRJNA482277, SRA Study: SRP154799,
Filtered data were summarized in two ways for downstream analyses: the number of sequence reads per species divided by the total number of reads per sample (normalized read counts, which excludes negative controls) and the proportion of sites occupied by a species (site occupancy). To reduce the possibility of false positives, we only regarded a species as present at a given site if its sequence frequency exceeded a certain threshold level (proportion of all sequence reads in the sample) which was established in (Hänfling et al., ) as 0.1% and 0.2% for 12S and CytB, respectively.
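The two summaries described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names, the dictionary layout, and the toy read counts are assumptions introduced here, and "exceeded" is read as a strict inequality.

```python
# Detection thresholds quoted in the text: 0.1% (12S) and 0.2% (CytB).
THRESHOLDS = {"12S": 0.001, "CytB": 0.002}


def normalized_reads(sample_counts):
    """Return each species' share of the total reads in one sample."""
    total = sum(sample_counts.values())
    if total == 0:
        return {species: 0.0 for species in sample_counts}
    return {species: n / total for species, n in sample_counts.items()}


def detected_species(sample_counts, locus):
    """Species whose read proportion exceeds the locus threshold."""
    threshold = THRESHOLDS[locus]
    proportions = normalized_reads(sample_counts)
    return {sp for sp, p in proportions.items() if p > threshold}


def site_occupancy(samples, locus):
    """Proportion of samples in which each species is detected."""
    species = {sp for counts in samples for sp in counts}
    hits = {sp: 0 for sp in species}
    for counts in samples:
        for sp in detected_species(counts, locus):
            hits[sp] += 1
    return {sp: hits[sp] / len(samples) for sp in species}


# Toy read counts, invented for illustration (not data from the study).
samples = [
    {"perch": 9000, "roach": 995, "rudd": 5},  # rudd is 0.05% of reads
    {"perch": 5000, "roach": 0, "rudd": 0},
]
occupancy = site_occupancy(samples, "12S")
```

In this toy case rudd falls below the 12S threshold in the first sample and is absent from the second, so its occupancy is zero even though a few reads were assigned to it.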
The relationship between eDNA data and data from established surveys (rank abundance by numbers or rank biomass based on long‐term expert opinion, and biomass estimates from September 2015 gill‐net surveys) was investigated by calculating Spearman's Rho (for rank correlations) and Pearson's Product‐moment correlation coefficient (for biomass) in R v3.1.3 (R Development Core Team, ). Data were plotted by fitting a smoothed linear model with the function geom_smooth(method = lm) in ggplot2 (Wickham, ).
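For readers who want to see what the rank correlation computes, the following is a minimal pure-Python sketch of Spearman's rho as the Pearson correlation of average ranks. It is not the R code used in the study; the function names and the five-species toy data are assumptions. The S statistic is computed as (n³ − n)(1 − rho)/6, the relationship that the S values in the correlation table satisfy for n = 14.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (invented): long-term rank 1 = most common species.
long_term_rank = [1, 2, 3, 4, 5]
occupancy = [0.9, 0.7, 0.8, 0.2, 0.1]

rho = spearman_rho(long_term_rank, occupancy)
n = len(long_term_rank)
S = (n**3 - n) * (1 - rho) / 6
```

Because rank 1 denotes the most common species, a species that is common and widely detected contributes to a negative rho, which is why the correlations reported in this study are negative.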
The direct comparison between eDNA data and contemporary gill‐net data was based on 10 sites that had complete data for both eDNA and gill‐netting surveys. Only species detected in the gill‐netting surveys were included in this analysis. Normalized read counts per species were calculated by summing the total read count per species for the 10 sites, and dividing by the total read count for these species and sites. Similarly, biomass estimates from gill‐netting data were normalized by dividing the total biomass for a species by the total biomass for all species.
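The pooled normalization described above applies equally to read counts and to biomass, so one helper can serve both. The sketch below is a hedged illustration: `pooled_proportions`, the dictionary layout, and the numbers are hypothetical and are not taken from the study.

```python
def pooled_proportions(per_site_values, species):
    """Sum a quantity per species over all sites, then divide by the grand total.

    Only the listed species are considered, mirroring the restriction to
    species that were detected in the gill-net survey.
    """
    totals = {
        sp: sum(site.get(sp, 0) for site in per_site_values) for sp in species
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no reads or biomass recorded for the listed species")
    return {sp: totals[sp] / grand_total for sp in species}


# Toy read counts for two sites (invented for illustration).
site_reads = [
    {"perch": 700, "roach": 300, "eel": 50},
    {"perch": 800, "roach": 200},
]
gillnet_species = ["perch", "roach"]  # eel is excluded from the comparison

read_share = pooled_proportions(site_reads, gillnet_species)
```

Passing per-site biomass dictionaries instead of read counts would give the normalized biomass on the same scale, so the two vectors can be correlated directly.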
The analyses were repeated for both loci and on different hierarchical levels: (i) all Windermere samples, (ii) basins (North and South), (iii) transects within basins, and (iv) depth profiles within transect to investigate the spatial and temporal variation in eDNA distribution. Finally, sample‐based rarefaction (Gotelli & Colwell, ) was used to determine the number of samples needed to accurately represent the species assemblage. Rarefaction was performed using the functions rich and rarc with 499 randomizations in the R package Vegan v2.4–4 (Oksanen, ) for both loci, shore and offshore samples, and summer and winter.
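The idea behind sample-based rarefaction can be illustrated with a simple randomization: shuffle the order of the samples many times and average the cumulative number of species seen after each additional sample. The sketch below is a generic permutation approach written for illustration; it is not the R implementation used in the study, and the function name, the seed, and the toy detections are assumptions.

```python
import random


def accumulation_curve(samples, n_randomizations=499, seed=0):
    """Mean cumulative species richness against number of samples drawn.

    `samples` is a list of sets, each holding the species detected in one
    sample. The sample order is shuffled `n_randomizations` times.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0] * n
    for _ in range(n_randomizations):
        order = samples[:]  # copy so the caller's list is not reordered
        rng.shuffle(order)
        seen = set()
        for k, detected in enumerate(order):
            seen |= detected
            totals[k] += len(seen)
    return [t / n_randomizations for t in totals]


# Toy detections for four samples (invented for illustration).
samples = [
    {"perch", "roach"},
    {"perch"},
    {"perch", "eel"},
    {"roach", "pike"},
]
curve = accumulation_curve(samples)
```

The last point of the curve always equals the total number of species across all samples; how quickly the earlier points approach it indicates how many samples are needed before the curve plateaus.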
Gill‐netting surveys detected five species and a single hybrid individual (roach, Rutilus rutilus x common bream, Abramis brama, Table S1). Similar numbers of fish were caught in the North and South Basins (N = 681 and 709 respectively). Perch, Perca fluviatilis, was by far the most abundant species in both basins (N = 517 and 644 in North and South Basins, respectively) in terms of both numbers and biomass (Table S1). Roach was also found in both basins but at higher numbers in the North Basin (N = 161 compared to N = 61 for the South Basin). Brown trout, Salmo trutta (N = 2), and pike, Esox lucius (N = 1), were found in the North Basin only, while common bream (N = 2) was detected at a single site in the South Basin.
Libraries generated between 1.75 and 2.84 Gbp data and had average %≥Q30 scores of 75.40–75.54 (CytB 2.84 Gbp, Q30 75.40, 12S, 1.75 Gbp, Q30 75.54). Sequencing libraries contained on average 18.37 million raw reads (September: 17.81 million for CytB, 30.25 million for 12S; January 5.54 million for CytB, 20.78 million for 12S), of which, an average of 13.46 million reads passed filter (September: 16.62 million for CytB, 19.77 million for 12S; January: 4.62 million for CytB, 11.64 million for 12S). After quality filtering and removal of chimeric sequences, the average read count per sample (excluding controls and samples sequenced for other projects) over all four libraries was 38,124 (average total read counts by library: September: 85,003 for CytB, 32,797 for 12S; January: 22,223 for CytB, 12,474 for 12S). The average number of fish sequences per sample over all four libraries was 30,012 (average fish read counts per library: September: 84,560 for CytB, 20,254 for 12S; January: 10,745 for CytB, 4,488 for 12S). A similar average January fish read count was obtained in our previous dataset from January 2015 (8,219 for CytB, 6,842 for 12S, Hänfling et al., ) indicating lower fish read count in the winter months. Full run metrics are provided in Table S3.
Negligible amounts of contamination were found in the January samples for both loci, with a total of just 19 reads over 30 negative control samples for 12S and 7 reads over 32 samples for CytB. Contamination can therefore be confidently ruled out for these samples. In the September CytB library, 324 reads were detected over 14 PCR negatives and 195 reads were detected in 22 sample and filtration blanks. These reads were almost exclusively assigned to perch, and the maximum number of reads per sample was 47. This indicates a very low level of perch contamination in the September CytB dataset. In the September 12S library, a total of 107 reads was detected over 35 PCR‐negative controls. Roach was detected in 7, perch detected in 6 and minnow, Phoxinus phoxinus, detected in 5 PCR negatives, but the maximum number of reads per sample, per species was just 12, suggesting contamination is negligible at the PCR stage. Of 23 sample/filtration blanks in this library, 9 had zero sequence reads; however, notable evidence of contamination (i.e., sequence reads in the order of 1,000) was found in the remaining 14 blanks. A total of nine species was detected, three of which (tench, Tinca tinca; roach; and brown trout, Salmo trutta) are known to be present in Windermere. It is therefore important to bear in mind that the read counts for these species may be inflated in the actual samples for this library. The other six species detected (Crucian carp, Carassius carassius; gudgeon Gobio gobio; common bleak, Alburnus alburnus; mudminnow, Umbra pygmaeus; common carp, Cyprinus carpio; and chub, Squalius cephalus) have not been recorded by established surveys in Windermere. However common carp and mudminnow were detected with 12S at one site each in our January 2015 sampling (Hänfling et al., ). Because of the ambiguity introduced by this contamination issue, we restrict the results to species that have been previously confirmed in the lake.
Detection of previously recorded species was generally comparable between loci and seasons, with some exceptions. Of the 16 species recorded with established methods (i.e., all species excluding the lampreys: river lamprey, Lampetra fluviatilis and sea lamprey, Petromyzon marinus), 14 were detected in total (Figure and Figures S1 and S2). All of these species were detected in both winter sampling campaigns with 12S. Eight species (perch; roach; Arctic charr; pike; brown trout; eel Anguilla anguilla; bullhead, Cottus gobio; and common bream) were detected with both markers in both basins in all three sampling events (Figure , and Figures S1 and S2). All five species detected in the September 2015 gill‐net survey were detected in the eDNA data.
Species detection in January 2015, September 2015, and January 2016 based on site occupancy, for Windermere North (A) and South (B) Basins. Species are ordered according to their long‐term rank within the basins, with perch the most abundant and rudd the least abundant in both basins. Smoothed curves were fitted with a linear model (see Table for results of correlations)
Some differences were observed between markers (Figure and Figures S1 and S2). Tench and rudd, Scardinius erythrophthalmus, were not detected with CytB. Occupancy for some species (e.g., roach) was consistently higher with 12S (Figure a,b and Figures S1 and S2a,b) than with CytB (Figure c,d and Figures S1 and S2c,d). Stone loach, Barbatula barbatula, was detected in all three sampling events and both basins with 12S, but was only detected in January 2015 with CytB. In general, there was more consistency in site occupancy between sampling events with 12S than CytB. Note that the occupancy of tench in the September 2015 12S data could be inflated by contamination and should therefore be interpreted with caution.
There were also some notable differences in detection between seasons and between basins (Figure and Figures S1 and S2). For example, detection of some species (e.g., pike, eel) was consistently greater in summer than in winter, while bullhead had higher detection rates in the winter (most notably in North Basin). Rudd, the rarest of the 14 detected species, was only detected in the winter (with 12S in the South Basin). Differences between basins, observed in previous work, were confirmed. In particular, common bream had higher site occupancy in the South Basin compared to North Basin in all seasons (Hänfling et al., ), but was more common in the North Basin and less common in the South Basin in summer compared to winter. Detection of Atlantic salmon was also higher in winter than in summer and in North Basin compared to South Basin.
In spite of the observed differences between loci and seasons discussed above, correlations between eDNA data and long‐term rank were highly consistent between seasons and between loci (Figure and Table ). Of 24 correlations between eDNA data and long‐term rank, 23 were significant (Table ). Similar results were found for both site occupancy and read count (Table ). For site occupancy, correlations were generally stronger for 12S than for CytB (12S rho = −0.695 to −0.905, p ≤ 0.006; CytB rho = −0.584 to −0.795, p < 0.05).
Results of Spearman's rank correlations between long‐term rank and read count (proportion of total read count) and site occupancy
Sampling event | Locus | Basin | Read count | Site occupancy |
Jan 2015 | 12S | North | rho = −0.710, S = 778, p = 0.006 | rho = −0.758, S = 799.9, p = 0.002 |
Sep 2015 | 12S | North | rho = −0.660, S = 755.33, p = 0.010 | rho = −0.695, S = 771.35, p = 0.006 |
Jan 2016 | 12S | North | rho = −0.612, S = 733.31, p = 0.020 | rho = −0.733, S = 788.58, p = 0.003 |
Jan 2015 | 12S | South | rho = −0.793, S = 816, p = 0.001 | rho = −0.722, S = 783.45, p = 0.004 |
Sep 2015 | 12S | South | rho = −0.818, S = 827.41, p = 0.0003 | rho = −0.905, S = 866.91, p = 8.474e−06 |
Jan 2016 | 12S | South | rho = −0.798, S = 818, p = 0.001 | rho = −0.745, S = 794.12, p = 0.002 |
Jan 2015 | CytB | North | rho = −0.422, S = 647.21, p = 0.132 | rho = −0.584, S = 720.88, p = 0.028 |
Sep 2015 | CytB | North | rho = −0.589, S = 723.18, p = 0.027 | rho = −0.736, S = 789.84, p = 0.003 |
Jan 2016 | CytB | North | rho = −0.536, S = 698.69, p = 0.048 | rho = −0.633, S = 743.18, p = 0.015 |
Jan 2015 | CytB | South | rho = −0.748, S = 795.5, p = 0.002 | rho = −0.777, S = 808.73, p = 0.001 |
Sep 2015 | CytB | South | rho = −0.747, S = 794.75, p = 0.002 | rho = −0.795, S = 816.61, p = 0.0007 |
Jan 2016 | CytB | South | rho = −0.681, S = 764.89, p = 0.007 | rho = −0.707, S = 776.51, p = 0.005 |
CytB: cytochrome b.
Sequence read counts were positively correlated with biomass of the five species detected in the September 2015 gill‐net surveys for both 12S (Pearson's product moment correlation coefficient r = 0.911, t = 3.837, df = 3, p = 0.031, Figure S3a) and CytB (r = 0.935, t = 4.572, df = 3, p = 0.019, Figure S3b).
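The reported t statistics follow from r and the degrees of freedom through the standard relation t = r·sqrt(df / (1 − r²)) with df = n − 2. The short sketch below recomputes them from the rounded r values quoted above; the function name is hypothetical, and small differences from the reported values are expected because the published r values are rounded to three decimals.

```python
from math import sqrt


def pearson_t(r, n):
    """t statistic and degrees of freedom for testing a zero correlation."""
    df = n - 2
    return r * sqrt(df / (1 - r * r)), df


# Five species were compared, so n = 5 and df = 3.
t_12s, df_12s = pearson_t(0.911, 5)
t_cytb, df_cytb = pearson_t(0.935, 5)
```

With these inputs the results are approximately 3.83 for 12S and 4.57 for CytB, in line with the reported 3.837 and 4.572.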
We noted above that differences in spatial distribution were observed between North and South Basins for species such as common bream. Here, we focus on the comparison of shore, offshore, and depth transects along the entire lake (Figure ) for summer (September) and winter (January). For perch, roach, pike, brown trout, eel, and tench, the distribution of eDNA is uniform between transects, and this observation is repeatable between seasons (Figure ). By contrast, strong spatial structuring was observed in the summer for some species. Most notably, Arctic charr was only detected in the offshore transects in the summer, and occupancy increased from the 5 m to midline transect (i.e., with depth), whereas in winter, this species was detected in all four transects (Figure ). The reverse summer pattern was observed for minnow, bullhead, stone loach, and three‐spined stickleback, Gasterosteus aculeatus, which were predominantly detected in the shoreline and shallow transects and not detected in the midline. The winter distribution of these species' eDNA was more uniform between transects, with all four species detected in all four transects. Species detection was very similar between the east and west shoreline (which are therefore combined in Figure ), with the exception that three‐spined stickleback were only detected in the east shoreline in summer.
Spatial distribution of eDNA in Windermere for September 2015 and January 2016. Species are ordered according to long‐term rank. Rows correspond to the four transects: shoreline transect, 5‐m transect, 20‐m transect, and midline transect (see Figure for details)
A total of 11 species was detected in the 11 midline transect sites where depth profiles were taken in September 2015 (Figure ). The distribution of eDNA at three different depths showed little difference in site occupancy for perch, roach, pike, brown trout, common bream, and eel. By contrast, minnow, bullhead, and stickleback were only detected in the surface water, and Arctic charr was only found in the midwater and bottom sample. Tench was not detected in the epilimnion (but note the detection of tench in other samples may be influenced by contamination as discussed above). The results are also shown in Figure for the two depth profiles collected in January 2015. Again 11 species were detected, but this time salmon was detected and tench was not. In contrast to the results for September, minnow, bullhead, and stickleback were detected throughout the water column. Arctic charr was again restricted to the bottom and mid lake.
Vertical distribution of eDNA in Windermere from sites sampled at the midline in September 2015 (11 sites) and January 2015 (2 sites). Species are ordered according to long‐term rank. Rows correspond to the three sampling depths: surface (epilimnion), mid (metalimnion), and bottom (hypolimnion)
Species accumulation curves based on sample‐based rarefaction plateaued consistently higher for 12S than CytB (Figure ) and curves for shore samples plateaued earlier than for offshore samples in summer (Figure a) but not in winter (Figure b). The 12S offshore and shore curves plateaued at 10 samples for winter, but in summer around 20 offshore samples were needed to detect the same number of species (Figure b). In summer, the 12S shore curve plateaued strongly after 6–10 samples, when 11/14 species (79% of the species diversity) had been captured, whereas the offshore curve continued to increase (Figure a). For CytB, offshore and shore curves also start to plateau around 10 samples in winter, when 8–9 species have been captured (57%–64% of the species diversity, Figure b). In summer, more than 20 shore samples are needed to recover the same number of species detected with just 6 samples sequenced with 12S (Figure a).
Species accumulation curves based on sample‐based rarefaction for Windermere in (a) summer 2015 and (b) winter 2016. Shore (gray) and offshore (black) samples were analyzed separately for 12S (circles) and CytB (diamonds). Shading corresponds to the number of samples needed for optimal species detection
Few studies have so far explored the spatiotemporal variation in eDNA distribution in aquatic environments. Here, we carried out rigorous spatial sampling in England's largest lake over three temporal replicates to determine the level of repeatability in detection and abundance estimation of lake fish species with eDNA metabarcoding. Our analyses demonstrated that species detection and estimation of rank abundance is highly repeatable between seasons, but highlighted some important considerations for design of future fish biodiversity surveys in lakes, which reflect species ecology and seasonal dynamics of aquatic environments.
In our previous study, carried out in winter 2015, 14 of the 16 species confirmed in Windermere using established methods were detected using eDNA (Hänfling et al., ). The same 14 species were detected in winter 2016, and 13 of the species were detected in September 2015, demonstrating strong consistency in species detection across seasons. By comparison, gill‐netting surveys in September 2014 and 2015 found four and five of the most common species respectively (perch, roach, brown trout, pike, in both years, and common bream in 2015). These results add to the growing number of studies that have demonstrated higher detection rates of fish species with eDNA compared to established methods in both freshwater (Valentini et al., ; Civade et al., ; Hänfling et al., ) and marine (Andruszkiewicz et al., ; Miya et al., ; O'Donnell et al., ; Port et al., ; Thomsen et al., ; Yamamoto et al., ) environments.
The only species that were not detected in any sampling campaign were the river and sea lampreys. We have since detected lamprey eDNA in Windermere and other UK lakes, and can therefore rule out the possibility that our assay is unsuitable. River and sea lamprey are likely to be present in Windermere or its immediate tributaries during September, but they are also likely to be rare and their distributions are probably highly localized due to the very specific lotic habitat requirements of the early life stages of these species (Dawson, Quintella, Almeida, Treble, & Jolley, ; Kelly & King, ). In a study of sea lamprey distribution in tributaries of the Laurentian Great Lakes, Gingera et al. () found that detection by eDNA was high (81%–97%) until spawning finished at the end of June, after which it fell to 6% by mid‐August. Taken together, these factors could explain their non‐detection in the present study. In addition to the lampreys, rudd, which is the rarest of the 14 species detected with eDNA and is only present at very low occupancy in the South Basin, was not detected during the September sampling. This non‐detection could be due to greater spatial structure in the lake during the summer months, as discussed under “Spatial and seasonal variation in eDNA distribution in Windermere” below.
It has recently been argued that sample pooling reduces the detection probability of fish species (Sato, Sogo, Doi, & Yamanaka, ); however, this is more applicable to studies that pool samples over large spatial scales, and is compensated for in the present study by the high number of samples collected from across the lake. Although the number of false negatives reported here is very low, it might be possible to reduce this even further by increasing the level of replication (Ficetola et al., ). In this study, we pooled replicates at the sampling (5 × 400 ml volumes) and PCR (3 replicates) stages to reduce the risk of false negatives, while allowing us to sequence a large number of samples within a budget. However, sequencing sample replicates separately would allow more accurate estimation of prevalence, detection probability, and false positive and negative rates using full‐site occupancy modeling (Ficetola et al., ). This should be considered for future improvements of the method, but there will obviously be a trade‐off between increasing levels of replication and cost.
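The effect of replication on false negatives can be illustrated with a deliberately simplified calculation. The sketch below assumes that replicates are independent and share a single per‐replicate detection probability p, so that the chance of at least one detection in k replicates is 1 − (1 − p)^k. This is not the site occupancy model referred to above, which estimates occupancy and detection jointly from separately sequenced replicates; the function names and example values are hypothetical.

```python
def cumulative_detection(p, k):
    """Probability of at least one detection in k independent replicates,
    each with per-replicate detection probability p (simplifying assumption)."""
    if not 0.0 <= p <= 1.0 or k < 0:
        raise ValueError("p must lie in [0, 1] and k must be non-negative")
    return 1.0 - (1.0 - p) ** k


def replicates_needed(p, target=0.95):
    """Smallest number of replicates whose cumulative detection reaches target."""
    if not 0.0 < p <= 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p must lie in (0, 1] and target in (0, 1)")
    k = 1
    while cumulative_detection(p, k) < target:
        k += 1
    return k


# Hypothetical rare species detected in half of all replicates.
print(cumulative_detection(0.5, 3))   # 0.875
print(replicates_needed(0.5, 0.95))   # 5
```

Under these assumptions the gain from each additional replicate shrinks quickly, which is one way to view the trade‐off between replication and cost noted above.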
Obtaining accurate estimates of abundance from eDNA is thought to be challenging because of the complex dynamics of eDNA in the environment and the large number of opportunities for bias during the experimental work (Barnes et al., ; Lawson Handley, ). This is particularly true for eDNA metabarcoding (compared to species‐specific approaches), in which the number of sequence reads for a particular species can be heavily biased by differential primer binding (primer bias, Deiner et al., ; Elbrecht & Leese, ) and/or subsampling of species during library preparation (Deiner et al., ; Leray & Knowlton, ; Shelton et al., ). However, a growing number of studies have demonstrated significant relationships between abundance estimates generated from eDNA and established data (Hänfling et al., ; Thomsen et al., ). Building on previous work (Hänfling et al., ), we found a consistent, statistically significant relationship between rank abundance (inferred from long‐term established data sets) and eDNA data in the form of both site occupancy and normalized read counts. General trends in relative abundance were highly consistent between seasons. The significant relationships demonstrated here are encouraging, but being able to estimate absolute abundance would be preferable to relative abundance. Normalized read counts were positively correlated with biomass of the five species detected in the September 2015 gill‐net surveys, but this was driven, at least in part, by brown trout and pike with low biomass and read count, and perch with very high biomass and read count. One possible option to improve estimates of abundance, without relying on correlations, is the addition of internal standard DNAs followed by use of a copy number correction (Ushio et al., ). In a recent study of marine fish eDNA, corrected copy numbers were significantly correlated with those obtained by qPCR, providing a promising solution to the low level of confidence in abundance estimation from metabarcoding data (Ushio et al., ).
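The comparison between rank abundance and eDNA data described in the paragraph above can be sketched as follows. The example assumes that site occupancy is the fraction of samples with at least one read for a species, that normalized read counts are per‐sample read proportions averaged across samples, and that the association is measured with Spearman's rank correlation; the exact normalization and test used in the study are not given in this section, and all names and values are hypothetical.

```python
from statistics import mean


def site_occupancy(read_table, species):
    """Fraction of samples in which each species has at least one read."""
    n = len(read_table)
    return {
        sp: sum(1 for reads in read_table.values() if reads.get(sp, 0) > 0) / n
        for sp in species
    }


def mean_read_proportion(read_table, species):
    """Per-sample proportion of reads for each species, averaged over samples."""
    result = {}
    for sp in species:
        proportions = []
        for reads in read_table.values():
            total = sum(reads.values())
            proportions.append(reads.get(sp, 0) / total if total else 0.0)
        result[sp] = mean(proportions)
    return result


def average_ranks(values):
    """1-based ranks in ascending order, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors.
    Undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical reads per sample; rank 1 denotes the most abundant species.
reads = {
    "s1": {"perch": 900, "roach": 80, "pike": 20},
    "s2": {"perch": 700, "roach": 300},
    "s3": {"perch": 1000},
}
species = ["perch", "roach", "pike"]
abundance_rank = [1, 2, 3]
occupancy = site_occupancy(reads, species)
rho = spearman_rho(abundance_rank, [occupancy[sp] for sp in species])
print(occupancy, rho)
```

Because rank 1 is assigned to the most abundant species in this sketch, a strong relationship appears as a negative rho; reversing the ranking convention flips the sign.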
Even in lentic water bodies, eDNA is predicted to move away from its source via microcurrents, and this is particularly true in large lakes, which are highly dynamic. Seasonal differences in eDNA distribution are expected in large lakes because of differences in the stratification of the water column between winter and summer. We therefore predicted greater spatial structure—both across the lake surface and at different depths—in eDNA distribution in summer compared to winter.
First, based on our previous results, we predicted a difference in species composition between the North and South Basins of Windermere, which differ in their trophic status. Species that are known to prefer less eutrophic conditions (e.g., Arctic charr, Atlantic salmon, brown trout, minnow, and bullhead) were more restricted to the mesotrophic North Basin, while more eutrophic‐tolerant species (common bream, roach, rudd, tench, and eel) were more common in the eutrophic South Basin (Hänfling et al., ). Species that have no clear trophic association were distributed throughout the two basins (stone loach, pike, perch, three‐spined stickleback; Hänfling et al., ). This demonstrates some spatial structuring even in the winter months, which closely reflects the species ecology. The same broad pattern was confirmed in the summer and winter samples obtained here. In addition, one noteworthy observation is that common bream, which are known to prefer the eutrophic conditions of the South Basin, had lower occupancy in the South Basin in summer relative to winter. The reverse was true for the North Basin, suggesting that bream may be migrating into the North Basin during the summer months. Whether this pattern occurs consistently and, if so, which ecological triggers underlie it warrants further investigation.
Second, we predicted a difference in species detection between the shoreline and offshore samples that reflects the species ecology, with greater spatial structuring in the summer months due to water stratification. eDNA from species that, based on our earlier study (Hänfling et al., ), are expected to be widely distributed in the lake (perch, roach, pike, brown trout, and eel) was detected uniformly across transects in both seasons. However, consistent with our prediction, stronger spatial structuring was observed in summer than in winter for species that are known to have strict habitat preferences. Most notably, Arctic charr, a deep lake species, was only detected offshore in summer, and at much higher occupancy in the midline than in the shallower 5‐m and 20‐m transects. This is consistent with Windermere gill‐net surveys, which never record Arctic charr inshore outside their late autumn and early winter spawning season. The opposite spatial pattern was found for littoral and benthic species (minnow, bullhead, stone loach, and stickleback), which were not detected in the midline transect during the summer and had higher occupancy in the shoreline transect. Three‐spined stickleback eDNA was only found along the east shore of the lake in the summer. Distribution of eDNA was far more uniform in the winter samples, with 12 of the 14 species (all except tench and rudd) detected in all four transects.
Third, we predicted greater spatial heterogeneity in the vertical transects in summer than in winter because of water stratification. The result was similar to that for the horizontal transects discussed above: in summer, eDNA from species with an expected wide distribution (e.g., perch, roach, pike, brown trout, common bream, and eel) was detected at all three depths, whereas eDNA from the deep‐dwelling Arctic charr was only detected in the midwater and bottom samples, and eDNA from the more littoral and benthic species was only detected in the surface water. This indicates that eDNA is, to some extent, spatially structured within the water column, and that sampling only surface waters during periods of water stratification could miss deep‐dwelling species. By contrast, littoral and benthic species were detected throughout the water column in winter. Vertical stratification of eDNA has also been reported in marine environments. For example, in a study of species‐rich coastal waters of Japan, 50% of 128 coastal marine fish species were detected in both surface and bottom samples, whereas the remaining 50% were detected in either surface or bottom samples (Yamamoto et al., ). Similar variation in vertical eDNA distribution has been reported for jellyfish (Minamoto et al., ).
Previous studies have demonstrated that eDNA can persist in the environment over relatively large distances (between approximately 2 and 12 km) in natural river systems (Civade et al., ; Deiner & Altermatt, ), while others have shown that eDNA is more patchily distributed and that the likelihood of detecting a target species may therefore decline over short distances, from a few to hundreds of meters, in ponds or small, shallow lakes (Eichmiller, Bajer, & Sorensen, ; Pilliod, Goldberg, Arkle, & Waits, ) or even coastal environments (O'Donnell et al., ; Port et al., ; Yamamoto et al., ). Our results indicate that the distribution of eDNA within a large, deep lake is patchy, but varies between seasons, with greater heterogeneity in the summer months when lake water is less mixed. Further work is needed to investigate the impacts of microhabitats within the lake and the scale of spatial autocorrelation.
Small but important differences between seasons and transects, as well as between loci, were demonstrated by the sample‐based rarefaction analyses. Previously, based on the winter 2015 sample, we inferred that 10–25 samples detected the majority (≥85%) of the total species detected (Hänfling et al., ). The new results broadly support this estimate, but provide important additional insights. First, there is a clear difference between the loci in terms of the number of species recovered, with greater power for species detection demonstrated by 12S than by CytB. Second, species accumulation curves plateaued earlier for winter than for summer, suggesting that fewer samples may be needed in winter to detect the same number of species. Approximately 10 samples are needed in winter to recover ≥85% of the species detected, whereas in summer, 10 samples recover only ~70% of the total species detected by each marker. Finally, there is very little difference between offshore and shore sampling in the winter in terms of the number of species detected. However, there is a notable difference in summer for 12S: the shore curve plateaus strongly at approximately six samples, whereas the offshore curve continues to rise. This is consistent with our observations from the transects (i.e., detection of Arctic charr only in deep water during the summer) and indicates that sampling only the shore in the summer may miss species detected at other times of year. To summarize, for our study site, 6–10 shore samples collected in winter and sequenced with 12S are recommended to detect the maximum number of species with minimum sampling effort.
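The question of how many samples are needed to recover a given share of the species detected can also be approached by resampling. The sketch below shuffles the order of samples many times and records, for each order, how many samples are required before the cumulative species list reaches a target fraction (0.85 here, echoing the threshold above). It is a randomization‐based illustration rather than the analysis used in the study, and the function name, parameters, and example data are hypothetical.

```python
import random


def samples_to_reach(sample_species, fraction=0.85, n_orders=200, seed=0):
    """Mean number of samples needed, over random sample orders, to detect
    at least `fraction` of all species found across `sample_species`.

    sample_species: list of sets, one set of detected species per sample.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    all_species = set().union(*sample_species)
    if not all_species:
        raise ValueError("no species detected in any sample")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    needed = []
    for _ in range(n_orders):
        order = list(sample_species)
        rng.shuffle(order)
        seen = set()
        for count, detected in enumerate(order, start=1):
            seen |= detected
            if len(seen) / len(all_species) >= fraction:
                needed.append(count)
                break
    return sum(needed) / len(needed)


# Hypothetical well-mixed case: every sample contains every species.
mixed = [{"perch", "roach", "pike"} for _ in range(10)]
print(samples_to_reach(mixed))  # 1.0

# Hypothetical structured case: one species occurs in a single sample only.
structured = [{"perch", "roach"} for _ in range(9)] + [{"perch", "charr"}]
print(samples_to_reach(structured, fraction=1.0))
```

In the well‐mixed example a single sample suffices, whereas in the structured example the answer depends on when the one sample containing the localized species is drawn, which mirrors the contrast between winter and summer described above.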
In summary, we have demonstrated that species detection and estimation of relative abundance of lake fish with eDNA is repeatable between seasons, but there are important spatial and seasonal differences that need to be considered for optimal species detection and abundance estimation. This adds to the growing body of evidence that eDNA is not homogeneously distributed in time or space and can provide an accurate description of aquatic communities (Macher & Leese, ; O'Donnell et al., ; Stoeckle, Soboleva, & Charlop‐Powers, ). To maximize the number of species that can be detected, while minimizing the costs and effort associated with sampling, we recommend shoreline sampling in the winter and sequencing with 12S, since this assay outperformed CytB in terms of species detection. Following this sampling strategy, 6–10 samples are needed to detect the majority of species known to be present in Windermere. However, if abundance estimation is required, it makes more sense to collect as many spatially representative samples as possible. Although we found a consistent, significant correlation between rank abundance and eDNA (site occupancy or read count) between seasons, summer sampling, when eDNA is more patchy in distribution, may be preferable (at least in principle) for abundance estimation, as site occupancy will more accurately reflect species presence or absence. The minimum number of samples needed to accurately estimate abundance remains to be explored.
This work was funded by the Scottish Environmental Protection Agency (SEPA, contract JUL213921). HJ was funded through the Higher Education Innovation Fund. DSR was supported by the Natural Environment Research Council award number NE/R016429/1 as part of the UK‐SCAPE programme delivering National Capability. We are very grateful to Alistair Duguid, Willie Duncan, and Sean Morrison from SEPA, and Kerry Walsh and Graeme Peirson from the Environment Agency for support. Ben James and Janice Fletcher helped with the offshore sample collection.
The authors declare no conflict of interest.
LLH, BH, IJW, and DSR designed the research and wrote the paper. LLH, BH, DSR, IJW, HK, HJ, JL, CH, RB, RW, and RD performed the research. AS helped to develop the bioinformatics pipeline.
Raw read data for all four libraries have been submitted to NCBI (BioProject: PRJNA482277, SRA Study: SRP154799,
© 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Environmental DNA offers great potential as a biodiversity monitoring tool. Previous work has demonstrated that eDNA metabarcoding provides reliable information for lake fish monitoring, but important questions remain about temporal and spatial repeatability, which is critical for understanding the ecology of eDNA and developing effective sampling strategies. Here, we carried out comprehensive spatial sampling of England's largest lake, Windermere, during summer and winter to (1) examine repeatability of the method, (2) compare eDNA results with contemporary gill‐net survey data, (3) test the hypothesis of greater spatial structure of eDNA in summer compared to winter due to differences in water mixing between seasons, and (4) compare the effectiveness of shore and offshore sampling for species detection. We find broad consistency between the results from three sampling events in terms of species detection and abundance, with eDNA detecting more species than established methods and being significantly correlated with rank abundance determined by long‐term data. As predicted, spatial structure was much greater in the summer, reflecting less mixing of eDNA than in the winter. For example, Arctic charr, a deep‐water species, was only detected in deep, midlake samples in the summer, while littoral or benthic species such as minnow and stickleback were more frequently detected in shore samples. By contrast, in winter, the eDNA of these species was more uniformly distributed. This has important implications for the design of sampling campaigns; for example, deep‐water species could be missed and littoral/benthic species overrepresented by focusing exclusively on shoreline samples collected in the summer.
1 Evolutionary and Environmental Genomics Group (EvoHull), School of Environmental Sciences, University of Hull, Hull, United Kingdom
2 Centre for Ecology & Hydrology, Crowmarsh Gifford, Oxfordshire, United Kingdom
3 Lake Ecosystems Group, Centre for Ecology & Hydrology, Lancaster Environment Centre, Bailrigg, United Kingdom
4 Evolutionary and Environmental Genomics Group (EvoHull), School of Environmental Sciences, University of Hull, Hull, United Kingdom; Institute of Zoology, University of Graz, Graz, Austria
5 Evolutionary and Environmental Genomics Group (EvoHull), School of Environmental Sciences, University of Hull, Hull, United Kingdom; The Dead‐Sea and Arava Science Center, Tamar regional council, Israel