Introduction
River temperature () is a key control on the health of aquatic systems (Webb et al., 2008) and is particularly important for the growth, survival and demographic characteristics of cold-water-adapted species such as salmonids (Elliott and Elliott, 2010; Gurney et al., 2008; Jonsson and Jonsson, 2009; McCullough et al., 2001). Rising will influence fish populations by altering the thermal suitability of rivers (Comte et al., 2013; Isaak et al., 2010, 2012). Thus, models that can (1) identify areas most affected by thermal extremes, (2) improve understanding of spatiotemporal variability of thermal regimes, (3) predict the potential effects of climate change and (4) illustrate opportunities for thermal moderation, such as riparian tree planting (Hannah et al., 2008; Hrachowitz et al., 2010), are important for fisheries management. Large-scale models are required to provide information at the spatial scales appropriate to management decisions, i.e. catchment (Chang and Psaris, 2013; Hrachowitz et al., 2010; Imholt et al., 2011, 2013; Jackson et al., 2017b; Steel et al., 2016), regional (Hill et al., 2013; Isaak et al., 2012; Ruesch et al., 2012) and national scales.
Although process-based models provide important mechanistic understanding at small spatial scales, their intensive data requirements prohibit their use at larger scales (Jackson et al., 2016). In contrast, empirical models of rely on observations and explanatory covariates (e.g. altitude or air temperature) which can often be derived remotely at relatively low cost. The development of affordable, reliable, accurate data loggers has led to a rapid increase in monitoring (Sowder and Steel, 2012), to the point that staff time, data storage and quality control are often now the greatest limitations on data collection (Jackson et al., 2016). At the same time, there have been substantial developments in spatial statistical modelling approaches (Ver Hoef et al., 2006, 2014; Ver Hoef and Peterson, 2010; Isaak et al., 2014; Jackson et al., 2017b; O'Donnell et al., 2014; Peterson et al., 2013; Rushworth et al., 2015), monitoring network design (Dobbie et al., 2008; Jackson et al., 2016; Som et al., 2014), spatial datasets (e.g. shapefiles incorporating covariates such as in “The National Stream Internet Project” (Isaak et al., 2011) or gridded air temperature datasets (Perry and Hollis, 2005a, b)) and spatial analysis tools (Isaak et al., 2011, 2014; Peterson et al., 2013; Peterson and Ver Hoef, 2014).
While continuous river temperature data are routinely collected in some areas, resulting in large regional temperature datasets and associated models (e.g. Wehrly et al., 2009; Moore et al., 2013), this is far from universal. In many cases, financial and logistical considerations limit data collection, making it impractical to monitor all rivers. For example, in Scotland, there are 16 006 river catchments (unique rivers running to the sea), including 629 catchments , but there was no systematic nationwide quality-controlled river temperature data collection until 2015 (Jackson et al., 2016). For management purposes, it is therefore often necessary to predict at unmonitored locations, both within (Hrachowitz et al., 2010; Jackson et al., 2017b; Peterson and Urquhart, 2006) and between catchments (Isaak et al., 2014). In recent years, it has become increasingly common to develop and apply spatial statistical river network models that incorporate network covariance structure to predict spatial variability in river temperature (e.g. Isaak et al., 2014; Jackson et al., 2017b). It is widely acknowledged that these models can dramatically improve predictions of river temperature where sufficient observational data exist, but the covariance component of the predictions cannot typically be transferred between catchments. Despite these important issues, there has not yet been an assessment of the transferability of spatial statistical models between catchments; i.e. the ability of a model developed in one catchment to predict in another. This paper investigates the ability of spatial statistical models to predict at unmonitored locations within and between catchments.
The principles explored in this paper are likely to be relevant to other water temperature metrics so, for brevity, this study focuses on maximum summer temperature, a metric which is prevalent in the recent literature, reflecting its importance for the survival of cold-water-adapted fish (Chang and Psaris, 2013; Hrachowitz et al., 2010; Jackson et al., 2017b; Malcolm et al., 2008; Marine and Cech, 2004).
Models are fitted using two sets of covariates. The first set contains landscape covariates which can be generated from readily available spatial datasets and have been the focus of many previous studies of spatial variability in river temperature (e.g. Hrachowitz et al., 2010). Due to increasing interest in the use of air temperature () to predict spatial variability in water temperature (e.g. Jonkers and Sharkey, 2016), the second set contains a metric of air temperature in addition to landscape covariates.
The paper addresses the objectives to
develop statistical models for predicting maximum summer water temperature from landscape covariates in four separate river catchments,
determine whether models containing an air temperature metric explain more of the variation in maximum summer water temperature than those only containing landscape covariates,
assess the transferability of models containing only landscape covariates or both landscape and air temperature covariates between catchments and
produce single models of maximum summer water temperature for all four catchments using both sets of covariates and consider their potential for transferability at larger (e.g. national) scales.
Methodology
Water temperature data and metric
data were obtained from monitoring sites in four catchments: the Bladnoch in western Scotland, and the Dee (Aberdeenshire), Spey and Tweed in eastern Scotland (Fig. 1). These catchments are Special Areas of Conservation for Atlantic salmon and form part of the Scotland River Temperature Monitoring Network (SRTMN) (Jackson et al., 2016). Details of the network, including design and quality-control procedures, are given in Jackson et al. (2016). The catchments all contain an adequate number of data loggers to develop models on a single-catchment basis, with 59, 34, 25 and 19 sites in the Dee, Tweed, Spey and Bladnoch, respectively. The choice of catchments ensured a broad geographic coverage across Scotland with a wide environmental range of landscape covariates (Jackson et al., 2016).
Study catchments and spatial patterns of for August 2015 (Jackson et al., 2017a). (a) Catchment positions in Scotland, (b) the River Bladnoch catchment, (c) River Spey catchment, (d) River Dee catchment and (e) River Tweed catchment. Some features of this map are based on digital spatial data licensed from the Centre for Ecology & Hydrology, © NERC (CEH) and contains Ordnance Survey data © Crown copyright and database right (2016). Catchment boundaries are from SEPA (2009).
[Figure omitted. See PDF]
Data were collected at 15 intervals throughout August 2015. The maximum temperature was calculated for each day and used to produce a 7-day rolling mean of maximum temperatures. The metric of maximum temperatures used in this study ( was the maximum value of this 7-day rolling mean (Jackson et al., 2017a). This metric was preferred to a single observation of , as it characterises the occurrence of sustained high temperatures which are thought to be most ecologically damaging.
Distributions and inter-relationships between and covariates. Scatter plots of the relationships are shown below the diagonal, kernel density plots of the individual covariates in the diagonal (scaled to have the same maximum value) and correlation coefficients above the diagonal. Numbers in black indicate the correlation coefficients where data are pooled across all catchments.
[Figure omitted. See PDF]
Model covariates
Detailed discussion of the landscape covariates and their calculation can be found in Jackson et al. (2016). In brief, the covariates were elevation (Elevation), upstream catchment area (UCA), percentage riparian woodland (%RW), hillshading/channel illumination (HS), channel width (Width), channel gradient (Gradient), channel orientation (Orientation), distance to coast (DC) and distance to the sea along the river (RDS). Table 1 summarises how the covariates were calculated. Before model fitting, gradient, UCA and width were log transformed to reduce skewness, and HS was centred by subtracting the median value from all observations.
An air temperature metric ( was calculated for each site from the gridded UKCP09 dataset (available from the UK MET Office); see Perry and Hollis (2005a, b) for details of this dataset. Analogous to the calculation of , was given by the maximum of the 7-day rolling mean of daily maximum air temperatures in August 2015.
Figure 2 illustrates the distribution and correlation among covariates included in the single- or multi-catchment models (excluding strongly correlated () covariate pairs; see below for details) for each of the four catchments and for the global (four-catchment) dataset.
Modelling
A total of ten models of were developed: two models for each of the four river catchments using either (1) landscape covariates only (LS models) or (2) landscape covariates and (LS_ models) and two models for all four catchments combined, again using either (1) landscape covariates only (multi-catchment LS model) or (2) landscape covariates and (multi-catchment LS_ model). The modelling process differs slightly between the single and multi-catchment models and these are described in turn. All analysis was done in R version 3.2.3 (R Core Team, 2015).
Covariate calculations. All calculations were in R version 3.2.3 (R Core Team, 2015) except where specified.
Covariate | Process and associated packages | Datasets |
---|---|---|
Elevation | “extract” function in the “raster” package (Hijmans, 2015). | OS. Terrain 10 digital terrain model (DTM) Centre for Hydrology and Ecology, Digital Rivers Network (CEH DRN; Moore et al., 1994) |
Gradient | “extract” function in the “raster” package (Hijmans, 2015) to get elevations of the node and a location 1 upstream. The difference in these elevations divided by the length between the two nodes provided gradient. The length upstream was calculated using “SpatialLinesLengths” from “sp” (Pebesma and Bivand, 2005). | OS. Terrain 10 DTM, CEH DRN (Moore et al., 1994) |
Orientation | Standard trigonometry based on the and locations of the node and associated upstream points 1 upstream lengths. The lengths upstream were calculated using “SpatialLinesLengths” from “sp” (Pebesma and Bivand, 2005). | CEH DRN (Moore et al., 1994) |
Upstream catchment area (UCA) | Arc Hydro Tools (ArcGIS 10.2.1) was used to “burn in” the DRN to the DTM and then calculate a UCA raster. | OS. Terrain 10 DTM; CEH DRN (Moore et al., 1994) |
Hillshading/illumination (HS) | “terrain” and “hillShade” functions in the “raster” package (Hijmans, 2015) were used to create a hillshade layer for every hour the sun was above the horizon. These layers were then summed to create a single layer of maximum potential exposure. HS values for the nodes were an average of the raster grid cells in the 1 river polygon. Raster grid cells were weighted by the proportion of the cell within the buffer. | CEH DRN, OS. Terrain 10 DTM; Solar azimuth and altitude values from the US Naval Observatory Astronomical Applications Department (Anon, 2001) |
Percentage riparian woodland (%RW) | The percentage of woodland in a buffer 50 wide and 1 long (upstream) provided %RW. Areas were calculated using “gArea” from “rgeos” (Bivand and Rundel, 2016) and lengths with the “SpatialLinesLengths” from “sp” (Pebesma and Bivand, 2005). | OS MasterMap, CEH DRN (Moore et al., 1994) |
Width | Width was calculated by finding the area classified as water within the 1 upstream and dividing this by the distance upstream. Areas were calculated using “gArea” from “rgeos” (Bivand and Rundel, 2016) and lengths the “SpatialLinesLengths” from “sp” (Pebesma and Bivand, 2005). | OS MasterMap, CEH DRN (Moore et al., 1994) |
Distance to coast (DC) | “gDistance” from the “rgeos” R package (Bivand and Rundel, 2016). | CEH DRN (Moore et al., 1994), OS Panorama coastline |
River distance to sea (RDS) | “shortest.paths” function from the igraph R package (Csardi and Nepusz, 2006). | CEH DRN (Moore et al., 1994), OS Panorama coastline |
Highest 7-day average maximum August ( | Take the value, from daily maximum predicted matrix, for each cell containing a SRTMN site. Use these daily values to calculate rolling averages, then select the highest for each site. | Gridded UKCP09 predicted dataset (UK MET Office, 2015) |
Single-catchment models
The set of covariates was first reduced to avoid problems of collinearity. If two covariates were strongly correlated (Pearson correlation coefficient of more than 0.8) in any one catchment, one of the covariates was dropped from the set available for modelling for all catchments. This ensured all the LS models were based on a common set of covariates (UCA, %RW, HS, Orientation, DC) as were the LS_ models (, UCA, %RW, HS, Orientation, DC).
The relationship between and the covariates was explored using generalised additive models (GAMs) with Gaussian errors and an identity link (Wood, 2001). A “full” model was first fitted, which included all the available covariates from the reduced dataset and a river network smoother (RNS) (see below). The model can be conveniently written in R formula syntax as where is the number of covariates (, 6 for LS, LS_ models, respectively), (covariate denotes that covariate was fitted as a smoother, and an intercept is included by default. The amount of smoothing was estimated from the data (Wood, 2001), with each smoother constrained to have at most 2 degrees of freedom (df) based on the expected simplicity of responses to the covariates. The RNS is included to account for spatial structure in the data that cannot be explained by the covariates. The RNS is a modified version of that developed by O'Donnell et al. (2014), with the amount of smoothness at a confluence controlled by the proportional influence of upstream tributaries weighted by Strahler river order (Strahler, 1957) and fitted using a set of “reduced rank” basis functions; see Jackson et al. (2017b) for full details. The RNS was allowed up to 7 based on knowledge of RNS complexity for the Spey (Jackson et al., 2017b). To ensure the RNS did not account for variability that would otherwise be explained by covariates, RNS basis functions were excluded if they were strongly correlated () with any of the covariates. Thus, base 1 was removed from the Spey and Dee RNSs due to correlations with DC. In the LS_ models, base 2 was also removed from the Spey RNS due to correlation with . The model was fitted by maximum likelihood using the “mgcv” package (Wood, 2016) in R.
All possible subsets of the full model were then fitted. The final model was that with the lowest Bayesian information criterion (BIC) or Akaike information criterion corrected for small sample size (AICc) that contained no terms significant at the 5 level. The choice of information criterion was based on the desire to penalise more complex models that would be unlikely to transfer well (Millidine et al., 2016). Thus, BIC was used for the Dee and Tweed where there were more sites, and AICc was used for the Bladnoch and Spey where there were fewer sites. Terms in the final model with 1 were replaced by linear terms.
In common with similar modelling studies (Hrachowitz et al., 2010; Imholt et al., 2011; Jackson et al., 2017b; Ruesch et al., 2012), no interactions were considered between covariates due to data constraints.
Multi-catchment models
Covariates were excluded if they were strongly correlated () across the entire multi-catchment dataset. The reduced set of covariates was elevation, UCA, %RW, HS, gradient and orientation for the LS model, and , UCA, %RW, HS, gradient and orientation for the LS_ model. The RNS basis functions were the same as those included in the single-catchment models.
A “starting” model was fitted of the following form: where Catchment is a categorical variable allowing a different mean level for each catchment and RNS : Catchment denotes a separate RNS for each catchment. The covariate smoothers were given a maximum of 2 and the RNS a maximum of 7 for each catchment. The model was then refined in a backwards and forwards stepwise procedure which considered (a) replacing smooth covariate effects by linear terms and then dropping them altogether; (b) dropping the RNS by Catchment term altogether; and (c) adding interactions between the covariates (either linear or smoothed) and Catchment. An interaction between a covariate and Catchment would indicate inter-catchment differences in the relationship between and the covariate, suggesting that the model might not transfer well to new catchments. Interactions between the covariates were not considered. Model selection was based on BIC. Finally, any non-significant terms () in the final model were removed.
Model performance and transferability of single-catchment models
The ability of single-catchment models to predict within the catchment they were developed (the donor catchment) was assessed using leave-one-out cross validation. Each site was removed in turn, the final model was refitted, and then was predicted at the missing site using (a) using all model terms (i.e. the covariates and the RNS if present) and (b) only covariates (i.e. excluding the columns in the model matrix relating to the RNS). The prediction using all model terms should outperform that using only covariates because it incorporates the extra information about spatial structure that is captured by the RNS. However, a RNS from one catchment cannot be used to predict in another because the river networks will differ. The prediction using only covariates therefore provides a benchmark for assessing the transferability of models between catchments, since it measures how well a model will transfer to a catchment that is identical in all but its river network.
Transferability to another catchment (the target catchment) was assessed by using the model from the donor catchment to predict at the monitoring sites in the target catchment. As RNSs cannot be transferred, only covariates were used in the predictions (i.e. the columns in the model matrix due to the RNS were ignored).
Three performance metrics were calculated: root mean square error (RMSE) (Eq. 1), which measures overall performance (accuracy), standard deviation (SD) (Eq. 2), which measures how well a model can predict within-catchment spatial variability (precision), and bias (Eq. 3). where and are the observed and predicted at site , and are the mean observed and predicted in the catchment and is the number of sites in the catchment. Standard deviation was used rather than variance, so that all three metrics are on the same scale and can be compared. Model performance was also illustrated by plotting observed against predicted values and comparing this to a line. Points close to the line indicate precise unbiased predictions, points consistently displaced above or below the line indicate biased predictions, and high scatter about the line indicates imprecise predictions. The consequences of predicting outside of the environmental range of a given model were shown by coding sites as “in” or “out” of range.
LS model and LS_ model for each catchment, with the percent variance explained by the model (all terms) and the same model but with the RNS omitted (covariates).
Catchment | Model | AICc/BIC | df | Percent variance | |
---|---|---|---|---|---|
all terms | covariates | ||||
LS model | |||||
Dee | UCA RNS | 137.0 | 8.8 | 80.0 | 18.3 |
Tweed | (UCA) %RW RNS | 100.1 | 7.6 | 85.6 | 71.7 |
Spey | (UCA) RNS | 69.8 | 6.8 | 85.5 | 21.8 |
Bladnoch | (UCA) | 55.5 | 2.9 | 76.6 | 76.6 |
LS_ model | |||||
Dee | UCA (Orientation) RNS | 131.8 | 11.7 | 85.1 | 19.9 |
Tweed | (UCA) %RW (HS) Orientation | 98.5 | 7.7 | 85.3 | 85.3 |
Spey | UCA DC RNS | 69.3 | 7.4 | 85.1 | 19.9 |
Bladnoch | UCA | 53.9 | 4.8 | 85.9 | 85.9 |
Results
Across Scotland, August 2015 was wetter than the 1981–2010 mean (MET Office, 2016) and this was reflected in relatively low . Rainfall in eastern Scotland (which covers the Spey, Dee and Tweed) was 107 of the 1981–2010 mean, whereas rainfall in western Scotland (which covers the Bladnoch) was only 98 of the 1981–2010 mean (MET Office, 2016). Maximum air temperature was the same in eastern Scotland as the 1981–2010 mean maximum and 0.2 C cooler in western Scotland over the same period (MET Office, 2016).
Figure 1 shows the spatial variability in across the four catchments and Fig. 2 summarises the distribution of by catchment (bottom left diagonal panel). Median in the Dee (15.1 ), Tweed (15.6 ) and Spey (15.6 ) was broadly similar, but median in the Bladnoch (16.4 ) was approximately 1 higher (Fig. 2). The range of was 5.7, 5.9, 6.0 and 5.5 in the Bladnoch, Dee, Spey and Tweed, respectively (Fig. 2).
LS model effects with pointwise 95 confidence bands: (a) Dee UCA, (b) Tweed UCA, (c) Spey UCA, (d) Bladnoch UCA and (e) Tweed %RW.
[Figure omitted. See PDF]
Single-catchment models
All four LS models were simple (Table 2), explained much of the variance in (76.6–85.6 ) and contained similar positive relationships between and UCA (Fig. 3). This relationship was near linear until approximately 100 and then levelled off in the Bladnoch (Fig. 3d), and it was smooth, but near linear in the Spey and the Tweed (Fig. 3b, c) and linear in the Dee (Fig. 3a). The magnitude of the effect was similar across catchments at approximately 4 . Three models contained a RNS, which explained much of the variance: 61.7, 13.9 and 63.7 in the Dee, Tweed and Spey, respectively (Table 2). The Tweed model also had a negative linear effect of %RW.
The LS_ models always had a better BIC/AICc than the corresponding LS models but were typically more complex, always required more and only explained a greater percent of the variance in the Bladnoch and the Tweed (Table 2). For the Tweed, the LS_ model used only covariates, whereas the LS model required a RNS to account for unexplained spatial structure. For the Bladnoch, the LS_ model included UCA and , whereas the LS model only included UCA.
In common with the LS models, UCA was in all the LS_ models (Table 2) and the direction, shape and magnitude of the effects were consistent with the LS models (Fig. 4, top row). was in all the LS_ models except the Spey (Table 2). There was a positive linear relationship between and in the Dee and Tweed (Fig. 4e, f) and a U-shaped response in the Bladnoch which is physically implausible, increasingly so when extended beyond the range of observed in the Bladnoch (Fig. 4g). Orientation had a small positive effect on in both the Dee and Tweed (Fig. 4h, i) with higher temperatures for a north–south orientation than an east–west orientation. There was also a negative linear effect of %RW and a positive smoothed effect of HS in the Tweed and a positive linear effect of DC in the Spey (Fig. 4j, k, l, respectively).
Transferability of LS and LS_ models. The values in normal font are for predictions using only covariates (any RNS information is ignored). The values in italics are for predictions when the target and donor catchments are the same and all model terms are used (both covariates and the RNS).
Donor catchment | Target catchment | ||||
---|---|---|---|---|---|
Dee | Tweed | Spey | Bladnoch | ||
LS models | |||||
RMSE | Dee | 1.3 (0.8) | 1.2 | 1.3 | 2.3 |
Tweed | 1.1 | 1.1 (0.9) | 1.3 | 1.9 | |
Spey | 1.3 | 1.3 | 1.4 (0.9) | 2.5 | |
Bladnoch | 2.2 | 1.9 | 2.4 | 0.9 (0.9) | |
Bias | Dee | 0.6 (0.1) | 0.6 | 0.1 | 2.0 |
Tweed | 0.2 | 0.1 (0.0) | 0.3 | 1.7 | |
Spey | 0.6 | 0.8 | 0.2 (–0.0) | 2.3 | |
Bladnoch | 1.9 | 1.4 | 2.0 | 0.0 (0.0) | |
Standard deviation | Dee | 0.8 (0.8) | 1.1 | 1.3 | 1.2 |
Tweed | 1.1 | 1.0 (0.9) | 1.2 | 1.0 | |
Spey | 1.1 | 1.1 | 1.4 (0.9) | 1.0 | |
Bladnoch | 1.1 | 1.3 | 1.3 | 0.9 (0.9) | |
LS_ models | |||||
RMSE | Dee | 1.7 (0.7) | 0.9 | 1.9 | 1.9 |
Tweed | 1.2 | 0.9 (0.9) | 2.3 | 1.5 | |
Spey | 1.5 | 2.0 | 1.8 (0.8) | 4.2 | |
Bladnoch | 8.4 | 4.2 | 5.2 | 0.9 (0.9) | |
Bias | Dee | 1.1 (0.1) | 0.3 | 0.0 | 1.4 |
Tweed | 0.7 | 0.0 (–0.0) | 0.1 | 1.0 | |
Spey | 0.1 | 1.4 | 0.7 (0.0) | 4.1 | |
Bladnoch | 7.6 | 3.2 | 4.1 | 0.1 (–0.1) | |
Standard deviation | Dee | 1.3 (0.7) | 0.9 | 1.9 | 1.2 |
Tweed | 1.0 | 0.9 (0.9) | 2.3 | 1.1 | |
Spey | 1.5 | 1.4 | 1.6 (0.8) | 1.1 | |
Bladnoch | 3.7 | 2.7 | 3.3 | 0.9 (0.9) |
LS_ model effects with pointwise 95 confidence bands. Each column corresponds to a catchment and each row to a covariate. (a) Dee UCA, (b) Tweed UCA, (c) Spey UCA, (d) Bladnoch UCA, (e) Dee , (f) Tweed , (g) Bladnoch , (h) Dee Orientation, (i) Tweed Orientation, (j) Tweed %RW, (k) Tweed HS and (l) Spey DC.
[Figure omitted. See PDF]
LS model transferability. Panels (a), (b), (c) and (d) show predicted when the target catchment is the Dee, Tweed, Spey and Bladnoch, respectively. The colours and symbols indicate the donor catchment: Dee (red circles), Tweed (orange triangles), Spey (dark blue squares) and Bladnoch (light blue diamonds). Filled (open) symbols indicate sites in (out) of the environmental range of the donor catchment. When the target and donor catchments are the same, the coloured points are based on predictions using only covariates; the grey symbols show the corresponding predictions based on the covariates and the RNS. The dashed lines are robust regression lines of observed against predicted values. Models which transfer well have points falling close to the line.
[Figure omitted. See PDF]
LS_ model transferability. Panels (a), (b), (c) and (d) show predicted when the target catchment is the Dee, Tweed, Spey and Bladnoch, respectively. The colours and symbols indicate the donor catchment: Dee (red circles), Tweed (orange triangles), Spey (dark blue squares) and Bladnoch (light blue diamonds). Filled (open) symbols indicate sites in (out) of the environmental range of the donor catchment. When the target and donor catchments are the same, the coloured points are based on predictions using only covariates; the grey symbols show the corresponding predictions based on the covariates and the RNS. The dashed lines are robust regression lines of observed against predicted values. Models which transfer well have points falling close to the line.
[Figure omitted. See PDF]
Multi-catchment LS model effects with pointwise 95 confidence bands: (a) UCA, (b) %RW, (c) Elevation and (d) Catchment.
[Figure omitted. See PDF]
Multi-catchment LS_ model effects with pointwise 95 confidence bands: (a) Dee , (b) Tweed , (c) Spey , (d) Bladnoch , (e) UCA, (f) %RW and (g) Catchment.
[Figure omitted. See PDF]
Transferability of single-catchment models
The transferability of the LS and LS_ models is summarised by their RMSE, bias and standard deviation in Table 3 and illustrated in Figs. 5 and 6, respectively. All the models performed well within catchments (i.e. in the catchments where they were developed) when all model terms (i.e. both covariates and the RNS) were used in the predictions, with a bias of in absolute value and a RMSE of . The LS_ models always had a lower RMSE than the LS models. As expected, within-catchment predictions were poorer when only the covariates were used (excluding RNS), with a median RMSE of 1.2 and a maximum RMSE of 1.8 .
The rest of this section focuses on the predictions, both within and between catchments, using only the covariates. For the catchments in eastern Scotland (Dee, Tweed and Spey), the RMSE, bias and standard deviation of any model were broadly similar whether they were used to predict for the donor catchment or for the other two target catchments. The RMSE of the LS models tended to be lower than that of the LS_ models (median 1.3 and 1.7 , respectively). The LS and LS_ models both had median absolute biases of 0.3 and median standard deviations of 1.1 and 1.4 , respectively. RMSE is a combination of bias and standard deviation, so the RMSE of both sets of models was generally dominated by the standard deviation.
Predictions involving the Bladnoch, either as a donor or target catchment, tended to be poor. The Bladnoch is in western Scotland and was warmer than the other catchments (Fig. 2). The Bladnoch models always overpredicted in the other catchments and the Dee, Tweed and Spey models all underpredicted in the Bladnoch (Figs. 5, 6). This often led to substantial bias and hence RMSE. The Bladnoch LS_ model had the largest biases, which were also due to the implausible relationship with (Fig. 4g). The Dee, Tweed and Spey had reasonable standard deviations when transferred to the Bladnoch (median 1.0 and 1.1 for the LS and LS_ models, respectively) which suggests that, despite having poor RMSE, the models still could be used to predict areas of relatively high or low within the Bladnoch (rather than absolute values of . The same is true of the Bladnoch LS models when transferred to the Dee, Tweed and Spey (median standard deviation 1.3 ). However, the Bladnoch LS_ model had a high standard deviation (median 3.3 ) when transferred to the Dee, Tweed and Spey, again due to the implausible relationship with .
Multi-catchment LS and LS_ models, with the percent variance explained by the model (all terms) and when the RNS is omitted (covariates).
Model | BIC | df | Percent variance | |
---|---|---|---|---|
all terms | covariates | |||
Multi-catchment LS model | ||||
Catchment (UCA) %RW Elevation RNS : Catchment | 379.3 | 24.8 | 84.4 | 51.9 |
Multi-catchment LS_ model | ||||
Catchment : Catchment (UCA) %RW RNS : Catchment | 395.4 | 25.7 | 83.2 | 57.2 |
Multi-catchment models
The multi-catchment LS model included Catchment, UCA, %RW, Elevation and a RNS for each catchment (Table 4). By fitting a single model to all four catchments, it was possible to assess whether covariate effects were consistent across catchments and thus transferable to new catchments or regions. None of the covariates interacted with Catchment. The Catchment effect indicates inter-catchment differences in mean having accounted for the landscape covariates; in particular, higher in the Bladnoch (Fig. 7d). In common with the single-catchment LS models, there was a positive smooth relationship between and UCA with an effect size of approximately 3 (Fig. 7a). There was also a negative linear relationship between and both %RW and Elevation, with effect sizes of approximately 1 and 2 , respectively. The model explained 84.4 of the variance, comparable to the single-catchment LS models. The RNSs explain less of the variance than in the single-catchment models (Tables 3, 4).
The multi-catchment LS_ model explained 83.2 of the variance and contained Catchment, UCA, %RW, and a RNS for each catchment (Table 4, Fig. 8). None of the landscape covariates interacted with catchment. However, the relationship did interact with catchment (Fig. 8a–d), with positive relationships in the Dee and Tweed, and negative (albeit non-significant) relationships in the Spey and Bladnoch. This suggests that relationships with are non-transferable and would not be a good predictor of in new catchments.
Discussion
Even with the introduction of relatively cheap and accurate data loggers, it is not financially or logistically possible to monitor everywhere. Consequently, there is a need to develop models to understand and predict river temperatures at large spatial scales to inform evidence-based management of rivers and fisheries even where extensive local temperature data collection does not exist. Spatial statistical models offer great promise in this respect. However, to date, the transferability of these models has not been considered. This study fitted separate models of to data from four catchments and transferred these models between catchments. Models containing only landscape covariates typically contained similar covariates and covariate responses, and performed better than models containing when transferred between catchments. A physically implausible model transferred particularly poorly. The covariates alone often explained much less of the spatial temperature variability than when a RNS was added but provided the only means of predicting temperature in new catchments with no or limited data (a minimum of 19 loggers was required to fit the full models including covariates and RNS). A single model fitted to all four catchments suggested common responses to landscape covariates but inter-catchment differences in mean temperature and in the relationships between and . These findings are discussed in more detail below.
responses to landscape covariates
The single-catchment LS models contained similar covariates with comparable effect sizes and response shapes which suggested that transferability between catchments could be reasonably successful. This was confirmed by the lack of significant interactions with Catchment in the multi-catchment model. However, when there are inter-catchment differences in mean temperature, the models might only be good predictors of relative values of within a new catchment (i.e. areas of higher or lower rather than absolute values. It is also unclear how well the models would perform in years with differing hydroclimatic characteristics. This study was conducted in a single year with relatively low temperatures and high flows. In a hotter, drier year it might be expected that between-site differences would be greater. Under such circumstances, the current models may not provide accurate predictions of absolute temperatures or inter-site differences without refitting.
All of the responses to landscape covariates (across all models) were physically plausible and hence broadly transferable (Smith et al., 2016). UCA (which was in all the models) is a proxy for discharge, water volume and thermal capacity (Chang and Psaris, 2013; Hannah et al., 2008). Higher UCAs are generally associated with larger water volumes which have a greater thermal capacity, taking longer to warm but also retaining heat for longer (Chang and Psaris, 2013; Imholt et al., 2011). Elevation reflects adiabatic lapse rates which reduce temperatures with increasing altitude (Hrachowitz et al., 2010; Jackson et al., 2017b). The negative relationship between and %RW woodland occurs because riparian shading reduces the amount of incident shortwave radiation reaching the river during daylight hours (Garner et al., 2014; Hannah et al., 2008; Moore et al., 2005). The positive relationship between and HS is consistent with greater in locations with lower shading effects and greater direct shortwave contributions (illumination). was greatest in channels characterised by a north–south orientation which typically experience maximum exposure to incoming radiation (Malcolm et al., 2004). Increasing with distance from the coast reflected continentality and the differing specific heat capacities of land and sea, specifically thermal buffering of relatively cooler sea during summer months (Chang and Psaris, 2013; Hrachowitz et al., 2010).
– relationships
In contrast to the LS models, one LS_ model included a physically implausible relationship that would not be expected to transfer well (Smith et al., 2016). Specifically, this was an inverse modal relationship between and in the Bladnoch model. This relationship could have arisen because of spatial variability in local (temporal) – relationships due to controls that were not captured by our covariates, coupled with a relatively small air temperature range (1.7 ) and sample size (19 sites). Nevertheless, even where the relationships between and were plausible, they were inconsistent between catchments in terms of effect size and direction, as indicated by the varying responses in the single-catchment models and the interaction with Catchment in the multi-catchment model. For example, in the latter, the relationships were positive in the Dee and Tweed, and negative in the Spey and Bladnoch (where the relationship simplifies to linear).
Given the number of previous studies that have predicted from within sites over time (temporal models), between sites (spatial models) or both (e.g. two-stage spatiotemporal models), it may appear surprising that models containing gave poorer predictions of between-catchment temperature variability than those containing landscape covariates alone in this study. However, previous spatial models of incorporating air temperature as a predictor (e.g. Wehrly et al., 2009; Moore et al., 2013) have focused on the ability of these models to predict within the data space (interpolate), while this study investigated the ability of models to predict outside of the data space (extrapolate). Indeed, within our multi-catchment model, it would have been possible to force a single – relationship that reflected an average response across catchments. However, this would result in biased estimates of within individual catchments.
The ability of to predict spatial variability in is likely to degrade where the temporal relationships between and vary spatially, within and between catchments (Kelleher et al., 2012; Segura et al., 2015). It is expected that within-catchment (between-site) variability in the temporal relationships between and would add noise to any spatial relationships, making them harder to detect and reducing the overall precision of any predictions. Systematic differences in – relationships between catchments would result in biased predictions when models are transferred between rivers or regions.
Many studies have shown that within-year temporal relationships between and (often termed thermal or climate sensitivity) can be highly variable between sites and catchments, and importantly this variability relates to regional, hydrological and landscape controls (Tague et al., 2007; Kelleher et al., 2012; Krider et al., 2013; Chang and Psaris, 2013; Hilderbrand et al., 2014; Segura et al., 2015; Mauger et al., 2017). For example, Fellman et al. (2014) observed slopes of between 0.180 and 1.282 across nine watersheds in Alaska depending on glacial influence, while Mauger et al. (2017) observed slopes of between 0.32 and 1.51 across 48 non-glacial streams in Alaska which related to mean elevation and catchment area. Similarly, Tague et al. (2007) observed systematic regional differences in – relationships in western Oregon that depended on local hydrogeology and concluded that under such circumstances air temperature alone (i.e. consistent with a single air temperature coefficient) would be unlikely to explain spatial variation in river temperatures. Given the reported spatial variability in – relationships and, importantly, that these relationships can vary systematically within and between catchments depending on other controls (e.g. hydrogeology), it is unsurprising that models containing do not substantially improve predictions of the spatial variability in than models containing landscape variables alone, and that transferred models result in biased predictions. If is to substantially improve predictions of in static spatial models (such as those presented in this study), then it is likely that they would need to include greater model complexity, e.g. allowing for interactions between and landscape covariates (e.g. Mayer, 2012).
The importance of RNS
The performance of the single-catchment LS and LS_ models in this study compared favourably to regional models of (Moore et al., 2013; Roberts et al., 2013; Wehrly et al., 2009) when predictions were made for the catchment in which the models were developed (i.e. interpolation). For the models that included a RNS, RMSE (0.7–0.9 ) was approximately half that reported by previous studies, although it should be noted that these studies were conducted at considerably larger spatial scales (Moore et al., 2013; Roberts et al., 2013; Wehrly et al., 2009). The RMSE of models without a RNS (0.9–1.8 ) was generally similar or slightly better than reported by other studies.
The landscape covariates included in the models in this study explained large (catchment)-scale trends in but were less successful at explaining variability at finer spatial scales. For example, the approximately 20 variance explained by UCA in the Spey and Dee models is consistent with the 18–25 of variability explained by discharge in Arora et al. (2016). Smaller-scale variability tends to reflect drivers such as water residence time (and heat advection), water sources (Brown et al., 2006; Brown and Hannah, 2008), channel incision, gradient (Jackson et al., 2017b) and land use (Imholt et al., 2013) which are harder to accurately characterise from spatial datasets. In the absence of accurate local-scale characterisation of landscape controls, smaller-scale spatial variability is modelled by the RNS. However, whilst the RNS improves predictions within catchments, it is not transferable, so it does nothing to help predictions between catchments.
Extending predictions
The inclusion of the Catchment main effect in both multi-catchment models showed differences in mean between catchments (that were not accounted for by the covariates). This sometimes led to substantial bias when transferring single-catchment models to new catchments. Accounting for between-catchment differences in mean will be necessary to improve between-catchment predictions of . The multi-catchment models in this study used a simple categorical variable to allow the intercept (and hence mean ) to differ between catchments. However, to predict to new catchments, it would be necessary to extend the modelling approach so that the intercept can be predicted from surrounding catchments. One approach could be to allow the intercept to vary smoothly between catchments using a Gaussian Markov random field (Cressie, 1993), so the intercept in unmonitored catchments could be estimated from nearby monitored catchments. This approach has been developed in other contexts (Millar et al., 2015, 2016) and offers promise in the context of large-scale modelling.
An alternative approach could involve modelling as a function of over shorter time periods (days or weeks) and then allowing this relationship to interact with landscape covariates or location. Such an approach could have additional benefits, allowing the inclusion of temporally incomplete data (e.g. Letcher et al., 2016) or data from temporally inconsistent locations. Where sufficient resources were available it may be possible to supplement the existing network with sites that are monitored for shorter time periods to expand spatial coverage, although the consequences of such deployments for assessing inter-annual temperature variability would need to be investigated. Finally, the development of spatiotemporal models, where temporal variability was driven by or discharge, could potentially allow for forecasting or hindcasting of river temperature which was not possible using the approaches presented in this paper.
Conclusions and future work
This study demonstrated that landscape covariates can explain broad-scale patterns in and that such relationships are transferable between catchments, at least to predict relative levels of . It was necessary to use a RNS to characterise and predict finer-scale spatially correlated variation, so predictions of spatial temperature variability were better within catchments than between catchments. was not a transferable predictor of and could result in poor predictions when the relationship was implausible or transferred outside the range observed in the donor catchment. It would be unwise to use a – relationship to predict spatial variability in without also including meaningful (process-relevant) interactions between and landscape covariates, something that was not possible in this study due to data constraints.
Mean also varied between catchments (having adjusted for the landscape covariates). Future work that looks to predict to new catchments should investigate how to understand and predict these between-catchment differences. A large-scale correlated spatial smoother (e.g. regional effect) offers potential in this respect. Finally, some of the local-scale processes represented in this study (e.g. effect of riparian shading) may benefit from improved characterisation using finer-scale spatial datasets or remotely sensed data. Improved process representation could lead to both better within- and between-catchment model predictions.
The Digital River Network (DRN) used in this study is a
commercial product licensed from the Centre for Ecology and Hydrology, ©NERC.
Data derived from the DRN are therefore also subject to licensing
constraints. As such, these data cannot be made available publicly. It is
possible to view the DRN through the CEH WMS server:
IAM and DMH secured funding for the project. The authors conceived the study. FLJ carried out the data analysis with support from IAM and RJF. FLJ, IAM, RJF and DMH interpreted the results and prepared the paper.
The authors declare that they have no conflict of interest.
Acknowledgements
This project is part of Marine Scotland Science (MSS) Freshwater Fisheries
Laboratories Service Level Agreement FW02G and it is supported by a NERC
Open CASE studentship (University of Birmingham and MSS) awarded to Faye Jackson; grant reference NE/K007238/1. The authors also thank the Scotland
River Temperature Monitoring Network collaborators for their contribution to
data collection (
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2017. This work is published under https://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
There has been increasing use of spatial statistical models to understand and predict river temperature (
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details

1 School of Geography, Earth and Environmental Science, University of Birmingham, Birmingham, B15 2TT, UK; Marine Scotland Science, Scottish Government, Freshwater Laboratory, Faskally, Pitlochry, PH16 5LB, UK
2 Marine Scotland, Marine Laboratory, 375 Victoria Road, Aberdeen, AB11 9DB, UK
3 School of Geography, Earth and Environmental Science, University of Birmingham, Birmingham, B15 2TT, UK
4 Marine Scotland Science, Scottish Government, Freshwater Laboratory, Faskally, Pitlochry, PH16 5LB, UK