Hydrologic extremes – an intercomparison of multiple gridded statistical downscaling methods


Hydrol. Earth Syst. Sci., 20, 1483–1508, 2016 www.hydrol-earth-syst-sci.net/20/1483/2016/ doi:10.5194/hess-20-1483-2016 © Author(s) 2016. CC Attribution 3.0 License.

Arelia T. Werner1 and Alex J. Cannon2

1Pacific Climate Impacts Consortium, Victoria, British Columbia, Canada

2Climate Research Division, Environment and Climate Change Canada, Victoria, British Columbia, Canada

Correspondence to: Arelia T. Werner ([email protected])

Received: 22 April 2015 – Published in Hydrol. Earth Syst. Sci. Discuss.: 26 June 2015 – Revised: 4 March 2016 – Accepted: 17 March 2016 – Published: 19 April 2016

Abstract. Gridded statistical downscaling methods are the main means of preparing climate model data to drive distributed hydrological models. Past work on the validation of climate downscaling methods has focused on temperature and precipitation, with less attention paid to the ultimate outputs from hydrological models. Also, as attention shifts towards projections of extreme events, downscaling comparisons now commonly assess methods in terms of climate extremes, but hydrologic extremes are less well explored. Here, we test the ability of gridded downscaling models to replicate historical properties of climate and hydrologic extremes, as measured in terms of temporal sequencing (i.e. correlation tests) and distributional properties (i.e. tests for equality of probability distributions). Outputs from seven downscaling methods – bias correction constructed analogues (BCCA), double BCCA (DBCCA), BCCA with quantile mapping reordering (BCCAQ), bias correction spatial disaggregation (BCSD), BCSD using minimum/maximum temperature (BCSDX), the climate imprint delta method (CI), and bias corrected CI (BCCI) – are used to drive the Variable Infiltration Capacity (VIC) model over the snow-dominated Peace River basin, British Columbia. Outputs are tested using split-sample validation on 26 climate extremes indices (ClimDEX) and two hydrologic extremes indices (3-day peak flow and 7-day low flow). To characterize observational uncertainty, four atmospheric reanalyses are used as climate model surrogates and two gridded observational data sets are used as downscaling target data. The skill of the downscaling methods generally depended on reanalysis and gridded observational data set. However, CI failed to reproduce the distribution and BCSD and BCSDX the timing of winter 7-day low-flow events, regardless of reanalysis or observational data set. Overall, DBCCA passed the greatest number of tests for the ClimDEX indices, while BCCAQ, which is designed to more accurately resolve event-scale spatial gradients, passed the greatest number of tests for hydrologic extremes. Non-stationarity in the observational/reanalysis data sets complicated the evaluation of downscaling performance. Comparing temporal homogeneity and trends in climate indices and hydrological model outputs calculated from downscaled reanalyses and gridded observations was useful for diagnosing the reliability of the various historical data sets. We recommend that such analyses be conducted before such data are used to construct future hydro-climatic change scenarios.

Published by Copernicus Publications on behalf of the European Geosciences Union.

1 Introduction

Water resources infrastructure is designed to accommodate hydrologic extremes such as floods and droughts (Cunderlik and Ouarda, 2009; Cunderlik et al., 2004; Ouarda et al., 2006). The frequency and magnitude of extreme hydrologic events such as floods and droughts have changed with climate and there is broad agreement that changes will continue with projected increases in greenhouse gases (IPCC, 2013). The direction and magnitude of change are not uniform across the globe, but are regionally specific, distinguishable by hydrologic regime and by local changes to temperature and precipitation (Cunderlik and Ouarda, 2009; Monk et al., 2011; Sheffield et al., 2012; Stahl et al., 2010, 2012). For example, in Canada, floods in snowmelt-dominated regimes decreased in magnitude, while floods in rainfall-fed regimes had no significant trend over 1974 to 2003 (Cunderlik and Ouarda, 2009). Conversely, Canadian annual low-flow indices showed spatially uniform decreases over 1970 to 2005 (Monk et al., 2011). Thus, future changes in hydrologic extremes need to be estimated at regionally relevant resolutions (~10 km) and consider both temperature and precipitation effects.

Global climate models (GCMs) are among the few tools available for projecting the future climate, but they operate at scales too coarse (~100 km) for use in regional studies. Hence, before projecting changes in hydrologic extremes, some intervening steps are required. Approaches to converting coarse-scale GCM simulations into projected changes in peak flows and low flows vary. Some examples include direct downscaling of streamflow extremes by sparse Bayesian learning and multiple linear regression (Joshi et al., 2013), weather generators combined with hydrologic models (Cunderlik and Simonovic, 2007), regional frequency analysis of regional climate model (RCM) projections (Clavet-Gaumont et al., 2013), and, most commonly, statistical downscaling of GCM or RCM projections run through a physically based hydrologic model (Elsner et al., 2010a; Maurer et al., 2010; Schnorbus et al., 2014; Shrestha et al., 2012; Bürger et al., 2011). The uncertainty in hydrologic projections from GCMs is greater than that from emissions scenarios or model parameterizations (Bennett et al., 2012; Prudhomme and Davies, 2008) and all GCMs represent the climate imperfectly in different ways (Gleckler et al., 2008; Knutti et al., 2008); therefore, to fully characterize the uncertainty in projected hydrological extremes, an ensemble of GCMs is required.

Gridded statistical downscaling methods provide a computationally efficient and effective means of producing plausible hydro-climatology from a large ensemble of GCMs (Salathé et al., 2007; Salathé, 2005; Wood et al., 2004). A number of studies have compared multiple statistical downscaling methods for use in climatological or hydrological projections. Maurer and Hidalgo (2008) compared constructed analogues (CA) and bias correction spatial disaggregation (BCSD) using the National Centers for Environmental Prediction/National Center for Atmospheric Research Reanalysis I (NCEP1) (Kalnay et al., 1996) as a surrogate GCM. Methods were comparable in producing precipitation and temperature at a monthly and seasonal level, but skilfully downscaled daily data depended on the ability of the climate model to show daily skill. Bürger et al. (2012a) compared five methods for their ability to represent climatic extremes including BCSD and expanded downscaling (XDS). The fixed diurnal temperature range in BCSD was seen as a shortcoming in Bürger et al. (2012a). XDS performed best, passing 48 % of single tests on average for 27 Climate Indices of Extremes (ClimDEX), with BCSD close behind, passing 45 % (Bürger et al., 2012a). Pierce et al. (2013) found that disagreement over whether annual precipitation in California is projected to increase or decrease was due to differences in the occurrence of the heaviest precipitation days (> 60 mm day⁻¹) amongst three dynamical and two statistical downscaling methods (BCCA and BCSD). Maurer et al. (2010) compared BCSD, BCCA, and CA for their ability to reproduce hydrologic extremes. BCCA, when combined with the Variable Infiltration Capacity (VIC) model, consistently outperformed the other methods in simulating 3-day peak flow and 7-day low flow. BCCA is an improvement over CA because it includes bias correction and over BCSD because it includes daily GCM anomalies (Maurer et al., 2010). An additional method, described as statistical downscaling and bias correction (Abatzoglou and Brown, 2012) and as asynchronous regression (Gutmann et al., 2014), both of which interpolate from the GCM to a fine scale and then apply quantile mapping bias correction (i.e. basically reversing the steps of BCSD), was found to reproduce extreme precipitation events at the grid scale but overestimate them on aggregate scales (Maraun, 2013). Studies to date have not assessed the strength of downscaling methods for use with climatic and hydrologic extremes concurrently.

The first-generation NCEP1 reanalysis (Kalnay et al., 1996) is often used as a surrogate GCM when testing downscaling techniques (Bürger et al., 2012a; Gutmann et al., 2014; Maurer et al., 2010), primarily because of its long record length. Recently, new reanalysis products have come online, bringing to light possible issues with NCEP1, such as a spurious pattern in precipitation fields at high latitudes (Sheffield et al., 2012), and lack of skill in producing daily air temperature at high altitudes versus other reanalyses (Hofer et al., 2012). Reanalyses differ due to variations in assimilated observational data, assimilation methods, representations of surface and boundary layer processes, physics packages, and dynamical cores, and the resulting uncertainty in output fields can be considerable, especially for climatic extremes (Sillmann et al., 2013a). For instance, discrepancies between reanalyses for some climate extreme indices, such as frost days in some regions, are sometimes as large as the typical inter-model spread of the Coupled Model Intercomparison Project ensembles (Sillmann et al., 2013a). These differences arise because near-surface temperature and precipitation extremes are calculated from variables that are relatively poorly constrained by observations in reanalyses. Additionally, non-stationarity exists in some reanalysis products because they amalgamate observational data sets from different sources over time (Donat et al., 2014). In the context of historical validation of downscaling methods, statistical downscaling methods may perform poorly simply because reanalysis outputs are not stationary over the calibration and validation periods (Maurer et al., 2013). All of these factors suggest that multiple reanalysis products should be used as GCM surrogates to ensure methods are not failing due to irreparable errors in reanalyses, and also to explore the variability in results due to reanalysis uncertainty.


Gridded climate observations underpin hydrologic projections. They are used to calibrate the downscaling technique and the hydrologic model, serving as targets and inputs, respectively. Gridded observations are commonly evaluated via comparison with station observations (Hutchinson et al., 2009; Werner et al., 2015), intercomparison with other gridded observations (Eum et al., 2014), or by using them to drive a hydrologic model and comparing outputs to observed water balance fluxes and streamflow over large basins (Livneh et al., 2013; Maurer et al., 2002). Statistical downscaling methods are known to perform poorly when non-stationarity occurs between the calibration and validation periods (Maurer et al., 2013), but it has not been evaluated how apparent non-stationarity caused by natural climate variability (Huang et al., 2014; Maraun, 2012) is amplified or diminished by the methods used to create gridded observations, which could also affect the success of downscaling methods. Furthermore, stationarity in mean annual precipitation and temperature does not dictate stationarity in climatic or hydrologic extremes. Some previous studies have included as many years as possible in the calibration, with the goal of maximizing the historical record available for resampling in the temporal disaggregation step applied in BCSD (Bürger et al., 2012a; Salathé, 2005; Werner, 2011). This approach is also supported by other studies that found bias correction to be more robust for larger samples from longer time series, especially for extremes such as flood events (Huang et al., 2014; Themeßl et al., 2011). The pros and cons of this extended calibration period have not been fully evaluated. This investigation will help the hydrologic modelling community build a better evaluation system for gridded observations to ensure their strength not only for projections of mean monthly changes over large basins (~100 000 km²), but also for extremes in basins as small as 500 km².

When used to make climate change projections, distributed hydrologic models such as VIC are best driven with gridded daily data, which are usually produced via gridded statistical downscaling techniques such as BCSD, CA, and BCCA, three gridded methods that have been tested to date. Applying BCSD using minimum and maximum monthly temperature instead of mean monthly temperature has not been tested and may correct some issues with diurnal temperature range (Bürger et al., 2012a). Note that the effect of BCSD on diurnal temperature range (DTR) when applied to daily data, and ways to ensure that minimum temperature remains below maximum temperature, were tested by Thrasher et al. (2012) and are not the focus of this study. A few other methods have been developed recently that warrant investigation. These include double bias corrected constructed analogue (DBCCA), which is similar to BCCA but applies a second quantile mapping bias correction as a post-processing step to correct drizzle and other residual biases (Maurer et al., 2010). Additionally, the climate imprint delta method (CI) (Hunter and Meentemeyer, 2005) and the reverse BCSD (similar to SDBC in Ahmed et al., 2013, and AR in Gutmann et al., 2014), which we refer to as bias corrected climate imprint (BCCI) due to its use of CI for interpolation, have not been explored for their applicability to hydrology. A recently developed hybrid of BCCA and BCCI, referred to as BCCAQ (Cannon et al., 2015; Murdock et al., 2014), has the potential to be an improvement over other gridded statistical downscaling techniques and has not been tested with hydrologic extremes. This work will also help to inform use of the resulting BCSD and hydrologic model output provided by the Pacific Climate Impacts Consortium (PCIC; http://www.pacificclimate.org/data). Finally, PCIC also makes available Canada-wide downscaled climate change projections using both the BCSD and BCCAQ methods (http://www.pacificclimate.org/data). This study provides the first rigorous intercomparison of these two methods.

The ClimDEX indices are recommended by the World Meteorological Organization Expert Team on Climate Change Detection and Indices (ETCCDI) (Zhang et al., 2011) as a means of summarizing daily temperature and precipitation statistics, focusing particularly on aspects of climate extremes. They have been developed to allow seamless comparison of climate conditions on an international basis. There are many projects applying the ETCCDI indices to detect changes in extremes historically (e.g. Sillmann et al., 2013a), to project future changes (e.g. Sillmann et al., 2013b), and to provide future changes via data portals to allow local analysis (http://www.cccma.ec.gc.ca/data/climdex/). Two commonly investigated hydrologic extremes include 3-day peak flow, which represents potential flood conditions, and 7-day low flow, which represents potential drought conditions (e.g. Maurer et al., 2010). Floods can be damaging to river and floodplain infrastructure, while droughts can be detrimental for human water use and aquatic habitat. We follow the framework developed by Bürger et al. (2012a), evaluating methods for their abilities in producing the temporal sequencing and distributional properties of climate indices and hydrologic extremes.
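To make the two hydrologic indices concrete, the following is a minimal sketch of how a 3-day peak flow and a 7-day low flow might be derived from one year of daily discharge. It assumes that each index is the extreme of a moving average over consecutive days, which is a common convention but is not spelled out in the text; the function names and the sample discharge values are hypothetical.

```python
def rolling_means(flows, window):
    """Return the mean of every run of `window` consecutive daily flows."""
    if window < 1 or window > len(flows):
        raise ValueError("window must be between 1 and len(flows)")
    total = sum(flows[:window])
    means = [total / window]
    for i in range(window, len(flows)):
        # Slide the window forward by one day.
        total += flows[i] - flows[i - window]
        means.append(total / window)
    return means


def peak_flow_3day(flows):
    """Largest 3-day mean flow in a series (assumed definition)."""
    return max(rolling_means(flows, 3))


def low_flow_7day(flows):
    """Smallest 7-day mean flow in a series (assumed definition)."""
    return min(rolling_means(flows, 7))


# Hypothetical daily discharge values (m3/s), for illustration only.
daily = [10, 12, 30, 60, 45, 20, 11, 9, 8, 8, 7, 7, 9, 10]
print(peak_flow_3day(daily))
print(low_flow_7day(daily))
```

In practice the functions would be applied year by year (or season by season, for winter low flows) to build the annual series that are then compared between downscaled and observed simulations.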

The objectives of this study are the following.

1. To compare several reanalyses in the study region against two gridded observation data sets.

2. To test the ability of the BCCA, DBCCA, BCCI, CI, BCSD (mean temperature), BCSDX (minimum and maximum temperature), and BCCAQ downscaling techniques to simulate 26 ClimDEX indices using four reanalyses and two gridded observations.

3. To test the ability of the BCCA, DBCCA, BCCI, CI, BCSD (mean temperature), BCSDX (minimum and maximum temperature), and BCCAQ downscaling techniques, when used to force the VIC hydrologic model, to simulate 3-day peak-flow and 7-day low-flow indices using four reanalyses and two gridded observations.


4. To learn more about the strengths and weaknesses of two gridded observations for use with hydrologic modelling.

5. To see whether the strength of a method to downscale for climate extremes relates to abilities for use with hydrologic extremes.

2 Study area

The Peace River basin is the focus of this work. Its snow-dominated regime makes the findings applicable to many mid-latitude areas. The basin is located in interior north-eastern BC and encompasses the 101 000 km² drainage area upstream of Taylor, BC (Fig. 1). Elevations range from 400 to 2800 m. The region is highly influenced by the Pacific Ocean and Arctic air masses. The region has a continental climate (Demarchi, 1996), with monthly average temperatures ranging from −12.0 °C in January to 12.3 °C in July, averaging 0.2 °C. Precipitation follows a seasonal pattern of summer maximum and spring minimum. The Peace River has a nival regime, with approximately 54 % of the annual precipitation (440 mm) falling as snow (mostly during October–April) and 64 % of the natural streamflow occurring during the freshet months of May–July. Low flows occur during the winter and early spring in headwater (INGEN) and downstream (BCGMS) basins (Fig. 2). Due to the topographical complexity and strong climate gradients, this region provides a stringent test of downscaling techniques. Additionally, the Peace River basin is the focus of two studies that explore uncertainty in hydrologic projections, one due to GCMs, emissions scenarios, and parameter sets (Bennett et al., 2012), the other due to statistically versus dynamically downscaled GCMs (Shrestha et al., 2014a). This study provides a good complement to these by exploring new sources of uncertainty in the same basin.

3 Methods

3.1 Gridded observations

Two daily, gridded observational data sets were available over the study area. The first was generated for BC for application with the Variable Infiltration Capacity (VIC) macro-scale distributed hydrologic model following the methods of Maurer et al. (2002) and Hamlet and Lettenmaier (2005). Daily gridded surfaces of minimum and maximum temperature and daily precipitation accumulation were produced at a spatial resolution of 1/16°, which is ~6 km depending on latitude, for January 1950 to December 2005. Station data were contributed from multiple networks including those of Environment Canada, BC Ministry of Forests, Lands and Natural Resource Operations, BC Hydro, and the US National Weather Service Co-operative Observer Program, each with a varying range of quality control. Stations were interpolated to grids using the SYMAP inverse-distance weighting algorithm (Shepard et al., 1984). The raw gridded fields were temporally homogenized to remove interpolation artefacts introduced by using a temporally varying mix of stations and corrected for topographic effects using ClimateWNA, a 1961–1990 PRISM-based high-resolution climatology for western North America (Daly et al., 1994; Wang et al., 2006). This data set is referred to as VIC Forcings.

Figure 1. The Peace River basin (above Taylor, BC) study area analysed for ClimDEX indices (black boundary) and the five sub-basins investigated for hydrologic extremes, including the Finlay River above the Akie River (FINAK), the Ingenika River above the Swannell River (INGEN), the Parsnip River above the Misinchinka River (PARMS), the Peace River above the Pine River (PEAPN), and the Peace River at Bennett Dam (BCGMS).

Figure 2. Annual daily hydrograph 1985–1995 for the (top) Ingenika and (bottom) BCGMS hydrometric sites. Panels: INGEN and BCGMS; x axis: day of year (October to September); y axis: discharge (m³ s⁻¹); shading: 5th–25th and 75th–95th percentiles, 25th–75th percentiles; line: median.

The second data set was created for all of Canada using the Australian National University Spline (ANUSPLIN) implementation of trivariate thin plate smoothing splines (Hutchinson et al., 2009). The Canada-wide ANUSPLIN observational data set was created at a 1/12° grid spacing (~10 km) for daily minimum temperature, maximum temperature, and precipitation amounts for the period 1950–2010 by Hopkinson et al. (2011) and McKenney et al. (2011). Station data from Environment Canada observing sites were interpolated onto the high-resolution grid using the ANUSPLIN smoothing splines with elevation, longitude, and latitude as interpolation predictors. Precipitation occurrence and square-root transformed precipitation amounts were interpolated separately on each day, combined, and transformed back to original units. Observed station data were quality controlled and corrected for station relocation, changes in the definition of the climate day, and trace precipitation amounts.

3.2 Reanalyses

Four atmospheric reanalysis products were selected to span a range of complexity and spatial resolution. The chosen products are NCEP1, European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis 40 (ERA40), ECMWF Re-Analysis Interim (ERAInt), and the National Oceanic and Atmospheric Administration Cooperative Institute for Research in Environmental Sciences 20th Century Reanalysis V2 (20CR). NCEP1 is a popular reanalysis product applied in the validation of statistical downscaling techniques (Bürger et al., 2012a; Maurer et al., 2010). It spans the period from 1948 to the present, is 1.9° in resolution, and includes a wide range of observations assimilated from ship to satellite data (Kalnay et al., 1996). ERA40 is available from 1958 to 2002 and is archived at the coarsest resolution (2.5°) of the four products selected for this study. It was the first to assimilate satellite radiance data directly (Uppala et al., 2005). ERAInt covers the satellite era from 1979 through to the present. Data used here are archived at 1.5°, although the underlying forecast model runs at 0.75°. It has an improved atmospheric model and assimilation system over that used in ERA40 (Dee et al., 2011). 20CR is one of the longest reanalysis records available, starting in 1871 and running to 2012. At 2° resolution it assimilates only surface observations of synoptic pressure, monthly sea surface temperature and sea ice distribution (Compo et al., 2011). Table 1 summarizes the availability of the gridded observations and reanalyses.

Table 1. Availability of gridded observations and reanalyses.

                      Start   End       Resolution   Reference
Reanalysis product
  NCEP1               1948    Present   1.9°         Kalnay et al. (1996)
  20CR                1871    2011      2°           Compo et al. (2011)
  ERA40               1958    2001      2.5°         Uppala et al. (2005)
  ERAInt              1979    Present   1.5°         Dee et al. (2011)
Gridded observation
  VIC Forcings        1950    2005      6 km         Schnorbus et al. (2014)
  ANUSPLIN            1950    2005      10 km        Hutchinson et al. (2009)

3.3 Downscaling techniques

Seven statistical approaches are selected based on their wide use and/or potential strength in downscaling coarse-scale models to gridded observations for representing extremes. BCSD has been applied across North America (Maurer and Hidalgo, 2008; Salathé, 2005; Schnorbus et al., 2014; Wood et al., 2002, 2004). Monthly minimum temperature, maximum temperature, and precipitation from GCMs or reanalyses are bias corrected, using quantile mapping, against gridded observations aggregated to the large-scale model grid. Bias corrected, spatially disaggregated monthly data are temporally disaggregated to a daily time step via random sampling of historical months. Days in the selected month are rescaled (multiplicative for precipitation and additive for temperature) to match the bias corrected monthly precipitation and average temperature (Fig. 3a). Two variations of BCSD are tested; one derives minimum and maximum temperature from mean temperature in the coarse-scale model by assuming a uniform monthly diurnal temperature range (BCSD); the other uses monthly minimum and maximum temperature directly from the large-scale model (BCSDX).
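Two of the BCSD steps described above, quantile mapping and the rescaling of resampled days, can be illustrated with a short sketch. This is not the implementation used in the study: it assumes a simple empirical quantile mapping with linear interpolation between order statistics and clamping outside the calibration range, and the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
    x0, x1 = xs[j - 1], xs[j]
    y0, y1 = ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def quantile_map(value, model_cal, obs_cal):
    """Map a model value to the observed value with the same empirical quantile.

    Both calibration samples need at least two values.
    """
    model_sorted = sorted(model_cal)
    obs_sorted = sorted(obs_cal)
    model_probs = [i / (len(model_sorted) - 1) for i in range(len(model_sorted))]
    obs_probs = [i / (len(obs_sorted) - 1) for i in range(len(obs_sorted))]
    prob = interp(value, model_sorted, model_probs)  # model CDF
    return interp(prob, obs_probs, obs_sorted)       # inverse observed CDF


def rescale_month(daily_values, monthly_target, multiplicative):
    """Rescale resampled historical days so their mean matches a monthly target."""
    mean = sum(daily_values) / len(daily_values)
    if multiplicative:
        # Precipitation: scaling keeps dry days dry.
        factor = monthly_target / mean if mean > 0 else 0.0
        return [v * factor for v in daily_values]
    # Temperature: shifting keeps day-to-day differences unchanged.
    shift = monthly_target - mean
    return [v + shift for v in daily_values]


# Hypothetical calibration samples for one coarse grid cell.
model_cal = [1, 2, 3, 4, 5]
obs_cal = [2, 4, 6, 8, 10]
print(quantile_map(2.5, model_cal, obs_cal))
print(rescale_month([0.0, 2.0, 4.0], 4.0, multiplicative=True))
print(rescale_month([1.0, 3.0], 5.0, multiplicative=False))
```

Clamping at the ends of the calibration range is only one possible choice for values outside the calibration sample; extrapolation schemes are another, and the text does not say which was used.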

Two constructed analogue downscaling approaches are tested: BCCA and DBCCA (Maurer et al., 2010). BCCA bias corrects the large-scale temperature and precipitation using quantile mapping, as in BCSD, except on daily rather than monthly large-scale data. In the constructed analogue (CA) component, a library of observed daily coarse-resolution and corresponding high-resolution climate patterns of the variable to be downscaled is built (Hidalgo et al., 2008). Daily data are downscaled by selecting 30 days from the coarse-scale library that have the closest similarity to a given simulated day; optimal weights are determined via ridge regression and the 30 corresponding fine-scale library patterns are combined using the same weights (Maurer et al., 2010). In the DBCCA technique, a second quantile mapping bias correction is then applied at the fine scale to fix drizzle and other biases caused by the linear combination of daily fields in the CA step (Fig. 3a).
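The constructed analogue step can be sketched as follows. This is an illustrative sketch only: the similarity measure (squared distance), the ridge penalty value, and all function names and sample patterns are assumptions, and the example uses 2 analogues rather than 30 to stay readable.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def constructed_analogue(target, coarse_library, fine_library,
                         n_analogues=30, ridge=1e-6):
    """Combine fine-scale library patterns using weights fitted at the coarse scale."""
    def distance(i):
        return sum((t - c) ** 2 for t, c in zip(target, coarse_library[i]))

    # 1. Pick the library days whose coarse patterns are closest to the target.
    chosen = sorted(range(len(coarse_library)), key=distance)[:n_analogues]
    # 2. Ridge regression: solve (A A^T + ridge * I) w = A t for the weights w.
    gram = [[dot(coarse_library[i], coarse_library[j]) + (ridge if i == j else 0.0)
             for j in chosen] for i in chosen]
    rhs = [dot(coarse_library[i], target) for i in chosen]
    weights = solve(gram, rhs)
    # 3. Apply the same weights to the matching fine-scale patterns.
    n_fine = len(fine_library[0])
    return [sum(w * fine_library[i][k] for w, i in zip(weights, chosen))
            for k in range(n_fine)]


# Hypothetical library: 3 days, 2 coarse cells, 4 fine cells.
coarse_library = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
fine_library = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0], [5.0, 5.0, 5.0, 5.0]]
fine = constructed_analogue([0.5, 0.5], coarse_library, fine_library, n_analogues=2)
print(fine)
```

Because the weights are fitted independently for every day, the weighted sum of precipitation fields can produce many small non-zero values, which is the drizzle that the second quantile mapping in DBCCA is meant to correct.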


Figure 3. (a) Diagram of the bias corrected spatial disaggregation (BCSD), bias corrected constructed analogues (BCCA), and bias corrected climate imprint (BCCI) downscaling methods and a summary of adjustments made to these methods to create BCSD with monthly minimum and maximum temperature (BCSDX), double BCCA (DBCCA), climate imprint (CI), and BCCA corrected to BCCI (BCCAQ). (b) Workflow diagram for assessment of downscaling techniques in replicating ClimDEX and hydrologic extremes.

Two climate imprint methods are tested, the CI delta method (Hunter and Meentemeyer, 2005) and bias corrected CI (BCCI), which applies quantile mapping to the interpolated series from CI (Fig. 3a). For the imprint methods, long-term averages (i.e. 30 years) from the fine-scale data provide a spatial imprint that is used to represent environmental gradients. The ratio of daily to average monthly values is multiplied by the fine-scale monthly values for a location to get the daily precipitation. This is similar for minimum and maximum temperature, except values are calculated as the difference between the monthly mean and the daily value (Hunter and Meentemeyer, 2005).
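The imprint calculation for a single day and a single fine-scale cell can be sketched as below. The sketch assumes that the coarse daily value and coarse monthly climatology have already been interpolated to the fine-scale cell; the function names and the numbers are hypothetical.

```python
def imprint_precip(coarse_daily, coarse_monthly_clim, fine_monthly_clim):
    """Scale the fine-scale climatology by the coarse daily-to-climatology ratio."""
    if coarse_monthly_clim == 0:
        return 0.0  # no climatological precipitation to scale
    return fine_monthly_clim * (coarse_daily / coarse_monthly_clim)


def imprint_temp(coarse_daily, coarse_monthly_clim, fine_monthly_clim):
    """Add the coarse daily anomaly (a difference) to the fine-scale climatology."""
    return fine_monthly_clim + (coarse_daily - coarse_monthly_clim)


# Hypothetical fine-scale monthly climatologies under one coarse cell.
fine_precip_clim = [2.0, 4.0, 8.0]   # mm/day, e.g. valley to ridge
fine_temp_clim = [-10.0, -6.0]       # degrees C

# Coarse day: 6 mm against a 4 mm/day climatology (ratio 1.5).
print([imprint_precip(6.0, 4.0, f) for f in fine_precip_clim])
# Coarse day: -5 C against a -8 C climatology (anomaly +3 C).
print([imprint_temp(-5.0, -8.0, f) for f in fine_temp_clim])
```

The fine-scale climatology carries the spatial gradients, so a wet ridge stays wetter than a dry valley on every day regardless of the coarse-scale value.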

While BCCI applies quantile mapping as a post-processing step to the interpolated fine-scale outputs from the CI method, BCCAQ is a post-processed version of BCCA where the final quantile mapping bias correction is based on BCCI. First, the BCCA and BCCI algorithms are run independently, and then BCCAQ corrects BCCA with BCCI. The daily BCCI outputs at each fine-scale grid point are reordered within a given month according to the daily BCCA ranks. Because the optimal weights used to combine the analogues in BCCA are derived on a day-by-day basis, without reference to the full historical data set, the algorithm may be prone to Huth's paradox, wherein models that are calibrated based on short-term variability may be biased and fail to produce realistic long-term trends (Benestad et al., 2008; Huth, 2004). Reordering data for each fine-scale grid point within a month effectively breaks the overly smooth representation of sub-reanalysis-grid-scale spatial variability inherited from BCCI (Maraun, 2013), thereby resulting in a more accurate representation of event-scale spatial gradients; this also prevents the downscaled outputs from drifting too far from the BCCI long-term trend. Over longer timescales, the spatial variability of BCCAQ converges to that of BCCI.
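The reordering step at one fine-scale grid point and for one month can be sketched as follows. The function name and sample values are hypothetical, and ties in the BCCA series (for example, several dry days) are broken here by calendar order, which is an assumption.

```python
def reorder_by_ranks(bcci_month, bcca_month):
    """Reorder BCCI values so that their day-to-day ranking follows BCCA."""
    if len(bcci_month) != len(bcca_month):
        raise ValueError("both series must cover the same days")
    # Days listed from the lowest to the highest BCCA value.
    order = sorted(range(len(bcca_month)), key=lambda day: bcca_month[day])
    sorted_bcci = sorted(bcci_month)
    result = [0.0] * len(bcci_month)
    for rank, day in enumerate(order):
        # The day with the k-th smallest BCCA value gets the k-th smallest BCCI value.
        result[day] = sorted_bcci[rank]
    return result


# Hypothetical four-day "month" at one grid point.
bcci = [5.0, 0.0, 2.0, 9.0]
bcca = [1.0, 7.0, 3.0, 0.5]
bccaq = reorder_by_ranks(bcci, bcca)
print(bccaq)
```

The output contains exactly the BCCI values, so the monthly distribution (and hence the long-term trend) is that of BCCI, while the timing of wet and dry days at each grid point follows BCCA.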

Statistical methods are calibrated from 1950 to 1990 for 20CR and NCEP1, and from 1958 to 1990 and from 1979 to 1990 for ERA40 and ERAInt, respectively (Table 2). Calibration periods were selected to include the longest overlapping record between the gridded observation and reanalyses to replicate the approach taken in Werner et al. (2011). Thus, the 20CR and NCEP1 reanalyses results will serve to evaluate the gridded observations and these two reanalyses, and also to validate the calibration–validation approach taken with BCSD for a series of studies conducted in this region (Bürger et al., 2012a, b; Schnorbus et al., 2014; Shrestha et al., 2012; Werner et al., 2013). The resulting modelling framework for these two gridded observations, four reanalysis products, and seven gridded statistical downscaling techniques is displayed in Fig. 3b. All statistical downscaling methods use precipitation and temperature as predictors and predictands.

3.4 ClimDEX

Table 2. Calibration and validation periods for downscaling methods by reanalyses.

Reanalysis   Calibration   No. of   Validation   No. of
product                    years                 years
NCEP1        1950–1990     41       1991–2005    15
20CR         1950–1990     41       1991–2005    15
ERA40        1958–1990     33       1991–2001    11
ERAInt       1979–1990     12       1991–2005    15

ClimDEX is a common climate indices package that computes values for 27 core indices based on daily precipitation and minimum and maximum temperature (Karl et al., 1999; Peterson, 2005; and http://etccdi.pacificclimate.org or http://www.clivar.org/panels-and-working-groups/etccdi/etccdi.php/). These indices describe events such as the number of heavy precipitation days, defined as days where precipitation is greater than 10 mm, or the percentage of days when maximum temperature is greater than the 90th percentile. They do not usually represent the most extreme events conceivable, but instead represent the more extreme aspects of climate, which are known to be relevant to a broad range of impact fields and are still statistically manageable, so that they can be reliably estimated from current data for the present and future. ClimDEX has been adopted as a standard for extremes by the World Climate Research Programme (http://www.clivar.org/organization/extremes). Indices were computed from downscaled temperature and precipitation from seven statistical downscaling methods used with four reanalyses and two gridded observations for a total of 56 estimates of each index. The index of the annual count when daily minimum temperature is > 20 °C, tropical nights (tr), was dropped for this analysis because this temperature threshold is not exceeded in the Peace River basin. See Table 1 in Bürger et al. (2012a) for a description of indices explored in this study.
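As an illustration of the kind of quantity a ClimDEX index measures, the sketch below computes simplified versions of the two examples named in the text. It is not the ClimDEX package: the function names are hypothetical, the 10 mm threshold is treated as inclusive (an assumption), and the percentile threshold is passed in as a single number rather than derived from a base period.

```python
def heavy_precip_days(daily_precip_mm, threshold=10.0):
    """Count days with precipitation at or above the threshold (assumed inclusive)."""
    return sum(1 for p in daily_precip_mm if p >= threshold)


def percent_days_above(daily_values, threshold):
    """Percentage of days on which the value exceeds a given threshold."""
    return 100.0 * sum(1 for v in daily_values if v > threshold) / len(daily_values)


# Hypothetical daily series, for illustration only.
precip = [0.0, 12.5, 3.0, 10.0, 25.0, 9.9]
tmax = [10.0, 15.0, 20.0, 25.0, 30.0]

print(heavy_precip_days(precip))
print(percent_days_above(tmax, threshold=22.0))
```

Each downscaled series would yield one value per year for each index, and those annual series are what the correlation and distribution tests compare against the gridded observations.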

3.5 Hydrologic modelling

Hydrologic projections for the Peace River basin are derived using the Variable Inltration Capacity (VIC) model (Liang et al., 1994, 1996). The VIC model is a spatially distributed macro-scale hydrologic model that was originally developed as a soilvegetation atmosphere transfer scheme for general circulation models. It has been used to evaluate climate change impacts on global river systems (Nijssen et al., 2001) and in the mountainous western United States and BC (Elsner et al., 2010b; Hamlet and Lettenmaier, 2005, 2007; Schnorbus et al., 2014; Shrestha et al., 2012). Its spatially distributed nature makes it suitable for capturing regional variation in the hydrologic cycle due to topographic, physiographic, and climatic controls. The VIC model is also process based, allowing for a more plausible extrapolation of hydrologic processes into future climate regimes (Leavesley, 1994). The VIC model is applied at a resolution of 1/16 (approximately 2731 km2, depending upon latitude) and run at a daily time step (1 h time step for the snow model). Surface routing between grid cells is done using the linearized Saint-Venant equations (Lohmann et al., 1996).

The Finlay River above Akie River, Ingenika River above Swannell River, Parsnip River above Misinchinka River, and Peace River above Pine River sub-basins of the Peace River were calibrated to observations from Water Survey of Canada (Fig. 1). Peace River at Bennett Dam was calibrated to naturalized flow provided by BC Hydro. The sub-basins range in drainage area from 4200 to 83 900 km² and from a minimum elevation of 392 m to a maximum of 2799 m (Table 3). All selected basins had strong calibration results over 1990–1995 for both the VIC Forcings and ANUSPLIN gridded observations based on the Nash–Sutcliffe efficiency score (Nash and Sutcliffe, 1970), the Nash–Sutcliffe efficiency score of the log-transformed discharge, and the percent volume bias error (Table 4). In the 1985–1989 split-sample validation period, Nash–Sutcliffe efficiency score values improved, the Nash–Sutcliffe efficiency score of the log-transformed discharge stayed roughly the same, and percent volume bias error differences became larger in magnitude, negative in VIC Forcings and positive in ANUSPLIN.

Table 3. Metadata for five select sub-basins of the Peace River basin.

Basin   Water Survey of   Drainage area   Elevation (m)
        Canada ID         (km²)           Mean   Min   Max
BCGMS                     72 078
FINAK   07EA005           16 000          1452   693   2799
INGEN   07EA004           4200            1503   674   2289
PARMS   07EE007           4900            1128   645   2343
PEAPN   07FA004           83 900          1126   392   2799

Table 4. Calibration and validation statistics for five select sub-basins of the Peace River basin under the VIC Forcings and ANUSPLIN gridded observational data sets, including the Nash–Sutcliffe efficiency score (NS), the Nash–Sutcliffe efficiency score of the log-transformed discharge (LNS), and the percent volume bias error (%VB).

        VIC Forcings                            ANUSPLIN
        Calibration         Validation          Calibration         Validation
        1990–1995           1985–1989           1990–1995           1985–1989
Basin   NS    LNS   %VB     NS    LNS   %VB     NS    LNS   %VB     NS    LNS   %VB
BCGMS   0.64  0.81  1       0.75  0.83  12      0.72  0.82  3       0.82  0.84  3
FINAK   0.66  0.85  0       0.83  0.88  14      0.76  0.81  11      0.73  0.81  30
INGEN   0.76  0.82  0       0.82  0.78  15      0.69  0.83  10      0.72  0.85  26
PARMS   0.78  0.71  0       0.81  0.66  9       0.78  0.62  10      0.75  0.63  8
PEAPN   0.65  0.79  2       0.76  0.87  10      0.71  0.80  2       0.82  0.85  2
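The three goodness-of-fit scores named above can be sketched as follows. This is a minimal illustration, not the calibration code used in the study. The function names are hypothetical, and the sign convention for the percent volume bias (simulated minus observed, so a negative value means the model underestimates volume) is an assumption.

```python
import math


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of observations."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


def log_nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency of log-transformed discharge (flows must be > 0)."""
    return nash_sutcliffe([math.log(o) for o in obs], [math.log(s) for s in sim])


def percent_volume_bias(obs, sim):
    """Percent volume bias; assumed convention is 100 * (sim - obs) / obs."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


# Made-up daily discharges (m3/s), for illustration only.
observed = [10.0, 20.0, 40.0, 80.0, 40.0, 20.0]
simulated = [12.0, 18.0, 35.0, 70.0, 45.0, 25.0]
print(nash_sutcliffe(observed, simulated))
print(log_nash_sutcliffe(observed, simulated))
print(percent_volume_bias(observed, simulated))
```

The log transform reduces the weight of peak flows, which is why the log-transformed score is commonly read as an indicator of low-flow performance.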

There are several daily streamflow metrics that are useful for water resources design and management, which are also ecologically relevant (Monk et al., 2011; Richter et al., 1996; Shrestha et al., 2014b). A recent intercomparison of statistical downscaling techniques for use with daily streamflow investigated the hydrologic extremes 3-day peak flow and 7-day low flow (Maurer et al., 2010). To build on that study, we investigate the strength of seven downscaling methods for the same metrics, using 3-day peak flow to represent flood and 7-day low flow to represent drought. Two low-flow periods are investigated because the lowest discharge takes place in the months of October–April in sub-basins of the Peace River (Fig. 2) and summer low flows (July–September) are of interest to agriculture and ecology. Hydrologic models can have low flows in different seasons than observations due to their poor parameterization of baseflow conditions and because calibration approaches favour good performance for peak flow (Najafi et al., 2011). This issue can be exaggerated by downscaling approaches (Shrestha et al., 2014b). Thus, narrowing the window over which low flows are assessed is important to prevent low flows in one season being compared to low flows in another. Peak flows are analysed between May and July.
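The seasonal flow metrics described above can be sketched with a moving average restricted to a set of months. This is a simplified illustration under stated assumptions: a window is assigned to the month of its last day, the series is synthetic, and the helper names `rolling_mean` and `seasonal_extreme` are hypothetical. The study's own extraction code may treat window boundaries differently.

```python
from datetime import date, timedelta


def rolling_mean(values, window):
    """Trailing moving average; positions without a full window are None."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def seasonal_extreme(dates, flows, window, months, pick):
    """Apply pick (max or min) to window-day mean flows whose last day falls in months."""
    means = rolling_mean(flows, window)
    in_season = [m for d, m in zip(dates, means) if m is not None and d.month in months]
    return pick(in_season)


# Synthetic one-year daily series starting 1 October (made-up values, m3/s).
dates = [date(1990, 10, 1) + timedelta(days=i) for i in range(365)]
flows = [10.0] * 365
for i, d in enumerate(dates):
    if date(1991, 6, 10) <= d <= date(1991, 6, 12):
        flows[i] = 100.0  # spring peak
    elif date(1991, 1, 5) <= d <= date(1991, 1, 11):
        flows[i] = 2.0    # winter low
    elif date(1991, 8, 1) <= d <= date(1991, 8, 7):
        flows[i] = 5.0    # summer low

peak_3day = seasonal_extreme(dates, flows, 3, {5, 6, 7}, max)
low_7day_winter = seasonal_extreme(dates, flows, 7, {10, 11, 12, 1, 2, 3, 4}, min)
low_7day_summer = seasonal_extreme(dates, flows, 7, {7, 8, 9}, min)
print(peak_3day, low_7day_winter, low_7day_summer)
```

Restricting each metric to its own months is what keeps a winter low flow from being compared with a summer one.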

3.6 Statistical tests

The seven statistical downscaling methods vary in their approach, which can result in differing strengths and weaknesses. We chose our statistical tests to fully evaluate these downscaling techniques for the climate and hydrologic results and to follow the framework of Bürger et al. (2012a). The time period for calibration of the downscaling techniques was selected to match Bürger et al. (2012a) (pre-1991, depending on the availability of the reanalyses). Longer calibration periods available for NCEP1 and 20CR were also seen as favourable when applying bias-correction-based downscaling methods, especially when working with extremes (Huang et al., 2014; Themeßl et al., 2011), and assisted with evaluating the two gridded observations. Validation was set to 1991–2005 to accommodate the overlap of available reanalyses, gridded observations, and observed streamflow records. ERA40 is an exception, because its last full year of available record is 2001. Validation results for ERA40 are therefore provided for 1991–2001.

All statistical tests used in this study are conducted at the 5 % significance level, meaning that the tests are conducted in such a way that rejection of the null hypothesis is expected to occur in 5 % of tests when the null hypothesis is true. Statistical hypothesis testing with absolute certainty is impossible. The choice of significance level reflects a balance between the rate at which false rejection of the null hypothesis is expected to occur (so-called type I error) and the rate at which a given test will correctly reject the null hypothesis when it is false (the so-called power of the test), with the choice of a more conservative significance level, such as 1 %, leading to lower power in exchange for a lower type I error rate (e.g. von Storch and Zwiers, 1999).

Two statistical tests are applied to the ClimDEX results over the Peace River basin: the Kolmogorov–Smirnov (KS) test and the test for Pearson's correlation. The KS test is used to see how well the distribution of climate indices for the statistically downscaled reanalyses matches the distribution of those calculated from the gridded observations used as downscaling targets. The KS test is a nonparametric test of the equality of continuous one-dimensional probability distributions. Here, it is used to compare two samples, namely annual climate indices for the statistically downscaled reanalyses and the associated gridded observation. The KS test statistic is used to quantify the distance between empirical distribution functions of these two samples. The null hypothesis is that the two samples are drawn from the same distribution. The distributions considered under the null hypothesis have to be continuous distributions, but are otherwise unrestricted. While some of the climate indices are not strictly continuous (e.g. frost days), asymptotic critical values may still be used in the presence of a small number of ties (Janssen, 1994). Pearson's correlation is used to test the temporal correspondence between the annual climate indices for the statistically downscaled reanalyses and the associated gridded observation. Pearson's product moment correlation coefficient is used to measure the linear correlation between climate indices from downscaled reanalyses and indices from observations. The null hypothesis is that the downscaled and observed samples are not linearly correlated.
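The two-sample KS statistic described above is the largest vertical distance between two empirical distribution functions. The sketch below is a minimal illustration and not the software used in the study. The p-value uses the asymptotic Kolmogorov series with a widely used small-sample correction, which is an assumption, and the function names are hypothetical.

```python
import math


def ks_statistic(x, y):
    """Maximum distance between the empirical CDFs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every value equal to v (handles ties).
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Approximate asymptotic p-value for the two-sample KS statistic d."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Made-up annual index values for a 15-year validation period.
downscaled = [float(v) for v in range(15)]
observed = [float(v) + 0.5 for v in range(15)]
d = ks_statistic(downscaled, observed)
print(d, ks_pvalue(d, len(downscaled), len(observed)))
```

A p-value above 0.05 means the hypothesis of equal distributions is not rejected at the 5 % level, which is the outcome counted as a passed distribution test.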

The 101 000 km² Peace River basin is represented by 3975 grid cells at the 1/16° resolution used to run the VIC hydrologic model. The KS test and Pearson's correlation are evaluated on each of the grid cells in the Peace River basin for each climate index. The statistical significance of the KS test and Pearson's correlation results over the basin as a whole is measured using a field significance test: the Walker field significance test (Wilks, 2006), where the evaluation of field significance is done by using the minimum local p value as the global test statistic. The Walker field significance test was selected because it is relatively insensitive to correlations among local tests, allowing global tests based on data exhibiting both spatial and temporal correlations to be conducted. Temporal and spatial correlations between climate indices grids would require a cumbersome procedure to address correctly with conventional resampling tests. Walker's test can be seen to be closely related to the conventional field significance test (Storch, 1982) based on counting significant local results, except that Walker's test statistic is based on the smallest of the K local p values, rather than the number of K local tests that are significant at some level.
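The decision rule of the Walker test can be sketched in a few lines. Following the form given by Wilks (2006), the global null hypothesis is rejected when the smallest of the K local p values is no larger than 1 − (1 − α)^(1/K), where α is the global significance level. The function names below are hypothetical, and the local p values are made up for illustration.

```python
import math


def walker_critical_p(n_local_tests, alpha_global=0.05):
    """Critical value for the minimum local p value: 1 - (1 - alpha)^(1/K)."""
    # expm1/log1p keep precision when the result is very small.
    return -math.expm1(math.log1p(-alpha_global) / n_local_tests)


def walker_field_significant(local_p_values, alpha_global=0.05):
    """True if the minimum local p value rejects the global null hypothesis."""
    critical = walker_critical_p(len(local_p_values), alpha_global)
    return min(local_p_values) <= critical


# With K = 3975 grid cells and a 5 % global level, the critical value is tiny.
print(walker_critical_p(3975))

# Made-up local p values: one very small value is enough for field significance.
p_values = [0.5] * 3974 + [1e-6]
print(walker_field_significant(p_values))
```

Because only the minimum p value enters the test, a single strongly significant grid cell can establish field significance, while many weakly significant cells cannot.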

The KS test and the test for Pearson's correlation were applied to the 3-day peak flow and 7-day low flow in winter and summer for hydrologic data from the five sub-basins of the Peace River. In this case, the null hypothesis for the KS test is that the hydrologic extremes created by driving the VIC model with the statistically downscaled reanalyses are drawn from the same distribution as those derived from driving the VIC model with the two gridded observations. The null hypothesis for Pearson's correlation is that the hydrologic extremes created by driving VIC with downscaled reanalyses versus gridded observations are not linearly correlated.
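The correlation test can be sketched as follows. Under the null hypothesis of zero correlation, and assuming approximately normal data, the statistic t = r·sqrt((n − 2)/(1 − r²)) follows a Student's t distribution with n − 2 degrees of freedom. The function names are hypothetical and the sample values are made up. The p-value lookup is left out because the Python standard library has no Student's t distribution.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of paired samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_t(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up annual peak flows (m3/s) from two simulations.
flow_downscaled = [1.0, 3.0, 2.0, 4.0]
flow_gridded = [1.0, 2.0, 3.0, 4.0]
r = pearson_r(flow_gridded, flow_downscaled)
print(r, pearson_t(r, len(flow_gridded)))
```

The t statistic would then be compared with the critical value for n − 2 degrees of freedom at the chosen significance level.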

4 Results

4.1 Gridded observations and reanalyses

Four reanalyses (NCEP1, ERA40, ERAInt, and 20CR) are compared to two gridded observations (VIC Forcings and ANUSPLIN) over the Peace River basin. Daily precipitation, minimum temperature, and maximum temperature are converted to total monthly precipitation and average monthly temperatures over 1950–2005. Average minimum and maximum temperatures in ANUSPLIN and VIC Forcings are similar from year to year in most months (Figs. 4 and 5). However, prior to 1970, ANUSPLIN can be up to 5 °C cooler than the VIC Forcings and reanalyses from December to February. Precipitation totals are similar from year to year for all months in the two gridded observations, except October, when the precipitation difference can be up to 50 mm (Fig. 6). This could be because there is greater station coverage in the VIC Forcings and an elevation adjustment is made with ClimateWNA. Differences in these two products resulting from these factors might be more apparent in the shoulder season.

There is a warm bias in minimum temperature in 20CR and ERA40 from May to November and a cool bias in NCEP1 from March to October relative to gridded observations (Fig. 4). The biases in NCEP1 tend to be greater over part of the record in some months, such as from 1970 to 1995 in June. ERAInt is closest to gridded observations for minimum temperature, but is only available after 1979. Some of the patterns seen in minimum temperature are repeated in maximum temperature (Fig. 5). NCEP1 values are noticeably cooler than observations and other reanalyses in May, June, July, September, and October in some years. In April, maximum temperatures in 20CR and NCEP1 are close to each other and roughly 5 °C lower than the other reanalyses and gridded observations. Maximum temperatures for ERA40 and ERAInt are closest to gridded observations from year to year in all months. Monthly precipitation in the NCEP1 and ERA40 reanalyses has similar magnitudes and variability to the gridded observations (Fig. 6). ERAInt is close to observations in the autumn and winter months, but has higher precipitation values in March through to August. 20CR stands apart from the other reanalyses and both gridded observations with consistently larger precipitation amounts, roughly twice the magnitude of observations in September through to April. However, the sequencing of events is similar between 20CR and observations.

Figure 4. Monthly average minimum temperature (°C) by gridded observations (VIC Forcings and ANUSPLIN) and reanalysis (NCEP1, ERA40, ERAInt, 20CR) over the Peace River basin. One panel per month (January–December); x axis: year.

This confirms that near-surface temperature and precipitation values from the selected reanalyses have different characteristics due to their different resolutions, model physics, and contributing data in the Peace River basin. The two gridded observations also displayed some dissimilarity in time. Differences between these four reanalyses in this particular region should act as a stringent test of the downscaling techniques applied. However, we expect that the time-dependent differences between gridded observations and NCEP1 for minimum and maximum temperature, and precipitation, will reduce the success rate of any of the downscaling techniques (Maurer et al., 2013). Nevertheless, we carry NCEP1 through the analysis to quantify the impacts of using a potentially flawed reanalysis and also to evaluate VIC Forcings and ANUSPLIN over their full record (1950–2005) with two reanalyses (NCEP1 and 20CR).
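The daily-to-monthly aggregation used for the comparison in this section (monthly precipitation totals and monthly mean temperatures) can be sketched as below. The helper names are hypothetical and the daily values are synthetic. The sketch assumes one value per calendar day with no missing data.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_totals(dates, values):
    """Sum daily values (e.g. precipitation, mm) by (year, month)."""
    totals = defaultdict(float)
    for d, v in zip(dates, values):
        totals[(d.year, d.month)] += v
    return dict(totals)


def monthly_means(dates, values):
    """Average daily values (e.g. minimum temperature, deg C) by (year, month)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for d, v in zip(dates, values):
        sums[(d.year, d.month)] += v
        counts[(d.year, d.month)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Synthetic daily data for January and February 2000.
days = [date(2000, 1, 1) + timedelta(days=i) for i in range(60)]
precip = [1.0] * 60
tmin = [-20.0 if d.month == 1 else -10.0 for d in days]
print(monthly_totals(days, precip))
print(monthly_means(days, tmin))
```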

4.2 Impact of the downscaling approach and reanalyses on ClimDEX results

Downscaled minimum temperature, maximum temperature, and precipitation from seven gridded downscaling methods, two gridded observations, and four reanalyses were used to generate 26 ClimDEX indices. Results were compared to the indices generated from the respective gridded observations at their native resolution (VIC Forcings (~6 km) and ANUSPLIN (~10 km)) for their ability to match the timing (Pearson's correlation) and distribution (KS test) of values over the Peace River basin using the Walker field significance test (Wilks, 2006).

In the calibration (1950–1990) and validation (1991–2005) periods, the VIC Forcings and ANUSPLIN data sets are similar for most temperature-based indices and show some large differences for precipitation-based indices (Table 5). Namely, PRCPTOT, annual total wet-day precipitation (> 1 mm), in ANUSPLIN is 18 and 21 % less than in VIC Forcings in the calibration and validation periods, respectively. The events on a given day are larger in VIC Forcings than in ANUSPLIN, as shown by the higher R95p, RX1day, RX5day, R10mm, and R20mm values. From the calibration to the validation period, PRCPTOT increases more in VIC Forcings than in ANUSPLIN. The increase in VIC Forcings comes from an increase in precipitation days (R1mm) rather than an increase in intensity. Magnitudes of the larger precipitation events actually decrease for VIC Forcings, while they increase for ANUSPLIN, although these events are still larger in VIC Forcings than in ANUSPLIN in the validation period. The percentage of cool nights decreases and the duration of warm spells increases somewhat equally for both gridded observations. However, increases in the percentage of warm days and warm nights, and decreases in the percentage of cool days and duration of cold spells, are greater in ANUSPLIN than in VIC Forcings, which suggests that the warming signal in ANUSPLIN is stronger. Statistically significant increases in annual minimum temperatures were found by Rodenhuis et al. (2009) in this region. Differing trends in climate extremes are common in gridded observations due to differences in stations, interpolation techniques, and potential corrections for temporal inhomogeneity. Donat et al. (2014) found that decadal trends in maximum 5-day precipitation amounts (Rx5day) over 1979–2008 ranged from 15 to 5 mm decade⁻¹ in the Peace River basin region, depending on the gridded observations they studied. VIC Forcings included a monthly temporal adjustment to increase homogeneity (Hamlet and Lettenmaier, 2005), while ANUSPLIN did not. Additionally, stations were allowed to drop in and out on a daily basis in ANUSPLIN, whereas stations had to be available for a minimum of 1 year of consecutive days and 5 years over the record to be included in VIC Forcings. Hence, trends in some climate extremes differ for these gridded observations and may or may not match those of reality and/or reanalyses.

Table 5. Mean annual ClimDEX values for VIC Forcings and ANUSPLIN averaged over the Peace River basin.

          Calibration (1950–1990)    Validation (1991–2005)
Index     VIC Forcings  ANUSPLIN     VIC Forcings  ANUSPLIN     Units      Indicator name
cdd       20            19           18            19           Days       Consecutive dry days
csdi      5             9            5             6            Days       Cold spell duration
cwd       9             10           11            12           Days       Consecutive wet days
dtr       11            11           10.6          10.3         °C         Diurnal T range
fd        239           238          233           230          Days       Frost days
gsl       136           131          140           138          Days       Growing season
id        109           122          102           106          Days       Ice days
prcptot   703           578          742           585          mm         Annual total wet-day
r1mm      133           142          150           153          Days       Precipitation days
r10mm     17            8            17            8            Days       Heavy prec. days
r20mm     4             1            4             1            Days       Very heavy prec.
r95p      145           97           142           100          mm         Very wet days
r99p      42            28           38            32           mm         Extremely wet days
rx1day    32            22           31            23           mm         Max 1-day prec.
rx5day    63            46           64            46           mm         Max 5-day prec.
sdii      5             4            5             4            mm day⁻¹   Simple daily intensity
su        7             6            7             7            Days       Summer days
tn10p     11            13           7             8            %          Cool nights
tn90p     10            9            12            14           %          Warm nights
tnn       37            41           35.5          37.6         °C         Min monthly Tn
tnx       11            11           11.5          11.8         °C         Max monthly Tn
tx10p     11            11           9             8            %          Cool days
tx90p     10            10           11            14           %          Warm days
txn       27            29           24.9          25.8         °C         Min monthly Tx
txx       27            27           27.9          27.4         °C         Max monthly Tx
wsdi      4             5            8             12           Days       Warm spell duration
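Several of the precipitation indices discussed above can be computed directly from one year of daily precipitation. The sketch below is a simplified illustration and not the ClimDEX software. It assumes a wet-day threshold of at least 1 mm (an assumption about the inclusive bound), uses hypothetical function names, and runs on made-up values.

```python
WET_DAY_MM = 1.0  # assumed wet-day threshold (inclusive)


def prcptot(daily_mm):
    """Annual total precipitation on wet days (mm)."""
    return sum(v for v in daily_mm if v >= WET_DAY_MM)


def count_days_at_least(daily_mm, threshold_mm):
    """Number of days at or above a threshold (R1mm, R10mm, R20mm)."""
    return sum(1 for v in daily_mm if v >= threshold_mm)


def sdii(daily_mm):
    """Simple daily intensity: wet-day total divided by number of wet days."""
    wet = [v for v in daily_mm if v >= WET_DAY_MM]
    return sum(wet) / len(wet) if wet else 0.0


def rx_nday(daily_mm, n_days):
    """Maximum total over any n consecutive days (Rx1day, Rx5day)."""
    return max(sum(daily_mm[i:i + n_days]) for i in range(len(daily_mm) - n_days + 1))


# Made-up daily precipitation (mm) standing in for a full year.
daily = [0.0, 0.5, 2.0, 12.0, 25.0, 0.0, 1.0, 3.0]
print(prcptot(daily), count_days_at_least(daily, 1.0), sdii(daily))
print(rx_nday(daily, 1), rx_nday(daily, 5))
```

Separating the count of wet days (R1mm) from the mean wet-day amount (SDII) is what allows a rise in PRCPTOT to be attributed to frequency rather than intensity, as in the comparison above.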

Irrespective of downscaling method or reanalysis, those methods calibrated and validated against the ANUSPLIN gridded observations were more successful overall than those based on VIC Forcings (Table 6), although there were some cases where VIC Forcings passed more tests than ANUSPLIN (Table 8). For example, under the BCCA method, precipitation amounts on very wet days (R95p) for all reanalyses based on VIC Forcings failed the Walker field significance test for the Pearson correlation, while those for ANUSPLIN passed (Fig. 7). (Note: time series shown are averages of all of the VIC Forcings or ANUSPLIN cells in the Peace basin, while the significance of results was based on the Walker field significance of the correlation tested on each grid cell in the basin.) The largest differences in the number of tests passed primarily occur for precipitation-based indices, where ANUSPLIN passes more than VIC Forcings. VIC Forcings passes 29 more tests than ANUSPLIN for DTR (Table 7). This result is not unexpected because the differences between the calibration and the validation period are precipitation related in VIC Forcings and temperature related in ANUSPLIN (Table 5). Step changes in daily temperature range (DTR) from 1950 to 2005 are apparent in ANUSPLIN (Fig. 8). DTR is a strong driver of snowpack generation and melt, and errors in simulating realistic DTR could affect hydrologic modelling results.

Figure 5. Monthly average maximum temperature (°C) by gridded observations (VIC Forcings and ANUSPLIN) and reanalysis (NCEP1, ERA40, ERAInt, 20CR) over the Peace River basin. One panel per month (January–December); x axis: year.

The sequencing of precipitation indices, such as CWD, PRCPTOT, R10mm, R20mm, R95p, R99p, Rx1day, Rx5day, and SDII, is most difficult to replicate for all methods, especially under VIC Forcings. VIC Forcings has a higher station density than ANUSPLIN because it includes stations from BC Hydro, the BC Ministry of Forests, Lands and Natural Resource Operations, and the Ministry of Environment's BC River Forecast Centre Snow Survey Network in addition to those from Environment Canada (Werner et al., 2015). The BC Hydro network provided a large number of stations in the Peace River basin, most of which were not available until the 1980s (Werner et al., 2015). The increase in the number of stations after 1980 in the VIC Forcings likely resulted in more complex spatial patterns in precipitation, despite the monthly temporal adjustment, because it is designed to maintain spatial variability (Hamlet and Lettenmaier, 2005). Increased spatial variability in the validation period, coupled with a different interpolation method in VIC Forcings, could have made precipitation patterns harder to replicate with downscaling. If we are going to rely on these data sets to investigate changes to extreme climate and hydrology, we should develop a way to maintain temporal and spatial homogeneity for daily values while allowing data sets to reflect natural trends. Minimizing homogeneity problems throughout the record is favourable when using gridded observations to calibrate statistical downscaling methods (Gutmann et al., 2014; Livneh et al., 2013; Maurer et al., 2002).

Figure 6. Monthly total precipitation (mm) by gridded observations (VIC Forcings and ANUSPLIN) and reanalysis (NCEP1, ERA40, ERAInt, 20CR) over the Peace River basin. One panel per month (January–December); x axis: year.

Considering results for all downscaling methods and both gridded observations, results based on ERAInt had the highest score of all four reanalyses for the Pearson correlation and KS tests combined (Table 6). ERAInt results matched sequencing of events most often, as indicated by frequent rejection of the null hypothesis for the Pearson correlation test (Fig. 7; Table 8), and ERA40 results matched distributions most often according to the KS test (Fig. 9; Table 8). The zero-correlation null hypothesis was rejected when comparing ERAInt for the ANUSPLIN and VIC Forcings gridded observations for the number of heavy precipitation days (R10mm), but was not rejected with other reanalyses (Fig. 7). ERA40 and ERAInt monthly average minimum and maximum temperature and total precipitation matched those of the gridded observations most closely (see Sect. 4.1). ERAInt has the highest resolution (1.5°), and both ERAInt and ERA40 excluded 1950–1958 in their calibration when NCEP1 and 20CR did not (Table 2), which may have avoided potential problems with the gridded observations caused by lower station availability earlier in the record and with reanalysis data from the pre-satellite era (pre-1979) and before the expansion and standardization of a global radiosonde network (1958). Results for SDII for VIC Forcings and ANUSPLIN under all seven downscaling methods show large differences between gridded observations and downscaled NCEP1 prior to 1958 (Fig. 10). Gutmann et al. (2014) tested four downscaling methods with NCEP1, focusing on the period containing satellite microwave and infrared atmospheric soundings (from 1979), and still found that temporal instabilities in NCEP1 contributed to failure in downscaling techniques for some metrics. Root mean square error in sea level pressure decreases from 1950 to 2008 strongly in NCEP1, somewhat in ERA40, and minimally in 20CR (see Fig. 10 in Compo et al., 2011). Assimilating only surface pressure reports and using observed monthly sea surface temperature and sea ice distributions as boundary conditions to create 20CR has resulted in a more temporally consistent product. However, it has still improved over time. Changes in 20CR in combination with changes in the gridded observations over 1950–2005 have resulted in fewer passed tests for 20CR than for ERA40 or ERAInt. Thus, choice of reanalysis, calibration period, and the gridded observation data set can influence the measured success of the downscaling approach being tested, irrespective of the method's inherent strengths and weaknesses.

Table 6. Summary of the number of tests passed for Pearson's correlations and similarity in distributions (KS test) based on the Walker field significance test between ClimDEX indices for downscaled reanalyses versus target gridded observation over the Peace River basin for 1991–2005 (1991–2001 for ERA40), summarized by gridded observation, reanalysis, and the downscaling method. Max indicates the maximum possible tests to pass in that category.

                      Pearson's correlation   KS test   Combined

Gridded observation
VIC Forcings          367                     578       945
ANUSPLIN              388                     628       1016
Max                   728                     728       1456

Reanalyses
NCEP1                 159                     284       443
20CR                  147                     287       434
ERA40                 201                     340       541
ERAInt                248                     295       543
Max                   364                     364       728

Downscaling method
BCCA                  130                     171       301
DBCCA                 139                     174       313
BCCI                  131                     176       307
CI                    139                     154       293
BCSD                  56                      175       231
BCSDX                 48                      173       221
BCCAQ                 112                     183       295
Max                   208                     208       416
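The Max rows in Table 6 follow from the experimental design: 7 downscaling methods, 4 reanalyses, 2 gridded observations, and 26 ClimDEX indices, with one field significance result per combination for each of the two tests. The short calculation below reproduces those maxima. The variable names are chosen for this example only.

```python
N_METHODS = 7
N_REANALYSES = 4
N_GRIDDED_OBS = 2
N_INDICES = 26

# Maximum passes per test type when results are grouped by one factor.
max_per_gridded_obs = N_METHODS * N_REANALYSES * N_INDICES    # 728
max_per_reanalysis = N_METHODS * N_GRIDDED_OBS * N_INDICES    # 364
max_per_method = N_REANALYSES * N_GRIDDED_OBS * N_INDICES     # 208

# The Combined column doubles each maximum (correlation plus KS test).
print(max_per_gridded_obs, 2 * max_per_gridded_obs)
print(max_per_reanalysis, 2 * max_per_reanalysis)
print(max_per_method, 2 * max_per_method)
```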

The highest ranked downscaling method based on the combined results for field significance of Pearson's correlation and the KS test for all gridded observations, reanalyses, and ClimDEX indices was DBCCA (Table 6). It tied for highest rank with CI for correlation, while BCCAQ surpassed all other methods for distribution. Bias remains in results of the BCCA method for precipitation due to the linear combination of fine-scale analogues and uncorrected drizzle and related biases (Gutmann et al., 2014). All downscaling methods, except CI, include a quantile mapping bias correction step and are expected to do well in matching distributions with their respective gridded observation. All methods except CI pass 86 % or more of the tests for distribution (KS test), while CI passes 78 %. The correlation of DTR was a problem for all the downscaling methods and both gridded observations (Fig. 8) and for distribution based on ANUSPLIN (except BCCAQ), but not when based on VIC Forcings. BCCAQ in combination with ANUSPLIN matched DTR distributions for ERAInt, ERA40, and 20CR when all other methods failed, which points to the success of its approach of post-processing BCCA with a final quantile mapping bias correction based on BCCI. As mentioned above, DTR is an important driver of snowpacks. Additionally, it plays a key role in evaporation (Sheffield et al., 2012). Rates of evaporation are an important component of projecting future water availability and drought (Sherwood and Fu, 2014). Therefore, accurately downscaling DTR should be a priority. Including minimum and maximum monthly temperature predictors in BCSDX did not improve the correlation of DTR as was hypothesized in previous studies (Bürger et al., 2012a).

Table 7. Number of tests passed for each ClimDEX index for VIC Forcings and ANUSPLIN for 1991–2005 (1991–2001 in ERA40). Difference is VIC Forcings minus ANUSPLIN.

Index     VIC Forcings   ANUSPLIN   Difference
cdd       48             44         4
csdi      54             54         0
cwd       19             31         −12
dtr       32             3          29
fd        51             48         3
gsl       54             52         2
id        55             47         8
prcptot   24             33         −9
r10mm     28             31         −3
r1mm      24             36         −12
r20mm     26             42         −16
r95p      11             28         −17
r99p      24             41         −17
rx1day    14             35         −21
rx5day    30             33         −3
sdii      2              15         −13
su        51             50         1
tn10p     52             52         0
tn90p     48             43         5
tnn       42             39         3
tnx       30             32         −2
tx10p     52             52         0
tx90p     50             50         0
txn       43             44         −1
txx       41             42         −1
wsdi      40             39         1
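The quantile mapping step shared by most of the downscaling methods in this section can be illustrated with a generic empirical version. The sketch below is not the implementation used in BCSD, BCCI, or BCCAQ. It assumes linear interpolation between sorted calibration values and clamps values outside the calibration range, and the function name `quantile_map` is hypothetical.

```python
import bisect


def quantile_map(value, model_cal, obs_cal):
    """Map a model value onto the observed distribution via empirical quantiles."""
    model_sorted = sorted(model_cal)
    obs_sorted = sorted(obs_cal)

    # Step 1: non-exceedance probability of the value in the model calibration sample.
    if value <= model_sorted[0]:
        q = 0.0  # clamp below the calibration range (assumption)
    elif value >= model_sorted[-1]:
        q = 1.0  # clamp above the calibration range (assumption)
    else:
        hi = bisect.bisect_right(model_sorted, value)
        lo = hi - 1  # model_sorted[lo] <= value < model_sorted[hi]
        frac = (value - model_sorted[lo]) / (model_sorted[hi] - model_sorted[lo])
        q = (lo + frac) / (len(model_sorted) - 1)

    # Step 2: observed value at the same probability (linear interpolation).
    pos = q * (len(obs_sorted) - 1)
    k = int(pos)
    if k >= len(obs_sorted) - 1:
        return obs_sorted[-1]
    return obs_sorted[k] + (pos - k) * (obs_sorted[k + 1] - obs_sorted[k])


# Made-up calibration samples in which the model is 2 units too warm.
model_calibration = [3.0, 5.0, 7.0, 9.0, 11.0]
observed_calibration = [1.0, 3.0, 5.0, 7.0, 9.0]
print(quantile_map(6.0, model_calibration, observed_calibration))
```

Because the mapping forces the corrected values to follow the observed calibration distribution, methods that include such a step would be expected to do well on a distribution test such as the KS test. It does not change the order of values in time, so it cannot by itself improve correlation.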

4.3 Impact of the downscaling approach and reanalyses on hydrologic extremes

The previous section shows how raw reanalyses and observations differ in the Peace River basin and how downscaled reanalyses can differ in their representation of climate extremes when calibrated to one gridded observation versus another. NCEP1 has routinely been used to compare the performance of statistical downscaling methods in terms of climate and hydrologic extremes (e.g. Bürger et al., 2012a; Maurer et al., 2010). We thus continue our comparison of multiple gridded observations, reanalyses, and downscaling techniques for hydrologic extremes. Results are compared for 15 years from 1991 to 2005 (inclusive) for the five sub-basins, except for ERA40 (11 years; 1991–2001). We evaluate methods for their ability to replicate the timing (Pearson's correlation) and distribution (KS test) of the 3-day peak flow, 7-day low flow in summer, and 7-day low flow in winter.

Table 8. Summary of the number of tests passed for Pearson's correlations and similarity in distributions (KS test) based on the Walker field significance test between ClimDEX indices for downscaled reanalyses versus target gridded observation over the Peace River basin for 1991–2005 (1991–2001 for ERA40), by reanalysis versus downscaling method for each gridded observation.

           Pearson's correlation                 KS test
Method     NCEP1  20CR  ERA40  ERAInt  Sub       NCEP1  20CR  ERA40  ERAInt  Sub       Total

VIC Forcings
BCCA       14     14    14     17      59        19     21    24     18      82        141
DBCCA      15     14    15     18      62        20     22    24     18      84        146
BCCI       14     14    16     20      64        20     21    24     22      87        151
CI         13     14    17     22      66        16     14    24     18      72        138
BCSD       4      6     6      12      28        20     20    24     20      84        112
BCSDX      4      5     7      11      27        20     20    24     20      84        111
BCCAQ      15     13    14     19      61        20     21    24     20      85        146
Subtotal   79     80    89     119               135    139   168    136

ANUSPLIN
BCCA       17     11    23     20      71        22     23    24     20      89        160
DBCCA      17     13    23     24      77        21     20    24     25      90        167
BCCI       14     12    18     23      67        21     21    24     23      89        156
CI         15     14    20     24      73        15     19    24     24      82        155
BCSD       5      4     8      11      28        24     21    25     21      91        119
BCSDX      3      3     5      10      21        24     20    25     20      89        110
BCCAQ      9      10    15     17      51        22     24    26     26      98        149
Subtotal   80     67    112    129               149    148   172    159

Irrespective of reanalysis or downscaling method, VIC hydrologic model simulations based on the VIC Forcings gridded observations passed 8 % more tests than those based on the ANUSPLIN gridded observations (Table 9), whereas for the ClimDEX indices ANUSPLIN passed 7 % more tests than VIC Forcings (Table 6). These differences are small, so overall the success of the downscaling methods does not depend strongly on which gridded observational data set is used. However, the greater number of tests passed for hydrologic modelling with the VIC Forcings gridded observations could relate to VIC Forcings being created at the native resolution of the VIC hydrologic model (1/16°), whereas the ANUSPLIN data were created at 1/12° and remapped to 1/16° using bilinear interpolation. Additionally, a larger precipitation bias correction was required during calibration with the ANUSPLIN data than with the VIC Forcings data, suggesting that ANUSPLIN precipitation is less representative than that of VIC Forcings. Across the two statistical tests and three metrics, the only case where ANUSPLIN passed more tests than VIC Forcings was for correlation in summer 7-day low flow (Table 11), especially when driven with NCEP1 and 20CR downscaled via BCCA and DBCCA. Similar results were found for ANUSPLIN with BCCA and DBCCA for the ClimDEX indices (Sect. 4.2). This suggests that there is potential for ClimDEX results to act as a predictor of hydrologic extremes.

[Figure 7 panel labels: downscaling methods bcca, dbcca, bcci, bcsd, bcsdx, bccaq; reanalyses NCEP, 20CR, ERA40, ERAInt; ClimDEX indices cdd, csdi, cwd, dtr, gsl, prcptot, r10mm, r1mm, r20mm, r95p, r99p, rx1day, rx5day, sdii, tn10p, tn90p, tnn, tnx, tx10p, tx90p, txn, txx, wsdi.]

Figure 7. Field-significant correlations based on the Walker field significance test over the Peace River basin between ClimDEX indices for downscaled reanalysis versus target gridded observation, VIC Forcings (left) and ANUSPLIN (right), by the downscaling method for 1991–2005 (1991–2001 ERA40). Dark grey boxes indicate cases in which the null hypothesis is rejected at the 5 % significance level.

[Figure 8 panels: dtr for vicforce and anusplin under NCEP, 20CR, ERA40, and ERAInt; x axis: Year (1950–2000); y axis: Temperature (°C); legend: gridobs, bcca, dbcca, bcci, ci, bcsd, bcsdx, bccaq.]

Figure 8. Time series of average DTR from VIC Forcings (left) and ANUSPLIN (right) for NCEP1 (top), 20CR (second), ERA40 (third), and ERAInt (bottom) downscaled using BCCA, DBCCA, BCCI, CI, BCSD, BCSDX, and BCCAQ over the Peace River basin.
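As a point of reference for the field significance results in Figs. 7 and 9, the following is a minimal Python sketch of Walker's test in its commonly stated form, assuming K independent local tests: the global null hypothesis is rejected when the smallest local p value is at or below 1 − (1 − α)^(1/K). The function names and the example p values are illustrative assumptions and are not taken from the analysis code used in this study.

```python
def walker_critical_p(alpha_global: float, n_tests: int) -> float:
    """Local p-value threshold for Walker's test with n_tests local tests."""
    return 1.0 - (1.0 - alpha_global) ** (1.0 / n_tests)


def field_significant(local_p_values, alpha_global=0.05):
    """Return True if the smallest local p value is at or below the Walker threshold."""
    p_values = list(local_p_values)
    if not p_values:
        raise ValueError("need at least one local p value")
    return min(p_values) <= walker_critical_p(alpha_global, len(p_values))


# Hypothetical local p values, e.g. one per grid cell (not data from the study).
example_p = [0.20, 0.04, 0.0005, 0.61, 0.33]
threshold = walker_critical_p(0.05, len(example_p))
print(f"Walker threshold for K = {len(example_p)}: {threshold:.5f}")
print("field significant:", field_significant(example_p))
```

With K = 5 local tests and a global level of 0.05, the local threshold is roughly 0.0102, so the threshold tightens as the number of grid cells grows. Note that a "pass" means rejection of the null hypothesis for the correlation test (Fig. 7) but non-rejection for the test of equal distributions (Fig. 9).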

When considering results regardless of gridded observation or downscaling technique, the number of tests passed under ERA40 was the highest overall (Table 9). Additionally, the numbers of tests passed for Pearson's correlation and the KS test were both highest for ERA40. The truncated validation period for ERA40, 1990–2001 versus 1990–2005 for other reanalyses, could have avoided some challenging hydrologic extreme events in 2002–2005. However, ERAInt, which was validated over 1990–2005, passed nearly the same number of tests as ERA40. A more likely factor is therefore the shorter calibration period used for ERA40 and ERAInt, which avoids step changes in the gridded observations and reanalyses prior to 1958. Peculiarities with the gridded observations were apparent from 1950 to 1958 for the monthly average minimum and maximum temperatures (Figs. 4 and 5) and for the DTR and SDII ClimDEX indices (Figs. 8 and 10). Avoiding these years could have reduced artefacts in the downscaled products and hydrologic model results. Nevertheless, many studies have demonstrated that ERA40 and ERAInt are superior products to NCEP1 (Donat et al., 2014; Ma et al., 2008, 2009; Sillmann et al., 2013a). In our own analysis, ERA40 and ERAInt have similar timing and magnitude in minimum and maximum temperature and precipitation (Figs. 4, 5, and 6) to the gridded observations, whereas NCEP1 and 20CR do not. These results confirm that downscaling methods are more likely to succeed when applied to reanalyses that have correct timing, magnitude, and trends, such as ERA40 and ERAInt, than when applied to reanalyses such as NCEP1 and 20CR that have irregular step changes (Maraun, 2013). We should be able to assume that although the biases in GCMs will be greater than those found in reanalyses, they are consistent over time. The stronger performance of downscaling methods with ERA40 and ERAInt than with NCEP1 and 20CR was also found with the ClimDEX indices.

[Figure 9 panel labels: downscaling methods, reanalyses, and ClimDEX indices as in Fig. 7.]

Figure 9. Field-significant similarities in distributions based on the Walker field significance test over the Peace River basin between ClimDEX indices for downscaled reanalysis versus target gridded observation, VIC Forcings (left) and ANUSPLIN (right), by the downscaling method for 1991–2005 (1991–2001 ERA40). Dark grey boxes indicate cases in which the null hypothesis is not rejected at the 5 % significance level.

Table 9. Summary of the number of tests passed for Pearson's correlations and similarity in distributions (KS test) based on the Walker field significance test between hydrologic extremes for downscaled reanalyses versus target gridded observation over the Peace basin for 1991–2005 (1991–2001 ERA40), summarized by gridded observation, reanalysis, and downscaling method. Max indicates the maximum possible tests to pass in that category.

                      Pearson's correlation   KS test   Combined
Gridded observation
  VIC Forcings                          309       404        713
  ANUSPLIN                              310       350        660
  Max                                   420       420        840
Reanalyses
  NCEP1                                 135       188        323
  20CR                                  125       181        306
  ERA40                                 180       196        376
  ERAInt                                179       189        368
  Max                                   210       210        420
Downscaling method
  BCCA                                  102        96        198
  DBCCA                                 104       111        215
  BCCI                                  107       111        218
  CI                                     99        87        186
  BCSD                                   49       119        168
  BCSDX                                  48       119        167
  BCCAQ                                 110       111        221
  Max                                   120       120        240
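The comparison between gridded observations can be reproduced directly from the combined counts in Table 9. The short Python sketch below uses only the values printed in that table; the variable names are illustrative.

```python
# Combined tests passed (Pearson's correlation + KS test), copied from Table 9.
combined_passed = {"VIC Forcings": 713, "ANUSPLIN": 660}
max_combined = 840  # maximum possible combined tests per gridded observation

# Fraction of the possible tests that each gridded observation passed.
pass_rate = {name: n / max_combined for name, n in combined_passed.items()}

# Relative difference in tests passed, expressed against the ANUSPLIN count.
relative_gain = (
    combined_passed["VIC Forcings"] - combined_passed["ANUSPLIN"]
) / combined_passed["ANUSPLIN"]

for name, rate in pass_rate.items():
    print(f"{name}: {rate:.1%} of {max_combined} tests passed")
print(f"VIC Forcings passed {relative_gain:.0%} more tests than ANUSPLIN")
```

The relative difference of 53 tests out of 660 rounds to the 8 % quoted for the hydrologic extremes.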

The BCCAQ method was the best overall performer for the three hydrologic extremes. It was the best method according to Pearson's correlation and tied for second place with DBCCA and BCCI, after BCSD and BCSDX, for the KS test. BCSD and BCSDX passed the fewest tests for correlation, while CI passed the fewest for distribution. In the case of ClimDEX, BCCAQ ranked third after BCCA and BCCI. The strength of the BCCAQ method when tested in terms of basin-wide hydrologic modelling and hydrologic extremes, rather than in terms of ClimDEX indices at individual grid cells, comes from the maintenance of daily spatial patterns resulting from the combination of the BCCA and BCCI methods. Event-scale spatial gradients and magnitudes are preserved by reordering the BCCI outputs based on the rank order structure from BCCA. In effect, this removes the overly smooth representation of sub-reanalysis-grid-scale variability from BCCI (Maraun, 2013) and largely corrects remnant biases in magnitude from BCCA (Gutmann et al., 2014). Spatial covariability is much more relevant in hydrologic modelling than in the comparison of climate indices between products on a grid-cell-by-grid-cell basis. This method is also better at maintaining long-term trends, which might explain failed tests in some of the sub-basins when downscaling NCEP1 and 20CR, which, as shown earlier, exhibit inhomogeneities between calibration and validation periods. BCCAQ could be failing for the right reason when the trend in VIC Forcings or ANUSPLIN for a given metric is opposite to that in NCEP1 or 20CR. BCCAQ is the only method to pass the Pearson correlation and KS tests in all five sub-basins when downscaling ERA40 or ERAInt to VIC Forcings or ANUSPLIN for all three hydrologic extremes. BCCAQ has overcome some of the challenges of BCCA that Maurer et al. (2010) would not have been able to find using NCEP1 alone as a surrogate GCM. It is also more successful than the BCCI method, which is analogous to the statistical downscaling and bias correction (SDBC) method in Ahmed et al. (2013) and asynchronous regression (AR) in Gutmann et al. (2014), by avoiding overestimates of extreme events at aggregate scales (Maraun, 2013).

Table 10. Number of basins where the null hypothesis that the downscaled and observed (VIC Forcings and ANUSPLIN) derived 3-day peak flows are not linearly correlated was rejected and the number of basins where the null hypothesis that the downscaled and observed based distributions are drawn from the same sample was not rejected, by downscaling method/reanalysis combinations for 1991–2005 (1991–2001 ERA40).

                 Pearson's correlation                       KS test
Method      NCEP1   20CR  ERA40 ERAInt    Sub     NCEP1   20CR  ERA40 ERAInt    Sub     Total
VIC Forcings
BCCA            2      2      5      5     14         5      5      5      1     16        30
DBCCA           1      3      5      5     14         5      5      5      5     20        34
BCCI            2      5      5      5     17         5      5      5      5     20        37
CI              5      2      5      5     17         5      5      5      5     20        37
BCSD            3      2      4      2     11         5      5      5      5     20        31
BCSDX           3      3      4      2     12         5      5      5      5     20        32
BCCAQ           3      5      5      5     18         5      5      5      5     20        38
Subtotal       19     22     33     29    103        35     35     35     31    136
ANUSPLIN
BCCA            5      0      5      5     15         4      4      4      1     13        28
DBCCA           5      1      5      5     16         5      2      4      5     16        32
BCCI            5      2      4      5     16         5      2      4      5     16        32
CI              4      0      5      5     14         5      2      4      5     16        30
BCSD            2      0      3      3      8         5      5      5      5     20        28
BCSDX           2      0      3      3      8         5      5      4      5     19        27
BCCAQ           5      3      5      5     18         5      2      5      5     17        35
Subtotal       28      6     30     31     95        34     22     30     31    117

[Figure 10 panels: sdii for vicforce and anusplin under NCEP, 20CR, ERA40, and ERAInt; x axis: Year (1950–2000); y axis: mm day^-1; legend: gridobs, bcca, dbcca, bcci, ci, bcsd, bcsdx, bccaq.]

Figure 10. Time series of average SDII from VIC Forcings (left) and ANUSPLIN (right) for NCEP1 (top), 20CR (second), ERA40 (third), and ERAInt (bottom) downscaled using BCCA, DBCCA, BCCI, CI, BCSD, BCSDX, and BCCAQ over the Peace River basin.

The BCSD methods pass the most tests for distribution for all basins and reanalyses, while they fail more tests than any other downscaling method for correlation due to their reliance on random sampling of historical months when temporally disaggregating from the monthly to the daily time step (Table 6). Thus, these methods will get the frequency and magnitude of events correct, but will get the timing of these events wrong. Again, including the minimum and maximum temperatures from the large-scale model (reanalysis) does not improve the number of tests passed with BCSDX versus BCSD. For 3-day peak flow (Table 10; Fig. 11) and 7-day low flow in summer (Table 11; Fig. 12) these methods pass the majority of tests for correlation. Very few tests are passed for correlation in 7-day low flow in winter (Table 12; Fig. 13). Winter low flows are challenging to monitor and to model. Ice on the river can cause the stage–discharge relationships to be incorrect. Also, as mentioned, models are not parametrized or calibrated to best represent base flow. However, BCSD and BCSDX have more trouble than any of the other downscaling methods. Due to the resampling of daily events from the historical gridded observations, precipitation can occur in combination with temperatures warm enough to generate runoff (Fig. 14). This is because of the stochastic resampling of the historical precipitation, but is also related to temperature since runoff is occurring when conditions should be near freezing. Additionally, the random selection of months from the historical record can lead to large discontinuities across month boundaries, such as in December–January (Fig. 14). This is when it is important to get daily events from the GCM or reanalyses (e.g. as in the CI, BCCI, BCCA, DBCCA, and BCCAQ methods). As calibrated, the VIC model is known to have limited performance for low flows, and additional errors were suspected to have been contributed by BCSD in downscaled 20C3M GCM results (Shrestha et al., 2014b). Some sharp spikes on the rising limb of the hydrograph suggest rain-on-snow events in the downscaling-driven results that do not appear in the runs based on gridded observations.

The CI method is the closest to the delta method that we have investigated. The median and ranges for CI are much lower for winter 7-day low flow (not shown). The poorer performance of the CI method for the KS test is due to the lack of quantile mapping bias correction in this method.
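The contrast drawn above for BCSD and BCSDX, with distributions preserved but timing lost, can be illustrated with a deliberately simplified Python sketch. It is not an implementation of BCSD: it only shuffles a hypothetical series of annual extremes, standing in for random resampling of historical months, and compares the result with a series that keeps the observed sequence. The helper functions, the synthetic data, and the noise levels are all assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    d = 0.0
    for v in sorted(set(x) | set(y)):
        cdf_x = sum(1 for a in x if a <= v) / len(x)
        cdf_y = sum(1 for b in y if b <= v) / len(y)
        d = max(d, abs(cdf_x - cdf_y))
    return d


rng = random.Random(42)

# Hypothetical "observed" annual extremes (arbitrary units) for 15 years.
observed = [rng.gauss(100.0, 20.0) for _ in range(15)]

# Resampled series: the same values assigned to randomly chosen years.
resampled = observed[:]
rng.shuffle(resampled)

# Sequenced series: observed values plus small noise, with the order preserved.
sequenced = [v + rng.gauss(0.0, 5.0) for v in observed]

for label, series in (("resampled", resampled), ("sequenced", sequenced)):
    r = pearson_r(observed, series)
    d = ks_statistic(observed, series)
    print(f"{label}: Pearson r = {r:.2f}, KS statistic = {d:.2f}")
```

Because the shuffled series contains exactly the observed values, its KS statistic against the observations is zero by construction, while its correlation with the observed sequence depends only on chance. A method that retains daily sequencing from the driving model can score well on both measures.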

[Figure 11 panels: BCGMS NCEP1, BCGMS 20CR, BCGMS ERA40, BCGMS ERAInt; boxplot categories: base, bcca, dbcca, bcci, bcsd, bcsdx, bccaq; axes: Year (1992–2004) and discharge (m3 s^-1).]

Figure 11. Boxplots, time series, and distributions of 3-day peak flow in the spring months (May–July) for NCEP1, 20CR, ERA40, and ERAInt in the BCGMS basin based on VIC Forcings (top) and ANUSPLIN (bottom). Legend same as Fig. 9.
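For readers who want to compute comparable indices, the sketch below shows one common way to derive an n-day peak flow and an n-day low flow from a daily discharge series. It assumes that 3-day peak flow is the seasonal maximum of a 3-day running mean and that 7-day low flow is the seasonal minimum of a 7-day running mean; this definition, the function names, and the example discharge values are assumptions for illustration.

```python
def rolling_means(values, window):
    """Means of every run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    # Slide the window forward one day at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def peak_flow(daily_flow, window=3):
    """Largest `window`-day mean flow in the series."""
    return max(rolling_means(daily_flow, window))


def low_flow(daily_flow, window=7):
    """Smallest `window`-day mean flow in the series."""
    return min(rolling_means(daily_flow, window))


# Hypothetical daily discharge (m3/s) for a short season.
season = [200, 220, 260, 400, 900, 1500, 1200, 800, 500, 350, 300, 280]
print("3-day peak flow:", peak_flow(season))
print("7-day low flow:", round(low_flow(season), 1))
```

In practice the daily series would first be subset to the season of interest (for example May–July for the peak flows in Fig. 11) and the index computed once per year.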

Table 11. As in Table 10 but for summer 7-day low flow.

                 Pearson's correlation                       KS test
Method      NCEP1   20CR  ERA40 ERAInt    Sub     NCEP1   20CR  ERA40 ERAInt    Sub     Total
VIC Forcings
BCCA            3      2      5      5     15         5      5      5      5     20        35
DBCCA           3      2      5      5     15         5      5      5      5     20        35
BCCI            3      4      5      5     17         5      5      5      5     20        37
CI              2      4      5      5     16         5      5      5      5     20        36
BCSD            2      3      3      4     12         5      5      5      5     20        32
BCSDX           2      2      3      4     11         5      5      5      5     20        31
BCCAQ           4      3      5      5     17         5      5      5      5     20        37
Subtotal       19     20     31     33    103        35     35     35     35    140
ANUSPLIN
BCCA            5      4      5      5     19         5      5      5      1     16        35
DBCCA           5      5      5      5     20         5      5      5      5     20        40
BCCI            3      5      5      5     18         5      5      5      5     20        38
CI              1      5      5      5     16         5      5      5      5     20        36
BCSD            1      2      4      5     12         5      5      5      5     20        32
BCSDX           1      2      4      5     12         5      5      5      5     20        32
BCCAQ           3      5      5      5     18         5      5      5      5     20        38
Subtotal       19     28     33     35    115        35     35     35     31    136

[Figure 12 panels: BCGMS NCEP1, BCGMS 20CR, BCGMS ERA40, BCGMS ERAInt; x axis: Year (1992–2004).]

Figure 12. Time series of 7-day low flow in the summer months (July–September) for NCEP1, 20CR, ERA40, and ERAInt in the BCGMS basin based on VIC Forcings (top) and ANUSPLIN (bottom). Legend same as Fig. 9.

5 Conclusions

We have tested the applicability of seven techniques for downscaling coarse-scale climate models in terms of ClimDEX indices and hydrologic extremes. The seven approaches investigated include several methods commonly used in hydrologic modelling. Some of these had been explored before (i.e. BCSD and BCCA), but not using multiple reanalyses. Choice of reanalysis was found to affect the number of tests passed for a given downscaling technique. Downscaling methods were more successful under ERA40 or ERAInt than they were under NCEP1 or 20CR.

The quality of reanalyses and gridded observations changed over the calibration period due to changes in the availability of satellite/radiosonde data and station observations. NCEP1, the reanalysis used as a surrogate GCM in many previous downscaling intercomparisons, had an obviously erroneous step change in temperature over the Peace River basin. Between the calibration and the validation period, changes in ClimDEX indices were greater for precipitation with VIC Forcings but greater for temperature with ANUSPLIN. Thus, trends in ClimDEX indices differed between these gridded observations. ANUSPLIN passed 5 % more tests than VIC Forcings, mostly for precipitation-related ClimDEX indices. Through this work we learned a great deal about these gridded observations and identified evaluation procedures that will be useful for future studies.

Table 12. As in Table 10 but for winter 7-day low flow.

                 Pearson's correlation                       KS test
Method      NCEP1   20CR  ERA40 ERAInt    Sub     NCEP1   20CR  ERA40 ERAInt    Sub     Total
VIC Forcings
BCCA            5      5      5      5     20         5      5      5      5     20        40
DBCCA           5      5      5      5     20         5      5      5      5     20        40
BCCI            5      5      5      5     20         5      5      5      5     20        40
CI              5      5      4      5     19         4      2      2      0      8        27
BCSD            0      0      2      0      2         5      5      5      5     20        22
BCSDX           0      0      2      0      2         5      5      5      5     20        22
BCCAQ           5      5      5      5     20         5      5      5      5     20        40
Subtotal       25     25     28     25    103        34     32     32     30    128
ANUSPLIN
BCCA            5      5      4      5     19         1      3      4      3     11        30
DBCCA           5      5      4      5     19         2      3      5      5     15        34
BCCI            5      4      5      5     19         2      3      5      5     15        34
CI              5      5      3      4     17         0      0      0      3      3        20
BCSD            0      0      2      2      4         4      5      5      5     19        23
BCSDX           0      1      2      0      3         5      5      5      5     20        23
BCCAQ           5      4      5      5     19         1      3      5      5     14        33
Subtotal       25     24     25     26    100        15     22     29     31     97

[Figure 13 panels: BCGMS NCEP1, BCGMS 20CR, BCGMS ERA40, BCGMS ERAInt; x axis: Year (1992–2004).]

Figure 13. Time series of 7-day low flow in the winter months (November–April) for NCEP1, 20CR, ERA40, and ERAInt in the BCGMS basin based on VIC Forcings (top) and ANUSPLIN (bottom). Legend same as Fig. 9.

[Figure 14 panels: INGEN and BCGMS; x axis: 1991/01/01 to 1992/01/01; y axis: Discharge (m3 s^-1); legend: base, bcca, dbcca, bcci, ci, bcsd, bcsdx, bccaq.]

Figure 14. Time series of daily streamflow in the BCGMS basin as driven by ANUSPLIN (base) and ERA40 downscaled to ANUSPLIN with the BCCA, DBCCA, BCCI, CI, BCSD, BCSDX, and BCCAQ methods over 1991–2005.

BCSDX, DBCCA, and BCCAQ downscaling methods had not been evaluated in terms of ClimDEX indices and hydrologic extremes before now. The BCSDX method included minimum and maximum temperature from the reanalyses instead of the mean, as is done in BCSD, but this did not improve its ability to resolve temperature indices, such as diurnal temperature range, or hydrologic extremes. DBCCA was an improvement over BCCA and passed the greatest number of tests for the ClimDEX indices. The double bias correction proved capable of reducing some of the drizzle and remnant bias in precipitation amounts found in BCCA. The BCCAQ method, which combines BCCA and BCCI, performed well in terms of the number of tests passed for the ClimDEX indices, but it performed best when used to model hydrologic extremes, where it exceeded all other methods. BCCAQ provides a more accurate representation of event-scale spatial gradients, removing the overly smooth representation of sub-reanalysis-grid-scale variability inherited from BCCI and correcting biases from BCCA. These attributes are important for simulating the basin-scale climate events that drive runoff. All methods passed correlation and distribution tests for 3-day peak flow and 7-day low flow in summer for the majority of sub-basins and reanalyses. BCSD and BCSDX failed all or most correlation tests and CI failed all or most distribution tests for 7-day low flow in winter. Based on results from this study, use of a daily downscaling method, such as BCCAQ, in conjunction with a rigorously constructed and validated observational data set, is recommended to supplement the existing hydrologic modelling efforts at PCIC and improve projections of hydrologic extremes.
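The reordering step attributed to BCCAQ, in which BCCI outputs are rearranged to follow the rank order structure of BCCA, can be sketched in a few lines of Python. This is a minimal illustration of rank-based reordering at a single grid cell, not the BCCAQ implementation; the grouping of days (for example by month), the function name, and the example values are assumptions.

```python
def reorder_by_rank(values, template):
    """Rearrange `values` so that their rank order matches that of `template`.

    The output contains exactly the numbers in `values`; the day holding the
    k-th smallest `template` value receives the k-th smallest of `values`.
    """
    if len(values) != len(template):
        raise ValueError("values and template must have the same length")
    sorted_values = sorted(values)
    # Day indices ordered by template value; ties keep their original order.
    order = sorted(range(len(template)), key=lambda i: template[i])
    result = [0.0] * len(values)
    for rank, day in enumerate(order):
        result[day] = sorted_values[rank]
    return result


# Hypothetical daily precipitation (mm) at one grid cell.
bcca_like = [0.0, 5.2, 1.1, 12.4, 0.3]  # supplies the day-to-day sequencing
bcci_like = [2.0, 0.5, 14.0, 0.0, 6.5]  # supplies the bias-corrected magnitudes
print(reorder_by_rank(bcci_like, bcca_like))
```

In this example the wettest day in the BCCA-like series (the fourth) receives the largest BCCI-like value, so the magnitudes come from one product and the sequencing from the other. Applying the same reordering at every grid cell is what allows the spatial pattern of each daily event to follow the analogue-based product.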

We can build on this work to develop tools that predict changes to hydrologic extremes from changes in climate extremes without the direct application of a hydrologic model. Similar emulations have been made by drawing on the relationship between GCMs and hydrologic model projections (Schnorbus and Cannon, 2014) and by identifying relationships between GCMs and RCMs (Li et al., 2011). The next step is to identify which of the 26 ClimDEX indices are predictors of 3-day peak ow and 7-day low ow and avoid those downscaling methods that simulate them poorly.

Acknowledgements. We are grateful to three anonymous reviewers for providing valuable feedback. We thank David Bronaugh for developing and making available the climdex.pcic R package that expedited this work. Belaid Moa's assistance with the Compute Canada/WestGrid/University Systems is also appreciated. Hailey Ekstrand provided useful assistance with hypsometric information for the five sub-basins of the Peace River basin upstream of Taylor. Thoughtful and thorough reviews were provided by Markus Schnorbus and Francis Zwiers of PCIC that greatly improved this work. We appreciate the easy access to reanalyses provided by http://reanalysis.org and thank the centres that contributed NCEP1, 20CR, ERA40, and ERAInt. We also thank Dan McKenney of Natural Resources Canada for sharing the ANUSPLIN gridded observational product for Canada and Katrina Bennett for constructing the VIC Forcings gridded observational product for British Columbia.

Edited by: R. Uijlenhoet

References

Abatzoglou, J. T. and Brown, T. J.: A comparison of statistical downscaling methods suited for wildfire applications, Int. J. Climatol., 32, 772–780, 2012.

Ahmed, K. F., Wang, G., Silander, J., Wilson, A. M., Allen, J. M., Horton, R., and Anyah, R.: Statistical downscaling and bias correction of climate model outputs for climate change impact assessment in the U.S. northeast, Global Planet. Change, 100, 320–332, 2013.

Benestad, B. E., Hanssen-Bauer, I., and Chen, D.: Chapter 8: Reducing Uncertainties, in: Empirical-Statistical Downscaling, World Scientific, Singapore, 2008.

Bennett, K. E., Werner, A. T., and Schnorbus, M.: Uncertainties in Hydrologic and Climate Change Impact Analyses in Headwater Basins of British Columbia, J. Climate, 25, 5711–5730, 2012.

Bürger, G., Schulla, J., and Werner, A. T.: Estimates of future flow, including extremes, of the Columbia River headwaters, Water Resour. Res., 47, W10520, doi:10.1029/2010WR009716, 2011.

Bürger, G., Murdock, T. Q., Werner, A. T., Sobie, S. R., and Cannon, A. J.: Downscaling extremes – an intercomparison of multiple statistical methods for present climate, J. Climate, 25, 4366–4388, 2012a.

Bürger, G., Murdock, T. Q., Werner, A. T., Sobie, S. R., and Cannon, A. J.: Downscaling extremes – an intercomparison of multiple methods for future climate, J. Climate, 26, 3429–3449, 2012b.

Cannon, A. J., Sobie, S. R., and Murdock, T. Q.: Bias Correction of GCM Precipitation by Quantile Mapping: How Well Do Methods Preserve Changes in Quantiles and Extremes?, J. Climate, 28, 6938–6959, 2015.

Clavet-Gaumont, J., Sushama, L., Khaliq, M. N., Huziy, O., and Roy, R.: Canadian RCM projected changes to high flows for Québec watersheds using regional frequency analysis, Int. J. Climatol., 33, 2940–2955, 2013.

Compo, G. P., Whitaker, J. S., Sardeshmukh, P. D., Matsui, N., Allan, R. J., Yin, X., Gleason, B. E., Vose, R. S., Rutledge, G., Bessemoulin, P., Brönnimann, S., Brunet, M., Crouthamel, R. I., Grant, A. N., Groisman, P. Y., Jones, P. D., Kruk, M. C., Kruger, A. C., Marshall, G. J., Maugeri, M., Mok, H. Y., Nordli, Ø., Ross, T. F., Trigo, R. M., Wang, X. L., Woodruff, S. D., and Worley, S. J.: The Twentieth Century Reanalysis Project, Q. J. Roy. Meteor. Soc., 137, 1–28, 2011.

Cunderlik, J. M. and Ouarda, T. B. M. J.: Trends in the timing and magnitude of floods in Canada, J. Hydrol., 375, 471–480, 2009.


Cunderlik, J. M. and Simonovic, S. P.: Inverse flood risk modelling under changing climatic conditions, Hydrol. Process., 21, 563–577, 2007.

Cunderlik, J. M., Ouarda, T. B. M. J., and Bobée, B.: On the objective identification of flood seasons, Water Resour. Res., 40, W01520, doi:10.1029/2003WR002295, 2004.

Daly, C., Neilson, R. P., and Phillips, D. L.: A statistical-topographic model for mapping climatological precipitation over mountainous terrain, J. Appl. Meteorol., 33, 140–158, 1994.

Dee, D. P., Uppala, S. M., Simmons, A. J., et al.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteorol. Soc., 137, 553–597, 2011.

Demarchi, D. A.: An introduction to the ecoregions of British Columbia, Ecosystem Information Section, Knowledge Management Branch, Ministry of Environment, Victoria, British Columbia, Canada, 1996.

Donat, M. G., Sillmann, J., Wild, S., Alexander, L. V., Lippmann, T., and Zwiers, F. W.: Consistency of Temperature and Precipitation Extremes across Various Global Gridded In Situ and Reanalysis Datasets, J. Climate, 27, 5019–5035, 2014.

Elsner, M. M., Cuo, L., Voisin, N., Deems, J. S., Hamlet, A. F., Vano, J. A., Mickelson, K. E. B., Lee, S.-Y., and Lettenmaier, D. P.: Implications of 21st century climate change for the hydrology of Washington State, Climatic Change, 102, 225–260, 2010a.

Eum, H.-I., Dibike, Y., Prowse, T., and Bonsal, B.: Inter-comparison of high-resolution gridded climate data sets and their implication on hydrological model simulation over the Athabasca Watershed, Canada, Hydrol. Process., 28, 4250–4271, 2014.

Gleckler, P. J., Taylor, K. E., and Doutriaux, C.: Performance metrics for climate models, J. Geophys. Res., 113, D06104, doi:10.1029/2007JD008972, 2008.

Gutmann, E., Pruitt, T., Clark, M. P., Brekke, L., Arnold, J. R., Raff, D. A., and Rasmussen, R. M.: An intercomparison of statistical downscaling methods used for water resource assessments in the United States, Water Resour. Res., 50, 7167–7186, 2014.

Hamlet, A. F. and Lettenmaier, D. P.: Production of Temporally Consistent Gridded Precipitation and Temperature Fields for the Continental United States, J. Hydrometeorol., 6, 330–336, 2005.

Hamlet, A. F. and Lettenmaier, D. P.: Effects of 20th century warming and climate variability on flood risk in the western U.S., Water Resour. Res., 43, W06427, doi:10.1029/2006WR005099, 2007.

Hidalgo, H. G., Dettinger, M. D., and Cayan, D. R.: Downscaling with constructed analogues: daily precipitation and temperature fields over the United States, California Energy Commission, PIER Energy Related Environmental Research, CEC-500-2007-123, 2008.

Hofer, M., Marzeion, B., and Mölg, T.: Comparing the skill of different reanalyses and their ensembles as predictors for daily air temperature on a glaciated mountain (Peru), Clim. Dynam., 39, 1969–1980, 2012.

Hopkinson, R. F., McKenney, D. W., Milewska, E. J., Hutchinson, M. F., Papadopol, P., and Vincent, L. A.: Impact of Aligning Climatological Day on Gridding Daily Maximum–Minimum Temperature and Precipitation over Canada, J. Appl. Meteorol. Clim., 50, 1654–1665, 2011.

Huang, S., Krysanova, V., and Hattermann, F. F.: Does bias correction increase reliability of flood projections under climate change? A case study of large rivers in Germany, Int. J. Climatol., 34, 3780–3800, 2014.

Hunter, R. D. and Meentemeyer, R. K.: Climatologically Aided Mapping of Daily Precipitation and Temperature, J. Appl. Meteorol., 44, 1501–1510, 2005.

Hutchinson, M. F., McKenney, D. W., Lawrence, K., Pedlar, J. H., Hopkinson, R. F., Milewska, E., and Papadopol, P.: Development and Testing of Canada-Wide Interpolated Spatial Models of Daily Minimum–Maximum Temperature and Precipitation for 1961–2003, J. Appl. Meteorol. Clim., 48, 725–741, 2009.

Huth, R.: Sensitivity of Local Daily Temperature Change Estimates to the Selection of Downscaling Models and Predictors, J. Climate, 17, 640–652, 2004.

IPCC: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Stocker, T. F., Qin, D., Plattner, G.-K., Tignor, M., Allen, S. K., Boschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P. M., Cambridge University Press, Cambridge, UK and New York, NY, USA, 1535 pp., 2013.

Janssen, A.: Two-sample goodness-of-fit tests when ties are present, J. Stat. Plan. Infer., 39, 399–424, 1994.

Joshi, D., St-Hilaire, A., Daigle, A., and Ouarda, T. B. M. J.: Databased comparison of Sparse Bayesian Learning and Multiple Linear Regression for statistical downscaling of low flow indices, J. Hydrol., 488, 136–149, 2013.

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Jenne, R., and Joseph, D.: The NCEP/NCAR 40-Year Reanalysis Project, B. Am. Meteorol. Soc., 77, 437–471, 1996.

Karl, T. R., Nicholls, N., and Ghazi, A.: CLIVAR/GCOS/WMO workshop on indices and indicators for climate extremes: Workshop summary, Climatic Change, 42, 3–7, 1999.

Knutti, R., Allen, M. R., Friedlingstein, P., Gregory, J. M., Hegerl, G. C., Meehl, G. A., Meinshausen, M., Murphy, J. M., Plattner, G.-K., Raper, S. C. B., Stocker, T. F., Stott, P. A., Teng, H., and Wigley, T. M. L.: A Review of Uncertainties in Global Temperature Projections over the Twenty-First Century, J. Climate, 21, 2651–2663, 2008.

Leavesley, G. H.: Modeling the effects of climate change on water resources – a review, Climatic Change, 28, 159–177, 1994.

Li, G., Zhang, X., Zwiers, F., and Wen, Q. H.: Quantification of Uncertainty in High-Resolution Temperature Scenarios for North America, J. Climate, 25, 3373–3389, 2011.

Liang, X., Lettenmaier, D. P., Wood, E. F., and Burges, S. J.: A simple hydrologically based model of land surface water and energy fluxes for general circulation models, J. Geophys. Res., 99, 14415–14428, 1994.

Liang, X., Wood, E. F., and Lettenmaier, D. P.: Surface soil moisture parameterization of the VIC-2L model: Evaluation and modification, Global Planet. Change, 13, 195–206, 1996.


Livneh, B., Rosenberg, E. A., Lin, C., Nijssen, B., Mishra, V., Andreadis, K. M., Maurer, E. P., and Lettenmaier, D. P.: A Long-Term Hydrologically Based Dataset of Land Surface Fluxes and States for the Conterminous United States: Update and Extensions , J. Climate, 26, 93849392, 2013.

Lohmann, D., Nolte-Holube, R., and Raschke, E.: A large-scale horizontal routing model to be coupled to land surface parametrization schemes, Tellus A, 48, 708721, 1996.

Ma, L., Zhang, T., Li, Q., Frauenfeld, O. W., and Qin, D.: Evaluation of ERA-40, NCEP-1, and NCEP-2 reanalysis air temperatures with ground-based measurements in China, J. Geophys.Res., 113, D15115, doi:http://dx.doi.org/10.1029/2007JD009549

Web End =10.1029/2007JD009549 http://dx.doi.org/10.1029/2007JD009549

Web End = , 2008.

Ma, L., Zhang, T., Frauenfeld, O. W., Ye, B., Yang, D., and Qin,D.: Evaluation of precipitation from the ERA-40, NCEP-1, and NCEP-2 Reanalyses and CMAP-1, CMAP-2, and GPCP-2 with ground-based measurements in China, J. Geophys. Res., 114, D09105, doi:http://dx.doi.org/10.1029/2008JD011178

Web End =10.1029/2008JD011178 http://dx.doi.org/10.1029/2008JD011178

Web End = , 2009.

Maraun, D.: Nonstationarities of regional climate model biases in European seasonal mean temperature and precipitation sums, Geophys. Res. Lett., 39, L06706, doi:http://dx.doi.org/10.1029/2012GL051210

Web End =10.1029/2012GL051210 http://dx.doi.org/10.1029/2012GL051210

Web End = , 2012.

Maraun, D.: Bias Correction, Quantile Mapping, and Downscaling: Revisiting the Inflation Issue, J. Climate, 26, 2137-2143, 2013.

Maurer, E. P. and Hidalgo, H. G.: Utility of daily vs. monthly large-scale climate data: an intercomparison of two statistical downscaling methods, Hydrol. Earth Syst. Sci., 12, 551-563, doi:10.5194/hess-12-551-2008, 2008.

Maurer, E. P., Wood, A. W., Adam, J. C., Lettenmaier, D. P., and Nijssen, B.: A Long-Term Hydrologically Based Dataset of Land Surface Fluxes and States for the Conterminous United States, J. Climate, 15, 3237-3251, 2002.

Maurer, E. P., Hidalgo, H. G., Das, T., Dettinger, M. D., and Cayan, D. R.: The utility of daily large-scale climate data in the assessment of climate change impacts on daily streamflow in California, Hydrol. Earth Syst. Sci., 14, 1125-1138, doi:10.5194/hess-14-1125-2010, 2010.

Maurer, E. P., Das, T., and Cayan, D. R.: Errors in climate model daily precipitation and temperature output: time invariance and implications for bias correction, Hydrol. Earth Syst. Sci., 17, 2147-2159, doi:10.5194/hess-17-2147-2013, 2013.

McKenney, D. W., Hutchinson, M. F., Papadopol, P., Lawrence, K., Pedlar, J., Campbell, K., Milewska, E., Hopkinson, R. F., Price, D., and Owen, T.: Customized Spatial Climate Models for North America, B. Am. Meteorol. Soc., 92, 1611-1622, 2011.

Monk, W. A., Peters, D. L., Allen Curry, R., and Baird, D. J.: Quantifying trends in indicator hydroecological variables for regime-based groups of Canadian rivers, Hydrol. Process., 25, 3086-3100, 2011.

Murdock, T. Q., Cannon, A. J., and Sobie, S. R.: Statistical downscaling of future climate projections for North America, Report on Contract No: KM040-131148/A, Prepared for Environment Canada, Pacific Climate Impacts Consortium, Victoria, BC, Canada, 2014.

Najafi, M. R., Moradkhani, H., and Jung, I. W.: Assessing the uncertainties of hydrologic model selection in climate change impact studies, Hydrol. Process., 25, 2814-2826, 2011.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I - A discussion of principles, J. Hydrol., 10, 282-290, 1970.

Nijssen, B., Schnur, R., and Lettenmaier, D. P.: Global retrospective estimation of soil moisture using the variable infiltration capacity land surface model, 1980-93, J. Climate, 14, 1790-1808, 2001.

Ouarda, T. B. M. J., Cunderlik, J. M., St-Hilaire, A., Barbet, M., Bruneau, P., and Bobée, B.: Data-based comparison of seasonality-based regional flood frequency methods, J. Hydrol., 330, 329-339, 2006.

Peterson, T. C.: Climate Change Indices, WMO Bulletin, 54, 83-86, 2005.

Pierce, D. W., Cayan, D. R., Das, T., Maurer, E. P., Miller, N. L., Bao, Y., Kanamitsu, M., Yoshimura, K., Snyder, M. A., Sloan, L. C., Franco, G., and Tyree, M.: The Key Role of Heavy Precipitation Events in Climate Model Disagreements of Future Annual Precipitation Changes in California, J. Climate, 26, 5879-5896, 2013.

Prudhomme, C. and Davies, H.: Assessing uncertainties in climate change impact analyses on the river flow regimes in the UK. Part 2: future climate, Climatic Change, 93, 197-222, 2008.

Richter, B. D., Baumgartner, J. V., Powell, J., and Braun, D. P.: A Method for Assessing Hydrologic Alteration within Ecosystems, Conserv. Biol., 10, 1163-1174, 1996.

Rodenhuis, D., Bennett, K., Werner, A., Murdock, T. Q., and Bronaugh, D.: Hydro-climatology and Future Climate Impacts in British Columbia, revised 2009, Pacific Climate Impacts Consortium, University of Victoria, Victoria, BC, Canada, 2009.

Salathé, E. P.: Downscaling simulations of future global climate with application to hydrologic modelling, Int. J. Climatol., 25, 419-436, 2005.

Salathé, E. P., Mote, P. W., and Wiley, M. W.: Review of scenario selection and downscaling methods for the assessment of climate change impacts on hydrology in the United States Pacific Northwest, Int. J. Climatol., 27, 1611-1621, 2007.

Schnorbus, M., Werner, A., and Bennett, K.: Impacts of climate change in three hydrologic regimes in British Columbia, Canada, Hydrol. Process., 28, 1170-1189, 2014.

Schnorbus, M. A. and Cannon, A. J.: Statistical emulation of streamflow projections from a distributed hydrological model: Application to CMIP3 and CMIP5 climate projections for British Columbia, Canada, Water Resour. Res., 50, 8907-8926, 2014.

Sheffield, J., Wood, E. F., and Roderick, M. L.: Little change in global drought over the past 60 years, Nature, 491, 435-438, 2012.

Shepard, D. S.: Computer Mapping: The SYMAP Interpolation Algorithm, in: Spatial Statistics and Models, edited by: Gaile, G. L. and Willmott, C. J., Springer Netherlands, series: Theory and Decision Library, 40, 133-145, 1984.

Sherwood, S. and Fu, Q.: A Drier Future?, Science, 343, 737-739, 2014.

Shrestha, R. R., Schnorbus, M. A., Werner, A. T., and Berland, A. J.: Modelling spatial and temporal variability of hydrologic impacts of climate change in the Fraser River basin, British Columbia, Canada, Hydrol. Process., 26, 1840-1860, 2012.

Shrestha, R. R., Schnorbus, M. A., Werner, A. T., and Zwiers, F. W.: Evaluating Hydroclimatic Change Signals from Statistically and Dynamically Downscaled GCMs and Hydrologic Models, J. Hydrometeorol., 15, 844-860, 2014a.

Shrestha, R. R., Peters, D. L., and Schnorbus, M. A.: Evaluating the ability of a hydrologic model to replicate hydro-ecologically relevant indicators, Hydrol. Process., 28, 4294-4310, 2014b.

Sillmann, J., Kharin, V. V., Zhang, X., Zwiers, F. W., and Bronaugh, D.: Climate extremes indices in the CMIP5 multimodel ensemble: Part 1. Model evaluation in the present climate, J. Geophys. Res.-Atmos., 118, 1716-1733, 2013a.

Sillmann, J., Kharin, V. V., Zwiers, F. W., Zhang, X., and Bronaugh, D.: Climate extremes indices in the CMIP5 multimodel ensemble: Part 2. Future climate projections, J. Geophys. Res.-Atmos., 118, 2473-2493, 2013b.

Stahl, K., Hisdal, H., Hannaford, J., Tallaksen, L. M., van Lanen, H. A. J., Sauquet, E., Demuth, S., Fendekova, M., and Jódar, J.: Streamflow trends in Europe: evidence from a dataset of near-natural catchments, Hydrol. Earth Syst. Sci., 14, 2367-2382, doi:10.5194/hess-14-2367-2010, 2010.

Stahl, K., Tallaksen, L. M., Hannaford, J., and van Lanen, H. A. J.: Filling the white space on maps of European runoff trends: estimates from a multi-model ensemble, Hydrol. Earth Syst. Sci., 16, 2035-2047, doi:10.5194/hess-16-2035-2012, 2012.

Storch, H. V.: A Remark on Chervin-Schneider's Algorithm to Test Significance of Climate Experiments with GCMs, J. Atmos. Sci., 39, 187-189, 1982.

Themeßl, M. J., Gobiet, A., and Heinrich, G.: Empirical-statistical downscaling and error correction of regional climate models and its impact on the climate change signal, Climatic Change, 112, 449-468, 2011.

Thrasher, B., Maurer, E. P., McKellar, C., and Duffy, P. B.: Technical Note: Bias correcting climate model simulated daily temperature extremes with quantile mapping, Hydrol. Earth Syst. Sci., 16, 3309-3314, doi:10.5194/hess-16-3309-2012, 2012.

Uppala, S. M., Kållberg, P. W., Simmons, A. J., et al.: The ERA-40 re-analysis, Q. J. Roy. Meteor. Soc., 131, 2961-3012, 2005.

von Storch, H. and Zwiers, F. W.: Statistical analysis in climate research, Cambridge University Press, Cambridge, UK, 1999.

Wang, T., Hamann, A., Spittlehouse, D. L., and Aitken, S. N.: Development of scale-free climate data for Western Canada for use in resource management, Int. J. Climatol., 26, 383-397, 2006.

Werner, A. T.: BCSD Downscaled Transient Climate Projections for Eight Select GCMs over British Columbia, Canada, Pacific Climate Impacts Consortium, University of Victoria, Victoria, BC, Canada, 2011.

Werner, A. T., Schnorbus, M. A., Shrestha, R. R., and Eckstrand, H. D.: Spatial and Temporal Change in the Hydro-Climatology of the Canadian Portion of the Columbia River Basin under Multiple Emissions Scenarios, Atmos. Ocean, 51, 357-379, 2013.

Werner, A. T., Nienaber, P., Schnorbus, M. A., and Bronaugh, D.: A Cross Validation of the VIC Forcings Gridded-Observations for British Columbia, Victoria, Pacific Climate Impacts Consortium, University of Victoria, BC, Canada, 2015.

Wilks, D. S.: On Field Significance and the False Discovery Rate, J. Appl. Meteorol. Clim., 45, 1181-1189, 2006.

Wood, A. W., Maurer, E. P., Kumar, A., and Lettenmaier, D. P.: Long-range experimental hydrologic forecasting for the eastern United States, J. Geophys. Res., 107, 4429, doi:10.1029/2001JD000659, 2002.

Wood, A. W., Leung, L. R., Sridhar, V., and Lettenmaier, D. P.: Hydrologic Implications of Dynamical and Statistical Approaches to Downscaling Climate Model Outputs, Climatic Change, 62, 189-216, 2004.

Zhang, X., Alexander, L., Hegerl, G. C., Jones, P., Tank, A. K., Peterson, T. C., Trewin, B., and Zwiers, F. W.: Indices for monitoring changes in extremes based on daily temperature and precipitation data, WIREs Clim. Change, 2, 851-870, 2011.

Details

Title: Hydrologic extremes - an intercomparison of multiple gridded statistical downscaling methods
Author: Werner, Arelia T; Cannon, Alex J
Pages: 1483-1508
Publication year: 2016
Publisher: Copernicus GmbH
ISSN: 1027-5606
e-ISSN: 1607-7938
Source type: Scholarly Journal
Language of publication: English
DOI: https://doi.org/10.5194/hess-20-1483-2016
ProQuest document ID: 1781742668
