Full text

Turn on search term navigation

Introduction

Hydrological process modelling is an important tool for research on water resource management, flood control and disaster mitigation, water conservancy project planning and design, hydrological response to climate change, and so on (Zanon et al., 2010; Papathanasiou et al., 2015). The initial hydrological model was a black-box model in 1932 (Sherman, 1932) and conceptual and physically based models were subsequently put forward in 1960s (Freeze and Harlan, 1969). The three kinds of hydrological models have been significantly improved in recent years, with their structures becoming more mature. Theoretically, the physically based model has a definite physical mechanism of the water cycle and all parameters can be measured in situ (Abbott et al., 1986; Huang et al., 2014). Conceptual models express hydrological processes in the form of some abstract models which come from some physical phenomenon and experience. For example, the interflow and the base flow are simplified as the flow from linear reservoirs (Caviedes-Voullième et al., 2012; Lü et al., 2013). As a result, some parameters of conceptual models need calibrating. In general, conceptual models have better performance in modelling the streamflow at the catchment outlet than physically based distributed models do, especially for catchments lacking sufficient data (Bao et al., 2010; Cullmann et al., 2011). Thus, many conceptual models such as HBV model, TOPMODEL, Tank model and Xinanjiang model are of strong vitality (Abebe et al., 2010; Vincendon et al., 2010; Hao et al., 2015; Xie et al., 2015). Additionally, the performance of physically based distributed models can be improved after calibration of some parameters (Chen et al., 2016). Therefore, all of the hydrological models should be calibrated before engineering applications.

There are two kinds of calibration methods for hydrological models, the trial–error method and auto-calibration method. The trial–error method depends on plenty of trials for reducing the error of the objective. However, it is difficult to obtain an exact optimal solution due to limited enumeration (Boyle et al., 2000). The auto-calibration method is based on stochastic or mathematical calculations and thus more widely applied in the non-linear parameter optimization. Compared with the trial–error method, it is more efficient and effective, avoiding the interference of anthropogenic factors (Madsen, 2000; Getirana, 2010). The initial automatic optimization methods, such as the Rosenbrock method (Rosenbrock, 1960) and the simplex method (Nelder and Mead, 1965), are classical and useful methods, but at the same time have a negative side of being bounded by initial value ranges of parameters. Therefore, it can only be regarded as local optimization algorithms (Gupta and Sorooshian, 1985). Different from classical methods above, the genetic algorithm (GA), which is designed with random search strategy, can avoid the problem of local search and thus is a global optimization algorithm in its essence (Wang, 1991, 1997; Sedki et al., 2009; Chandwani et al., 2015). After that, many global optimization algorithms have been proposed inheriting the random search strategy. The shuffled complex evolution (SCE-UA) method combines many advantages of the GA and simplex methods, having a powerful capability of calibrating the rainfall–runoff model (Duan et al., 1994; Zhang and Shi, 2011). The particle swarm optimization (PSO) based on random solution can directly obtain the identification parameters through the iterative search for an optimal solution (Kennedy, 1997; Zambrano-Bigiarini and Rojas, 2013). Although the auto-calibration method has been intensively employed to calibrate parameters in the field of hydrology, the most advanced algorithm inevitably falls into local solution because of the strong non-linear problem of a hydrological model and parameter correlation (Chu et al., 2010; Jiang et al., 2010, 2015).

In general, parameter variables follow some specific probability distributions within the given range after multiple independent calibrations (Viola et al., 2009; Jin et al., 2010; Li et al., 2010). Graziani et al. (2008) stated that the shape of a parameter probability distribution can be significantly affected by a parameter range. Touhami et al. (2013) studied the effect of different probability distributions (e.g. normal distribution and uniform distribution) of parameter values on parameter sensitivity, and found that the probability distribution can provide a clue for realizing parameter sensitivity. Although normal and uniform distributions are greatly studied in practice, other types of probability distributions were seldom investigated in previous research (Kucherenko et al., 2012; Esmaeili et al., 2014).

Most hydrological models contain many parameters of different sensitive characteristics and correlation patterns. Some researchers believe that the sensitive parameter should be calibrated, while the insensitive parameter can be set as a fixed value by experience (Beck, 1987; Cheng et al., 2006). Inappropriate parameter ranges or fixed values may result in the instability of calibrated results. Furthermore, the range setting of one parameter may influence the calibration of other related parameters (Song et al., 2015). The model parameter sensitivity analysis has been a growing concern in recent years. Parameter sensitivity varies with catchment characteristics, objective functions, and parameter ranges (van Griensven et al., 2006). Wang et al. (2013) noted the different parameter ranges could lead to changes in parameter sensitivity. Shin et al. (2013) reported that reducing or extending ranges might render insensitive parameters into sensitive ones or vice versa. Thus, parameter ranges and correlation should be taken into consideration when the calibration of multi-parameter models is performed.

Parameter ranges are generally given roughly due to lack of knowledge concerning physical settings of a local catchment (Song et al., 2013; Hao et al., 2015). The more deviation between an optimal range and a given range, the more uncertainty of the calibration result. The selection of appropriate parameter ranges is critical for calibrating the model efficiently. However, there have not been many documented studies on how to select the appropriate parameter range for improving the calibration of hydrological models. Furthermore, the calibration of multiple parameters is more complex due to parameter sensitivity and correlation. Hence, it is necessary to find a way to coordinate the range settings of all parameters.

Considering the effect of parameter ranges on calibration efficiency of hydrological models, an approach of parameter range selection (PRS) is put forward to improve the calibration of hydrological models with multiple parameters. At first, probability distribution of each parameter was analysed based on many independent calibrations by using a GA method. Then the optimal range of a single parameter was specified for calibration according to its probability distribution. Finally, parameter correlation and sensitivity were estimated to determine the optimal combination of multiple parameter ranges. The proposed method is expected to be helpful for an effective and efficient calibration of hydrological models with multiple parameters.

Study area and data collection

The Chaotianhe River catchment is located in the northeast of the Guangxi Zhuang Autonomous Region in southwest China (Fig. 1). The Chaotianhe River is the major tributary of the Lijiang River of a well-known karst landscape. The total catchment area is 476.24 ${km}^{2}$ . The annual precipitation is approximately 1704 $mm$ and 78 % precipitation concentrates in flood seasons (March–August). The thickness of soil varies spatially in most karst areas. Limestone is exposed to air in some peak-cluster regions. Clay soil with thickness ranging from 2 to 10 $m$ is distributed in the depressions and valleys. In clastic rock mountain areas, the thickness of the soil is usually less than 0.5 $m$ . Thus, the soil moisture storage capacity varies significantly with space. Moreover, the underground rivers are very well developed in the karst area, which makes the flood gather rapidly and recess slowly due to higher underground flow rate.

The data concerning daily precipitation, evaporation and streamflow were collected from national gauging stations for the 5-year period of 1996–2000. Four precipitation stations, one streamflow gauging station, and one evaporation station are selected for the investigation. Areal precipitation was calculated using data from the four precipitation stations by using a Thiessen polygon method under GIS environment (Cai et al., 2014). The streamflow gauging station is at the catchment outlet. Some hydro-meteorological statistical data of the studied catchment are summarized in Table 1. From 1996 to 2000, the maximum of daily streamflow was about 719 $m^{3} s^{- 1}$ , the minimum 0.53 $m^{3} s^{- 1}$ , and the average was 13.31 $m^{3} s^{- 1}$ at the outlet. The maximum areal daily precipitation varies with years in the studied catchment and reached the value of 235 $mm d^{- 1}$ in 1996.

Methodology

Hydrological model selection

The method of PRS is designed for most of hydrological models. At present, there have been many hydrological models for hydrological process simulation. Considering the climate characteristics of the study area, the Xinanjiang model, which is suitable for humid regions, was chosen to serve as a hydrological model for the investigation. The Xinanjiang model mainly includes three evapotranspiration layers and three runoff components (i.e. surface-, subsurface runoff and groundwater) (Zhao, 1992). The surface runoff is routed by the Unit Hydrograph (UH) which is derived from the observed streamflow, and other runoff components are simplified as linear reservoirs (Ju et al., 2009). With regard to the Xinanjiang model, there are 10 parameters that should be calibrated. The definitions of the parameters are given in Table 2 (Lin et al., 2014; Hao et al., 2015). The proposed PRS method is introduced as follows, when a Xinanjiang model is taken as an example.

Metro-hydrological statistical data of the study area.

Year	$Q_{Max}$	$Q_{Min}$	$Q_{Avg}$	$P_{Max}$
	( $m^{3} s^{- 1}$ )			( $mm d^{- 1}$ )
1996	719	0.76	14.38	235
1997	308	0.76	14.32	155
1998	369	0.66	13.67	157
1999	282	0.53	12.81	144
2000	339	1.14	11.37	107

$Q_{Max}$ , $Q_{Min}$ , and $Q_{Avg}$ mean the maximum, minimum, and average value of daily streamflow, respectively, and $P_{Max}$ means the maximum value of daily precipitation.

Location of the study area.

[Figure omitted. See PDF]

Probability distribution analysis of calibrated parameter value

Sample collection of calibrated parameter value

In theory, the parameter values calibrated by using a stochastic-based auto-calibration method are not the same as each other but follow a specific probability distribution under a reasonable convergence condition (Jiang et al., 2015). The stochastic-based auto-calibration is used to calibrate the model, and samples of calibrated parameter values are obtained in order to analyse the probability distribution of parameter values. The sample size of 100 is adequate for estimating the probability distribution of calibrated parameter values in the investigation, which is deduced from the results of trial tests as shown in Fig. 2. It can be seen that both maximum and minimum $E_{NS}$ keep stable when sample size is greater than 100.

Parameters of Xinanjiang model.

Parameter	Definition	Units
CI	Recession constants of the lower interflow storage	dimensionless
Kc	Ratio of potential evapotranspiration to pan evaporation	dimensionless
KI	Outflow coefficients of the free water storage to interflow	dimensionless
SM	Areal mean free water capacity of the surface soil layer, which represents
	the maximum possible deficit of free water storage	mm
$B$	Exponential parameter with a single parabolic curve, which represents
	the non-uniformity of the spatial	dimensionless
WM	Averaged soil moisture storage capacity of the whole layer	mm
$C$	Coefficient of the deep layer, which depends on the proportion
	of the basin area covered by vegetation with deep roots	dimensionless
EX	Exponent of the free water capacity curve influencing the development of the saturated area	dimensionless
CG	Recession constants of the groundwater storage relationships	dimensionless
KG $^{*}$	Outflow coefficients of the free water storage to groundwater relationships	dimensionless
Im	Percentage of impervious and saturated areas in the catchment	dimensionless

$^{*}$ The value of KG is calculated by the function 0.7-KI.

Variation curves of maximum and minimum $E_{NS}$ with sample sizes.

[Figure omitted. See PDF]

A GA was selected as the auto-calibration method in the investigation, because GA is a common and widely used global optimization algorithm based on stochastic and evolutionary optimization. Many studies show that evolutionary algorithms provide equal or better performance of a model than other algorithms do (Cooper et al., 1997; Jha et al., 2006; Zhang et al., 2009). The Nash–Sutcliffe efficiency ( $E_{NS}$ ) was chosen as an objective function (Eq. 1) for GA, which represents the agreement between observed and simulated data. $E_{NS} = 1 - \frac{\sum_{i = 1}^{n} (Q_{obs, i} - Q_{sim, i})^{2}}{\sum_{i = 1}^{n} (Q_{obs, i} - Q_{mean})^{2}},$ where $E_{NS}$ is Nash–Sutcliffe efficiency, $i$ is the serial number of the step, $n$ is the total number of the observed streamflow data, $Q_{obs, i}$ is the observed streamflow at step $i$ , $Q_{sim, i}$ is the simulated streamflow at step $i$ , and $Q_{mean}$ is the mean value of observed streamflow.

Determination of probability distribution types

The probability distributions of calibrated parameter values can be estimated roughly by using box-plot charts, cumulative frequency curves, and frequency histograms. The symmetry of the box-plot chart (including one box and two whiskers) and the length ratio of the whisker to the box, the shape of the cumulative frequency curve, and the frequency histogram are important indicators for the identification of the distribution type. Based on these indicators, three types of probability distributions are listed as follows: (1) normal distributions, where the box and whiskers are approximately symmetrical along the $y$ -axis direction, the length of either whisker is longer than half height of the box in a box-plot chart (Fig. 3a), and the cumulative frequency curve is S-shaped and the histogram bell-shaped (Fig. 3b); (2) exponential distributions, where the whole chart is distinctly asymmetrical in the $y$ -axis direction, which means that the average value (marked with a small hollow square) deviates from the median value (marked with a centre line in box), where the box is inclined to one side with the extreme shorter whisker (Fig. 3a), the cumulative frequency curve is parabola shaped, and the histogram tends to increase or decline gradually (Fig. 3c); (3) uniform distribution, where the box and whiskers are approximately symmetrical along the $y$ -axis direction, the length of two whiskers is close to that of the box (Fig. 3a), the cumulative frequency curve tends to a straight line and the histogram varies little along the $x$ axis (Fig. 3d).

Different probability distribution types of calibrated parameter values. (a) Box-plot charts of normal, exponential, and uniform distribution; cumulative frequency curve and histogram for normal (b), exponential (c), and uniform (d) distributions.

[Figure omitted. See PDF]

A Kolmogorov–Smirnov test (K–S test) is geared to examine whether a data set fits a reference probability distribution or not (Haktanir, 1991). In a K–S test, for any variable $x_{i}$ in a data set, the empirical distribution function value ( $F_{i}$ ) is calculated by using a plotting position formula, and the cumulative distribution function value ( $F_{i}^{*}$ ) is computed by using the reference probability distribution. The maximum deviation between the two values, $Δ_{Max}$ , is expressed in Eq. (2). $Δ_{Max} = |F_{i}^{*} - F_{i}|$ According to the acceptable level of significance $α$ ( $α = 0.2 α$ ) and the total number of values in a data set $n$ , $Δ_{table}$ can be obtained from the K–S table. If $Δ_{Max} < Δ_{table}$ , the reference probability distribution is identified to fit to the data set.

Parameter range selections

Single parameter range selection (S-PRS)

In order to improve $E_{NS}$ , the initial range of a parameter requires adjusting properly. Given the three probability distribution types mentioned above, the different ways to specify the optimal range for a single parameter are presented in the investigation. For the parameter of a uniform distribution, it is better to keep the initial range due to the weak influence of ranges on calibration results. For the parameter of a normal distribution, the cumulative frequency curve is employed to seek some reduced ranges with a given cumulative frequency (e.g. 50 %), and the minimum and maximum ranges (namely MINR and MAXR) are obtained as depicted in Fig. 4. The MINR and MAXR represent the ranges of maximum and minimum probability density of parameter values under a given cumulative frequency. As for the parameter of an exponential distribution, the initial range can be extended appropriately towards one side of high probability density, if the parameter has reasonable meaning in the extended range. Then, the optimal range of the parameter can be specified by comparing different $E_{NS}$ calculated separately by using the initial range, the MINR or MAXR of the initial range, or the MINR or MAXR of the extended range. If the initial range cannot be extended, the MINR and MAXR are sought out according to the cumulative frequency curve. Figure 5 gives the variation curves of maximum and minimum $E_{NS}$ of a single parameter with cumulative frequency values. It is found that the maximum $E_{NS}$ remains constant despite a cumulative frequency value varying, while the minimum $E_{NS}$ approaches the peak value of 0.881 when the cumulative frequency value is equal to 50 %. Considering that higher minimum $E_{NS}$ contributes to more efficient calibration, the fixed cumulative frequency value of 50 % was selected to determine the ranges of maximum and minimum probability density (i.e. MINR and MAXR) for each parameter. In short, the optimal range of a single parameter can be determined by properly extending or reducing the initial range to make calibrated parameter values distributed quite closely to a uniform distribution.

Selection of minimum and maximum range (MINR and MAXR) with a cumulative frequency of 50 %.

[Figure omitted. See PDF]

Variation curves of maximum and minimum $E_{NS}$ of a single parameter with cumulative frequency values.

[Figure omitted. See PDF]

Multiple parameter range selection (M-PRS)

In general, there is more or less correlation between parameters for most hydrological models. As far as a Xinanjiang model is concerned, parameters WM and $B$ refer to the water storage volume–area curve that represents the spatial variability of soil moisture storage. If the curve is fixed, a larger WM results in a smaller $B$ (Zhao, 1992). As a result, the range change of parameter WM may affect the range setting and calibration of parameter $B$ . If the ranges of the related parameters required adjusting, the correlations among parameters, therefore, should be taken into account. If the range change of one parameter has positive influence on calibration of other parameters, using the optimal range of the parameter instead of the initial one can contribute to better calibration results. On the contrary, the negative impact may result in a worse model calibration, even though the optimal ranges of the parameters are used. Thus, some coordination measures should be taken to deal with such a contradiction. The index $R_{C}$ (Eq. 3) was quantified to analyse the influence degree of one-parameter range change on the calibration of other parameters. When $R_{C Y, X}$ is closer to 1, the range change of parameter $X$ has a greater positive influence on the calibration of parameter $Y$ . If $R_{C Y, X}$ is minus, it exerts a negative influence. $R_{C Y, X} = 1 - \frac{L_{Y, X} - L_{Y, Y}}{L_{Y, Initial} - L_{Y, Y}},$ where $R_{C Y, X}$ is the influence degree of the range change of parameter $X$ on the calibration of parameter $Y$ , $L_{Y, X}$ is the range of parameter $Y$ calibrated with the optimal range of parameter $X$ and initial ranges of other parameters, $L_{Y, Y}$ is the range of parameter $Y$ calibrated with the optimal range of parameter $Y$ and initial ranges of other parameters, and $L_{Y, Initial}$ is the range of parameter $Y$ calibrated with initial ranges of all parameters. The calibrated range of any parameter is calculated, excepting extreme outliers.

If there is a negative influence between two parameters, the optimal range of the parameter of higher sensitivity is used and the initial range of the other parameter is kept for calibration generally to mitigate the negative impact. It is due to the fact that sensitive parameters play more important roles than insensitive parameters do in a multi-parameter calibration. In order to assess the sensitivity of parameter range change to $E_{NS}$ , index $S_{E}$ as expressed in Eq. (4) is computed by performing an S-PRS method on each parameter. The larger the value of $R_{C}$ , the more concentrated the distribution of $E_{NS}$ , which means more efficient parameter calibration. Thus, the parameter of higher $S_{E}$ is given priority to the optimal range when the $R_{C}$ of two parameters is minus. $S_{E} = 1 - \frac{{E^{'}}_{NS Max} - {E^{'}}_{NS Min}}{E_{NS Max} - E_{NS Min}},$ where $S_{E}$ is sensitivity of parameter range change to $E_{NS}$ , $E_{NS Max}$ and $E_{NS Min}$ are maximum and minimum $E_{NS}$ calibrated with an initial range, and ${E^{'}}_{NS Max}$ and ${E^{'}}_{NS Min}$ are maximum and minimum $E_{NS}$ calibrated with an optimal range. The statistical analysis of $E_{NS}$ excludes extreme outliers.

Given the fact that there are more than two parameters in most hydrological models, the accumulative influence and the coordination of range selection were investigated in the study. The mean value of $R_{C}$ ( $R_{C mean}$ ) is the index to judge the accumulative influence of one-parameter range change on the calibration of other parameters. Thus, for parameters of a negative $R_{C mean}$ , the initial ranges instead of the optimal ones are adopted for the multi-parameter calibration.

The flow chart of multiple parameter range selections.

[Figure omitted. See PDF]

Range changes and K–S tests ( $α = 0.2$ ) of parameters in the initial schema.

Parameter	Initial range	Calibrated range $^{a}$	Ratio $^{b}$ ( %)	$Δ_{Max}$ $^{c}$
				Normal	Exponential	Uniform
				distribution	distribution	distribution
CI	0–0.9	0.630–0.745	12.78	0.062 (pass)	0.328 (fail)	0.115 (fail)
Kc	0–1.1	0.81–1.09	25.45	0.076 (pass)	0.305 (fail)	0.089 (pass)
KI	0–0.7	0.534–0.7	23.71	0.128 (fail)	0.076 (pass)	0.173 (fail)
SM	10–50	31–39.4	21.00	0.060 (pass)	0.304 (fail)	0.110 (fail)
$B$	0.1–0.4	0.238–0.4	54.00	0.180 (fail)	0.062 (pass)	0.203 (fail)
WM	120–200	120–150	37.50	0.181 (fail)	0.072 (pass)	0.231 (fail)
$C$	0.1–0.2	0.1–0.2	100.00	0.163 (fail)	0.082 (pass)	0.217 (fail)
EX	1.0–1.5	1.0–1.5	100.00	0.118 (fail)	0.079 (pass)	0.135 (fail)
CG	0.950–0.998	0.950–0.994	91.67	0.123 (fail)	0.102 (pass)	0.139 (fail)
Im	0.01–0.04	0.01–0.04	100.00	0.134 (fail)	0.076 (pass)	0.148 (fail)

$^{a}$ The calibrated parameter range except the extreme outlier. $^{b}$ The ratio is calculated by dividing the length of the range derived from 100 GA calibration runs by the initial range length. $^{c}$ The $Δ_{Max}$ is calculated by using the normalized parameter values.

The flow chart of the PRS method is shown in Fig. 6. In stage 1, a set of initial ranges of parameters are given for a hydrological model and the probability distribution for each parameter analysed based on the 100 independent parameters values calibrated by an auto-calibration method. In stage 2, there are three range adjustment methods with response to a probability distribution of parameter values: for a normal distribution, the optimal range of a single parameter is obtained by reducing the initial range; for an exponential distribution, the initial range of a single parameter is extended to specify the optimal range, or the initial range is reduced to seek the optimal range for calibration when the extension of the parameter range is limited; for a uniform distribution, the initial range is kept. In stage 3, the single-parameter range selection (S-PRS) is performed on each parameter. Based on the indexes $S_{E}$ and $R_{C}$ estimated, the optimal combination of ranges is determined by coordinating the range selection of all parameters.

Results and discussion

Probability distribution characteristics of calibrated parameter values of the Xinanjiang model

A series of calibrated parameters values were obtained through 100 independent calibration runs by using a GA method. Trial tests were employed to determine the optimal GA control parameters: crossover probability of 0.5, mutation probability of 0.7 for the individual, mutation probability of 0.5 for each gene, population size of 21, maximum generation number of 500, and maximum iteration number of 50. These parameters were kept constant for GA calibrations in the investigation. The initial and calibrated ranges of parameters are presented in Table 3. The ratio of the calibrated range length to the initial one in Table 3 is less than 60 % for most parameters (i.e. parameter CI, Kc, KI, SM, $B$ , and WM), which implies that reducing the ranges can help calibrate most parameters efficiently. For any particular parameter, calibrated values were normalized by dividing a deviation between a calibrated value and the lower limit of the initial range by the length of the initial range. Based on 100 calibrated values after normalization, a box plot for a parameter is depicted. It is obvious from Fig. 7 that the box and whiskers are approximately symmetrical and the length of whiskers is longer than that of half the box along the direction of the $y$ axis for parameters CI, SM, and Kc. But for other parameters, it is shown from the box-plot charts that the mean value deviates from the median one, which means an asymmetric chart. According to the characteristics of the box plots, the probability distributions of the calibrated values are normal for parameters CI, SM, and Kc, while those are exponential for other parameters. Furthermore, K–S tests

The box-plot chart of normalized calibrated values for parameters of Xinanjiang model.

[Figure omitted. See PDF]

were employed to determine the probability distributions of parameters and the corresponding results are listed in Table 3. It is shown that only a normal distribution is accepted for parameters CI and SM. Despite the fact that both normal and uniform distributions are accepted for parameter KC, the probability distribution of parameter KC is regarded as a normal distribution. It is because the $Δ_{Max}$ will become smaller if a normal distribution serves as a reference distribution instead of a uniform distribution. In addition, just an exponential distribution is accepted for the rest of the parameters. Thus, these three parameters follow normal distributions and the others exponential distributions in the Xinanjiang model. The ratio of the calibrated range length to the initial range length is less than 30 % for parameters CI, SM, and Kc, while the ratio exceeds 30 % for parameters $B$ , WM, $C$ , EX, CG, and Im. It indicates that reducing the initial ranges can improve the calibration for parameters whose values observe normal distributions.

Results of range selection of parameter CI. Probability distribution of parameter values for schema initial range (a), CI-MINR (b) and CI-MAXR (c); (d) box-plot chart of $E_{NS}$ for three schemas.

[Figure omitted. See PDF]

Effect of range adjustment pattern on calibration results

Since the probability distribution of a single parameter has a direct relation with the PRS, the range adjustment pattern of a single parameter was discussed on the basis of the parameter probability distribution in the investigation.

For a normal distribution, the range was reduced to find the optimal range. Figure 8 shows the calibration results of parameter CI when the different ranges are selected. The MINR (0.679–0.713) and the MAXR (0.623–0.694) were picked out based on the cumulative frequency curve derived from calibrations with the initial range (0–0.900). From the cumulative curves and the histograms in Fig. 8a, b, and c, it is found that the probability distribution of parameter CI values is converted from a normal distribution to a uniform distribution when the initial range is reduced to the MINR, whereas the probability distribution approximates an exponential one when the MAXR is used. Figure 8d reveals the contribution of the PRS to $E_{NS}$ . It is found that the minimum $E_{NS}$ , excepting extreme outliers, rises from 0.881 to 0.884 and the $E_{NS}$ concentrates at a higher value range, when the MINR is used instead of the initial range. Using the reduced range of high probability density is, therefore, helpful to make calibration more stable and more efficient.

Results of range selection of parameter KI. Probability distribution of parameter values for schema initial range (a), KI-MINR (b), and KI-MAXR (c); (d) box-plot chart of $E_{NS}$ for three schemas.

[Figure omitted. See PDF]

Results of range selection of parameter $B$ . Probability distribution for schema initial range (a), $B$ -Extension (b), $B$ -MINR (c), and $B$ -Extension-MINR (d); (e) box-plot chart of $E_{NS}$ for four schemas.

[Figure omitted. See PDF]

The indexed $R_{C}$ and $S_{E}$ of parameters when the optimal range of each parameter is used for calibration.

Parameter $^{*}$		CI	Kc	KI	SM	$B$	WM	$C$	EX	CG	Im
Optimal range of		0.679–0.713	0.95–1.05	0.66–0.7	35–39	0.379–0.488	105–110	0.175–0.2	1–1.118	0.95–0.966	0.01–0.0245
a single parameter
$R_{C}$	CI	1.000	0.334	0.371	0.462	0.322	0.113	0.105	0.115	$-$ 0.128	0.272
	Kc	0.689	1.000	0.467	0.429	0.504	0.503	0.389	0.102	0.284	0.150
	KI	0.778	0.315	1.000	0.445	0.574	0.268	0.456	0.328	0.060	0.258
	SM	0.508	$-$ 0.199	0.422	1.000	$-$ 0.089	0.009	$-$ 0.063	0.383	0.218	$-$ 0.032
	$B$	0.914	0.560	0.698	$-$ 0.017	1.000	0.972	$-$ 0.175	0.007	$-$ 0.319	$-$ 0.722
	WM	0.575	0.311	0.439	0.553	0.325	1.000	0.229	0.360	$-$ 0.069	$-$ 0.235
	$C$	0.208	0.273	0.083	0.151	0.277	0.335	1.000	0.077	0.200	0.210
	EX	0.054	0.047	$-$ 0.011	0.018	0.371	0.045	0.009	1.000	$-$ 0.021	$-$ 0.025
	CG	0.221	0.246	$-$ 0.135	0.022	0.010	0.198	$-$ 0.034	$-$ 0.009	1.000	$-$ 0.112
	Im	0.238	0.073	$-$ 0.025	0.045	0.031	0.030	$-$ 0.026	$-$ 0.020	0.001	1.000
Mean of $R_{C}$		0.465	0.218	0.257	0.234	0.258	0.275	0.099	0.149	0.025	$-$ 0.026
$S_{E}$ (%)		54.7	47.9	36.6	41.7	48.1	39.9	10.8	14.7	21.9	0.3

$^{*}$ The parameter represents parameter $X$ in Eq. (2).

To an exponential distribution, both reduced ranges and extended ranges of reasonable meaning were used to select the optimal range for parameter calibration. Figure 9 shows the calibration results under three different input ranges of parameter KI. Since the initial range of parameter KI cannot be extended, the two reduced ranges (i.e. the MINR, 0.660–0.700, and the MAXR, 0.522–0.660) were picked out according to the cumulative frequency curve. From the cumulative curves and the histograms in Fig. 9a, b, and c, it is found that the probability distribution of parameter KI values is similar to a uniform distribution in the case of the MINR, whereas that is still exponential in the case of the MAXR. The contributions of the three parameter ranges to $E_{NS}$ are shown in Fig. 9d. Thus, the MINR is best for calibration of parameter KI when compared with the MAXR or the initial range, which is similar to the calibration result of parameter CI. In general, the MINR is better than the MAXR for parameter calibration, because the parameter values that may achieve a higher $E_{NS}$ can be easily picked out from the MINR of higher probability density.

Figure 10 shows the calibration results of parameter $B$ whose initial range can be extended. Parameter $B$ generally ranges from 0.1 to 0.4 for most areas, but it is quite different for karst areas where the soil moisture storage varies remarkably with space. As a result, the value of parameter $B$ can be greater than 0.4. From Fig. 10a and b, it is shown that the probability distribution of parameter $B$ is converted from an exponential distribution to a normal distribution when the initial range is extended to new one ( $B =$ 0.1–0.6). After the MINR selection was performed on the initial range and the extended range, the two ranges, i.e. the MINR ( $B =$ 0.36–0.40) and the extension-MINR ( $B =$ 0.379–0.488), were obtained and then used to calibrate parameter $B$ . From Fig. 10c and d, it is found that the probability distribution of parameter $B$ approximates a uniform distribution when the MINR or the extension-MINR is used. The box plots of $E_{NS}$ for different ranges are shown in Fig. 10e. It is shown that there is little improvement in maximum $E_{NS}$ when MINR is used for calibration instead of the initial range. There is an increase of 0.0003 in maximum $E_{NS}$ if the initial range is replaced with the extension range or the extension-MINR. As for minimum $E_{NS}$ (except outliers), an increase of 0.001 in the case of the MINR, a decrease of 0.003 in the case of the extension range, and an increase of 0.003 in the case of the extension-MINR are found when they replace the initial range. It suggests that an appropriate range extension followed by a MINR selection is helpful to improve calibration for the parameter whose probability distribution is exponential and initial range can be extended.

Effect of multiple parameter range combination on calibration results

The S-PRS method was employed to determine the optimal range for each parameter. According to the optimal ranges and the corresponding initial ranges, indexed $R_{C}$ and $S_{E}$ were quantified to understand parameter correlation and sensitivity. It is obvious from Table 4 that $R_{C}$ values in the columns of parameters CI and WM are positive, but most $R_{C}$ values in the column of parameter Im are negative. The negative $R_{C}$ related to two parameters means that using the optimal range of one parameter is adverse to calibrating the other. Both $R_{C EX, Im}$ and $R_{C Im, EX}$ are negative in spite of small values, indicating that using the optimal ranges of parameters EX and Im simultaneously is not conducive to calibrating these two parameters. The mean of $R_{C}$ ( $R_{C mean}$ ) varies with parameters. Parameter CI has the maximum $R_{C mean}$ of 0.465 and parameter Im the minimum $R_{C mean}$ of $-$ 0.026. Furthermore, all parameters have positive $R_{C mean}$ values except for parameter Im, owing to the accumulative negative correlation between parameter Im and the others.

To coordinate with negatively related parameters, the index $S_{E}$ was used to pick out parameters of higher sensitivity to $E_{NS}$ . From Table 4, it is found that parameter CI has the maximum $S_{E}$ of 54.7 % and parameter Im the minimum $S_{E}$ of 0.3 %. Most $S_{E}$ values are more than 20 % except those of parameters $C$ , EX, and Im. It suggests that parameters $C$ , EX, and Im are of low sensitivity to $E_{NS}$ and the others highly sensitive to $E_{NS}$ . Parameter CI is the most sensitive while Im is the most insensitive, which agrees with the work of Lü et al. (2013) and Song et al. (2013). For the well-developed karst areas, the thin layer of soil and strong permeability of limestone make rainfall easy to penetrate into the ground. Moreover, the existence of karst caves and subsurface streams contribute to great interflow storage which accounts for a large proportion of streamflow. As a result, the calibration of parameters KI (representing the penetrability of free water to interflow) and CI (representing recession capacity of interflow storage) has a significant influence on rainfall-runoff simulation results. Hence, parameters KI and CI are highly sensitive in the investigation. Thus, the optimal ranges of parameters of higher sensitivity should be used to improve calibration.

Parameter range setting for different cases.

Case	Range setting of parameter
	CI	Kc	KI	SM	$B$	WM	$C$	EX	CG	Im
1	I	I	I	I	I	I	I	I	I	I
2	I	I	I	I	I	I	I	I	I	O
3	I	I	I	I	I	I	I	O	I	I
4	I	I	I	I	I	I	I	O	I	O
5	O	O	O	O	O	O	O	O	I	I
6	O	O	O	O	O	O	O	O	O	I
7	O	O	O	O	O	O	O	O	O	O

The symbol “I” represents the initial range of the parameter in Table 3, and “O” the optimal range of the parameter in Table 4.

In order to determine the optimal range combination of multiple parameters, seven cases were investigated with different range combinations of parameters (Table 5). Case 1 was defined as the initial case using all initial ranges. Cases 2–4 were defined as the single parameter range selection (S-PRS) cases. Cases 5–7 were set as the multiple parameter range selection (M-PRS) cases. The box plots of $E_{NS}$ for different cases are given in Fig. 11. There is a small decrease in $E_{NS}$ when Case 4 is separately compared with Cases 1–3. It can be explained that both $R_{C EX, Im}$ and $R_{C Im, EX}$ are negative and the combination of the optimal ranges corresponding to the two parameters leads to a worse calibration result. As the $S_{E}$ value of parameter Im is less than that of parameter EX, parameter EX is given priority to use the optimal range. It is the reason why the calibration result of Case 3 is better than that of Case 2. As for the cases with the multi-parameter range selection (i.e. Cases 5–7), the $E_{NS}$ values are more robust than those of Cases 1–4. There is an increase of approximately 0.001 in maximum $E_{NS}$ and an increase of approximately 0.01 in minimum $E_{NS}$ when the multi-parameter range selection is performed. There are some differences in $E_{NS}$ with the comparison between Cases 5–7 in a magnified box-plot chart. Case 6 has the most concentrated values of $E_{NS}$ and the largest mean value of $E_{NS}$ among the three cases. It means that the combination of optimal ranges of all parameters (see Case 7) is not the optimum to calibrate a multi-parameter model inasmuch as some parameters like Im have negative correlation on other parameters. Hence, the initial ranges of parameters having negative mean values of $R_{C}$ and low values of $S_{E}$ are supposed to be used to calibrate parameters instead of the corresponding optimal ranges.

The box-plot chart of $E_{NS}$ for different cases.

[Figure omitted. See PDF]

The variation curves of maximum and minimum $E_{NS}$ with number of runs by using a GA method and a proposed PRS method.

[Figure omitted. See PDF]

Through a calibration run, a set of calibrated values of all parameters and the corresponding $E_{NS}$ are obtained. Figure 12 shows the variation curves of maximum and minimum values of $E_{NS}$ with number of runs by using a GA method and a proposed PRS method. It is indicated from Fig. 12 that no matter if it is maximum or minimum $E_{NS}$ , the PRS-based value is essentially the same as the GA-based one when the number of runs does not exceed 100. It is because the PRS method initially needs 100 runs of GA calibration to obtain parameter value samples for selecting the optimal ranges. If a proposed method is used for calibration instead of a GA method, there is an increase of approximately 0.001 in maximum $E_{NS}$ and an increase of approximately 0.01 in minimum $E_{NS}$ when the number of runs is greater than 100. Thus, for any particular run number, the value of $E_{NS}$ calculated by using a PRS method is not less than that by using a GA method. Additionally, it is found from the investigation that there is no significant difference in computational time between the two methodologies. The application of a proposed method, therefore, contributes to a relatively efficient calibration.

Conclusions

Considering that there is a relation between the selection of multi-parameter ranges and the calibration effect of a hydrological model, an approach to determine an optimal combination of ranges for the multi-parameter calibration was put forward by analysing parameter probability distribution, parameter sensitivity, and correlation between parameters. The newly proposed method was applied for the calibration of a Xinanjiang model for karst areas, and some findings are presented as follows.

In the Xinanjiang model, parameters CI, Kc, SM, and $B$ approximately obey normal probability distributions and parameters WM, $C$ , EX, KI, CG, and Im obey exponential probability distributions, both after 100 independent calibration runs. For the parameters of a normal distribution, the MINR defined by using a cumulative frequency curve of calibrated values is preferred to be selected as the optimal parameter range for calibration. For the parameters of an exponential distribution, the extension-MINR is recommended to be used for calibration if the initial range can be extended towards the high-probability side; otherwise the MINR is selected as the optimal range for calibration.

The proposed PRS method improves the minimum and mean values of $E_{NS}$ . The application of the proposed methodology results in an increase of 0.01 in minimum $E_{NS}$ compared with that of the pure GA method. The rising of minimum $E_{NS}$ with little change of the maximum may shrink the range of the possible solutions. As a result, the uncertainty of the model performance can be effectively controlled.

The M-PRS method is superior to the S-PRS one for calibrating hydrologic models with multiple parameters. The $R_{C}$ and $S_{E}$ are two important indexes that can help to analyse the sensitivity and correlation between parameters and consequently to coordinate with the negatively related parameters. The initial ranges of parameters of relatively low $S_{E}$ and negative $R_{C mean}$ and the optimal ranges of parameters of positive $R_{C mean}$ should be preferred to be chosen for the multi-parameter model calibration.

Data availability

Please contact the corresponding author to access the data in this study.

The authors declare that they have no conflict of interest.

Acknowledgements

The investigation is supported by the Non-profit Industry Financial Program of Ministry of Water Resources of China (no. 201401057), NFSC (no. 91225301), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (no. 2013-1792). Edited by: G. Di Baldassarre Reviewed by: two anonymous referees

Word count: 6627

Show less

© 2017. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

The parameters of hydrological models are usually calibrated to achieve good performance, owing to the highly non-linear problem of hydrology process modelling. However, parameter calibration efficiency has a direct relation with parameter range. Furthermore, parameter range selection is affected by probability distribution of parameter values, parameter sensitivity, and correlation. A newly proposed method is employed to determine the optimal combination of multi-parameter ranges for improving the calibration of hydrological models. At first, the probability distribution was specified for each parameter of the model based on genetic algorithm (GA) calibration. Then, several ranges were selected for each parameter according to the corresponding probability distribution, and subsequently the optimal range was determined by comparing the model results calibrated with the different selected ranges. Next, parameter correlation and sensibility were evaluated by quantifying two indexes, $R_{C Y, X}$ and $S_{E}$ , which can be used to coordinate with the negatively correlated parameters to specify the optimal combination of ranges of all parameters for calibrating models. It is shown from the investigation that the probability distribution of calibrated values of any particular parameter in a Xinanjiang model approaches a normal or exponential distribution. The multi-parameter optimal range selection method is superior to the single-parameter one for calibrating hydrological models with multiple parameters. The combination of optimal ranges of all parameters is not the optimum inasmuch as some parameters have negative effects on other parameters. The application of the proposed methodology gives rise to an increase of 0.01 in minimum Nash–Sutcliffe efficiency ( $E_{NS}$ ) compared with that of the pure GA method. The rising of minimum $E_{NS}$ with little change of the maximum may shrink the range of the possible solutions, which can effectively reduce uncertainty of the model performance.

Details

Title

Improvement of hydrological model calibration by selecting multiple parameter ranges

Author

Wu, Qiaofeng¹; Liu, Shuguang²; Cai, Yi²; Li, Xinjian³; Jiang, Yangming⁴

¹ Department of Hydraulic Engineering, College of Civil Engineering, Tongji University, Shanghai, 200092, China
² Department of Hydraulic Engineering, College of Civil Engineering, Tongji University, Shanghai, 200092, China; Key Laboratory of Yangtze River Water Environment, Ministry of Education, Shanghai, 200092, China
³ Guangxi Zhuang Autonomous Region Center Station of Irrigation Experiment, Guilin, 541105, China
⁴ Hydrology & Water Resources Bureau of Guilin, Guangxi Zhuang Autonomous Region, Guilin, 541001, China

Pages

393-407

Publication year

2017

Publication date

2017

Publisher

Copernicus GmbH

ISSN

10275606

e-ISSN

16077938

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/hess-21-393-2017

ProQuest document ID

2414067873

Improvement of hydrological model calibration by selecting multiple parameter ranges

Jump to:

Full text

Abstract

Details

Suggested sources

Case	Range setting of parameter
	CI	Kc	KI	SM	$B$	WM	$C$	EX	CG	Im
1	I	I	I	I	I	I	I	I	I	I
2	I	I	I	I	I	I	I	I	I	O
3	I	I	I	I	I	I	I	O	I	I
4	I	I	I	I	I	I	I	O	I	O
5	O	O	O	O	O	O	O	O	I	I
6	O	O	O	O	O	O	O	O	O	I
7	O	O	O	O	O	O	O	O	O	O

Case	Range setting of parameter
	CI	Kc	KI	SM	$B$	WM	$C$	EX	CG	Im
1	I	I	I	I	I	I	I	I	I	I
2	I	I	I	I	I	I	I	I	I	O
3	I	I	I	I	I	I	I	O	I	I
4	I	I	I	I	I	I	I	O	I	O
5	O	O	O	O	O	O	O	O	I	I
6	O	O	O	O	O	O	O	O	O	I
7	O	O	O	O	O	O	O	O	O	O

Case	Range setting of parameter
	CI	Kc	KI	SM	$B$	WM	$C$	EX	CG	Im
1	I	I	I	I	I	I	I	I	I	I
2	I	I	I	I	I	I	I	I	I	O
3	I	I	I	I	I	I	I	O	I	I
4	I	I	I	I	I	I	I	O	I	O
5	O	O	O	O	O	O	O	O	I	I
6	O	O	O	O	O	O	O	O	O	I
7	O	O	O	O	O	O	O	O	O	O