Introduction
Flood estimation is important for design and safety assessments, flood risk management and spatial planning. It aims to assess the probability of occurrence of large events, e.g., discharges with return periods of 100 to 10 000 years. Estimation of events with such low probability is particularly arduous. It can only be based on a few data points representing the most extreme events in a time series of a limited length. Thus extrapolation to long return periods is usually needed. In dam safety analyses, for example, return period estimations of to years are often used . Methods for deriving such estimations can be classified into two main groups: statistical flood frequency analysis and precipitation–runoff modeling. Statistical flood frequency analysis is based on the analysis of an observed streamflow record for which the return periods of the highest events are modeled using extreme value theory, and magnitudes with longer return periods are estimated using the fitted statistical model. A drawback of this method is that it relies on local or regional streamflow data and is likely to be very sensitive to the density of observations (for the regional case) and to the type of distribution chosen . Furthermore, heavy rainfall is a major factor driving the occurrence of flooding, even in areas where snowmelt also plays a significant role, such as in Norway. Rainfall series are generally more abundant, often have longer periods of record, and they usually show stronger regional consistency. This observation is one of the main motivations of the GRADEX method which uses the distribution of rainfall to extrapolate the distribution of discharge. This has further led to the development of rainfall–runoff simulation methods for extreme flood estimation. The idea is to extend the database of streamflow by converting rainfall into surface runoff using a model of the catchment response. 
Input rainfall may consist either of observed or synthetic events with an estimated probability of occurrence (event-based method), or of historical or synthetic rainfall records used to generate a continuous streamflow series (continuous simulation approach).
In Norway, a simple event-based rainfall–runoff model, PQRUT, has been used
since the 1980s as a simulation method for dam safety analyses for which the
magnitude of low frequency events (e.g., 500-, 1000-year peak inflow) and the
probable maximum flood are required. Recently, a semi-continuous model,
SCHADEX, has been tested as an alternative approach
for obtaining such estimates. SCHADEX has been developed and applied in
France by Electricité de France (EDF) for dam spillway design since 2006.
It has also recently been applied in different regions of the world (in
France, Austria, Canada and Norway).
Data
Daily data from 368 precipitation stations in Norway were extracted from the European Climate Assessment and Dataset (ECA&D), a database of daily meteorological stations across Europe. From these 368 stations, 192 stations with at least 50 years of record with less than 10 % missing data per year over the period 1948–2009 were selected for further analyses. Years with more than 10 % missing data are entirely replaced by 'NA', representing missing values. Figure shows the location and altitude of the 192 stations. Station altitude ranges from sea level to approximately 1000 m a.s.l., i.e., none of the stations lie at the higher altitudes in the mountainous regions. All the stations above 500 m a.s.l., however, are found in the central southern inland region adjacent to zones of higher altitude. The network is denser in southern Norway, particularly along the coast, reflecting the higher population densities in this zone. This implies that southern Norway will have more weight in the model evaluation but we view this as preferable to deleting a number of stations to create a more spatially uniform network density. The mean number of observed years is 56 (maximum 62, minimum 50).
Left: location and altitude (m a.s.l.) of the stations. Right: histogram of altitude (m a.s.l.).
[Figure omitted. See PDF]
As already stated in Sect. , the main topic of this study is the evaluation of MEWP, the rainfall probabilistic model used in SCHADEX. SCHADEX aims to describe the distribution of floods by a stochastic simulation process which combines heavy rainfall events and catchment saturation states, including simulated snowmelt. In SCHADEX, heavy rainfall events are considered as 3-day centered precipitation events, composed of a central rainfall and two adjacent rainfalls which are lower than the central one. The value for the central rainfall is simulated using a fitted MEWP distribution for the extreme rainfall, and the 2 adjacent days are simulated conditionally, using contingency tables to account for the dependence of the magnitude of the rainfall on the day before and after the peak rainfall. Given that MEWP is a probabilistic model for heavy “central” rainfall, rather than for all daily rainfall values, a pre-processing of the data was required to select the central rainfall values exceeding the precipitation received on both the preceding and following days by 1 mm or more at each station. This obviously reduces the number of data available for analysis. In Norway about one-quarter of the days of record represent central rainfall values, and this is, on average, about one-half of the days with precipitation. However, one advantage of this pre-processing is that central rainfall values at a given location can be expected to be independent since they are always separated by at least 1 day. For extreme values, this independence can be quantitatively assessed by computing the so-called extremal coefficients for the daily and central samples and comparing their respective values for each station. Extremal coefficients lie between 0 and 1, and the closer to 1, the less dependent the extremes. 
The inverse of the extremal coefficient can be more easily interpreted as the mean size of clusters at extreme level, i.e., roughly speaking, the mean number of consecutive values that are extreme. Using the estimation method of with a threshold equal to the 90 % quantile of daily rainfall, we find that extremal coefficients for daily rainfall are about 0.6, whereas those for central rainfall are about 0.8 (representing a mean cluster size of about 1/0.8 = 1.25 days). The central rainfall values can therefore be considered to be close to the case of complete independence.
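The selection of central rainfall values described above can be sketched in a few lines of Python. This is only a minimal illustration, not the processing code used in the study: the function name `central_rainfall`, the `margin` argument and the handling of missing days (`None`) are assumptions made for the example, and the 1 mm rule is read here as a minimum margin over both neighbouring days.

```python
def central_rainfall(daily, margin=1.0):
    """Return (day_index, value) pairs for central rainfall days.

    A day is kept when its rainfall exceeds the rainfall of both the
    previous and the following day by at least `margin` (in mm).
    Days with a missing value (None) in the 3-day window are skipped.
    """
    selected = []
    for i in range(1, len(daily) - 1):
        before, current, after = daily[i - 1], daily[i], daily[i + 1]
        if before is None or current is None or after is None:
            continue
        if current - before >= margin and current - after >= margin:
            selected.append((i, current))
    return selected


# Hypothetical daily series in mm (illustrative values only).
series = [0, 5, 2, 2.5, 0, 0, 12, 11.5, 0]
events = central_rainfall(series)
```

Because a selected day must exceed both of its neighbours, two selected days can never be adjacent, which is the separation property invoked above.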
Model and method
Modeling
Exponential and GPD models
Let be the random variable of central rainfall at some location in Norway. We are interested in the distribution of extreme values, i.e., of when is large. Let us consider a (high) level and write the -quantile of , i.e., such that . Then, for all exceeding , we have the decomposition Extreme value theory (EVT) ensures that if the central rainfall values are independent and identically distributed and for large enough , can be approximated by the distribution for all , provided in Eq. (2) that if . Parameter in Eq. (2) is independent of ; this is the shape parameter which models the heaviness of the tail of the distribution. Parameter in Eqs. (2) and (3) depends upon and is called the scale parameter. Equations (2) and (3) imply that excesses follow the generalized Pareto distribution (GPD) in Eq. (2) and the exponential distribution (EXP) with rate in Eq. (3). Models (Eqs. 2 and 3) have been widely used worldwide for modeling rainfall extremes. A good review is provided in the introduction of . Equations (2) and (3) combined with Eq. () give the approximation of the distribution of for all : where .
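The decomposition and the two tail models can be written compactly in code. The sketch below assumes the standard peaks-over-threshold forms; all names are chosen for the example (`u` for the threshold, `alpha` for its non-exceedance probability, `sigma` for the scale and `xi` for the shape). Excesses over `u` are taken to follow 1 - (1 + xi (r - u)/sigma)^(-1/xi) when xi is non-zero (GPD) and 1 - exp(-(r - u)/sigma) when xi is zero (EXP).

```python
import math


def excess_cdf(r, u, sigma, xi=0.0):
    """P(R <= r | R > u): GPD if xi != 0, exponential (EXP) if xi == 0."""
    if r <= u:
        return 0.0
    z = (r - u) / sigma
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:  # beyond the upper bound of a bounded GPD (xi < 0)
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def tail_cdf(r, u, alpha, sigma, xi=0.0):
    """P(R <= r) for r >= u, using alpha + (1 - alpha) * P(R <= r | R > u)."""
    return alpha + (1.0 - alpha) * excess_cdf(r, u, sigma, xi)
```

With the same scale, a positive `xi` gives a lower non-exceedance probability at a given level than the exponential case, which is the heavier tail referred to in the text.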
MEWP and MGPWP models
In the previous section, we implicitly assumed that central rainfall, , is identically distributed throughout the year. This assumption may be questioned. Indeed, different climatological processes trigger precipitation, leading to the occurrence of rainfall of different natures and intensities (e.g., convective vs. stratiform precipitation). Furthermore, rainfall occurrence and intensities often vary with season, reflecting both variations in temperature and in storm tracks, for example. For this reason, proposed the use of subsampling based on seasons and weather patterns (WP). Each day of the record period is assigned to a WP. If seasons and WP are considered, then days are classified into subclasses. The law of total probability gives, for all , where is the probability that a given day is in season and in WP (thus ). The central rainfall values occurring in season and WP can be assumed to be identically distributed . Thus the extreme value theory described in Sect. can be applied to . Let us consider a high level (taken for simplicity constant for all ) and , the quantile of . Application of Eq. () to gives the approximation for , where is given by Eqs. (2) and (3), where , and are respectively replaced by , and . Thus, Eqs. () and () give, for all , the approximation of the distribution of : MEWP and MGPWP (Multi Generalized Pareto Weather Pattern) models are both defined by Eq. () with different choices of : in MEWP all are set to – in which case is the EXP distribution – while in MGPWP, is free to vary in the positive range. We exclude cases because they give bounded GPD distributions with an upper bound at , which is usually unrealistically low for rainfall. Using the GPD with in Eq. () allows models with heavier tails than with the EXP distribution, which is light-tailed. Theoretically, other heavy-tailed distribution could be used for in Eq. 
() but the GPD is justified by EVT and it provides a natural generalization of MEWP by allowing the to vary freely (within the positive range). Both models, MEWP and MGPWP, will be evaluated on the Norwegian data. To keep track of the level and of the fact that seasons and WP are used in Eq. (), we will respectively write these two models as MEWP and MGPWP. Likewise, we write EXP and GPD to represent the basic cases of Eq. () when neither season nor WP are considered, corresponding to cases MEWP and MGPWP.
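A compound distribution of this kind is a weighted sum of the season and WP tail models. The following sketch assumes the same peaks-over-threshold forms as in the basic EXP and GPD cases and a non-negative shape; the dictionary keys (`p`, `u`, `alpha`, `sigma`, `xi`) and the parameter values are hypothetical and serve only to illustrate the structure. The result is meaningful only for levels above every subclass threshold.

```python
import math


def component_cdf(r, u, alpha, sigma, xi=0.0):
    """Tail CDF of one season/WP subclass (xi >= 0 assumed)."""
    z = max(r - u, 0.0) / sigma
    if xi == 0.0:
        excess = 1.0 - math.exp(-z)              # EXP, as in MEWP
    else:
        excess = 1.0 - (1.0 + xi * z) ** (-1.0 / xi)  # GPD, as in MGPWP
    return alpha + (1.0 - alpha) * excess


def mixture_cdf(r, components):
    """Law of total probability: sum of p * F over all subclasses."""
    return sum(
        c["p"] * component_cdf(r, c["u"], c["alpha"], c["sigma"], c.get("xi", 0.0))
        for c in components
    )


# Two hypothetical subclasses whose probabilities sum to 1.
components = [
    {"p": 0.6, "u": 8.0, "alpha": 0.7, "sigma": 5.0},
    {"p": 0.4, "u": 15.0, "alpha": 0.7, "sigma": 9.0},
]
value = mixture_cdf(40.0, components)
```

Setting every `xi` to zero gives an exponential mixture of the MEWP type, while letting `xi` vary per subclass gives the MGPWP type.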
Model estimation
Use of the EXP, GPD, MEWP and MGPWP models requires the choice of high enough thresholds such that EVT can be applied. Selection of an adequate threshold gives rise to a bias-variance tradeoff: the higher the threshold, the better the approximation of the tail of (smaller bias), but at the same time, the higher the variance of the estimated parameters because a smaller number of exceedances are available. Graphical tools for threshold selection, such as mean residual life plots , are usually difficult to interpret in practice. Therefore, the common practice is to fix a high enough level and to set thresholds to the empirical quantile of rainfall occurring in season and WP .
Given (and therefore ), the parameters that must be estimated for the EXP and GPD models (Eq. ) are those of in Eqs. (2) and (3). Estimation is made by the method of L-moments : where and are the sample L-moments of order and for the central rainfall exceeding , which are independent; see Sect. . In the GPD case, if , then is imposed (i.e., the EXP distribution) to exclude bounded distributions. It should be noted that the choice of the L-moments method only affects the GPD case since for the EXP case, the commonly used L-moments, moments and maximum likelihood estimators coincide. For the GPD case, a separate analysis (not shown) reveals that the choice of the estimation method does not actually affect the regional evaluation very much because slight differences in estimation that occur at the local scale are smoothed out at the regional scale.
Parameters and in of Eq. () for MEWP and MGPWP are estimated likewise by the L-moments method, using the observed central rainfall of season and WP exceeding . Probability is estimated as the empirical proportion of days in season and WP . Estimation of is then obtained for all with Eq. ().
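The L-moments estimation can be illustrated as follows. This is a sketch under stated assumptions rather than the estimation code of the study: it uses the usual unbiased sample L-moments of orders 1 and 2 of the excesses over the threshold, and the standard L-moment relations for a GPD with zero location (shape equal to 2 minus the ratio of the first to the second L-moment, scale equal to the first L-moment times one minus the shape). Function names are hypothetical.

```python
def sample_lmoments(excesses):
    """Unbiased sample L-moments of order 1 and 2."""
    x = sorted(excesses)
    n = len(x)
    b0 = sum(x) / n
    # enumerate gives rank - 1, the weight used in the b1 statistic
    b1 = sum(i * v for i, v in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_exp(excesses):
    """EXP scale: the first L-moment, i.e., the mean excess."""
    l1, _ = sample_lmoments(excesses)
    return l1


def fit_gpd(excesses):
    """Return (scale, shape) of the GPD; negative shapes are reset to 0."""
    l1, l2 = sample_lmoments(excesses)
    xi = 2.0 - l1 / l2
    if xi < 0.0:
        return l1, 0.0  # fall back to the EXP distribution
    return l1 * (1.0 - xi), xi
```

For the EXP case this estimator is simply the sample mean of the excesses, which is why the moment, L-moment and maximum likelihood estimators coincide there.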
Computation of return levels
The -year return level is the level expected to be exceeded on average once every years. It satisfies the relationship , where is the mean number of central rainfall events per year. When is EXP or GPD, estimation of is obtained explicitly as where and are the parameter estimates of of Sect. . For the MEWP and MGPWP models, there is not an explicit formulation for and it is obtained numerically by solving in Eq. (). Equation shows that in GPD model, is mainly influenced by the value of . For the MGPWP model, practice shows that for reasonable to large (typically years), is mainly influenced by the largest .
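The return level computation can be sketched as follows, assuming that the defining relationship is that the non-exceedance probability of the return level equals 1 - 1/(T n), with n the mean number of central rainfall events per year. Under the assumed peaks-over-threshold forms this gives a closed form for EXP and GPD, and a one-dimensional root-finding problem for the compound models. Names and the bisection solver are choices made for the example; any bracketing root finder would do.

```python
import math


def return_level(T, n_per_year, u, alpha, sigma, xi=0.0):
    """Closed-form T-year return level for the EXP (xi == 0) or GPD tail."""
    m = T * n_per_year * (1.0 - alpha)  # expected exceedances of u in T years
    if xi == 0.0:
        return u + sigma * math.log(m)
    return u + sigma / xi * (m ** xi - 1.0)


def mixture_return_level(T, n_per_year, survival, lo, hi, tol=1e-8):
    """Solve survival(r) = 1 / (T * n_per_year) by bisection.

    `survival` is 1 - F for the compound model and must be decreasing
    on [lo, hi], with the solution inside that interval.
    """
    target = 1.0 / (T * n_per_year)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if survival(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In the GPD closed form the term `m ** xi` grows quickly with `T` when `xi` is positive, which illustrates why the shape parameter dominates the extrapolation.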
Model evaluation
The goal of this evaluation is to assess which model performs better at the regional scale, i.e., for a set of stations taken as a whole, rather than individually. We follow the split sample evaluation proposed in and . We divide the data for each station into two subsamples, C and C, and fit a given competing model on each of the subsamples, giving two estimated distributions , estimated on C, and , estimated on C. Our goal is to test the consistency between validation data and predictions of the estimates, and the accuracy and stability of the estimates when calibration data change. For this, three scores are computed, assessing respectively stability (SPAN) and reliability (AREA(FF) and AREA) of the fits. These scores were proposed and used in and .
The SPAN criterion evaluates the stability of the return level estimation, when using data for each of the two subsamples. More precisely, for a given return period and station , where , e.g., is the -year return level for the distribution (see Sect. ) estimated on subsample C of station . is the relative absolute difference in -year return levels estimated on the two subsamples. It ranges between and ; the closer to , the more stable the estimations for station . For the set of stations, we obtain a vector of SPAN of length with a distribution which should remain reasonably close to zero. A rough summary of this information is obtained by computing the mean of the values of SPAN, : For competing models, the closer the mean is to , the more stable the model is.
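As an illustration, the SPAN computation can be sketched as below. The normalization is an assumption: the absolute difference between the two return levels is divided by their mean, which is one common way of expressing a relative absolute difference. Function names are hypothetical.

```python
def span(level_c1, level_c2):
    """Relative absolute difference between two return level estimates."""
    return abs(level_c1 - level_c2) / (0.5 * (level_c1 + level_c2))


def mean_span(levels_c1, levels_c2):
    """Mean SPAN over stations; inputs hold one return level per station."""
    spans = [span(a, b) for a, b in zip(levels_c1, levels_c2)]
    return sum(spans) / len(spans)


# Hypothetical 100-year return levels (mm) for two stations.
score = mean_span([100.0, 90.0], [100.0, 110.0])
```

A station whose two subsamples give identical return levels contributes 0, and the mean over stations summarizes the stability of a model in a single number.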
Graphical tools for model evaluation based on FF scores, for three simulated series of length 200. The CDF case (upper left) is the method of . The density case (upper right and lower panels) is the alternative method comparing densities (Eq. ). The dotted horizontal lines show the 95 % confidence interval for uniform variates on of length 200, based on 1000 simulations.
[Figure omitted. See PDF]
The FF criterion is used to estimate the reliability in estimating the probability of occurrence of the maximum of independent variables. Let be a set of independent and identically distributed rainfall values with distribution and . Then and, thus, the distribution of is . Therefore follows the uniform distribution on . Now write and , where the estimation of for station is obtained respectively for subsamples C and C. If and are good estimations of , then and should approximately follow the uniform distribution, . Now let (resp. ) be the number of central (thus independent) rainfall values in subsamples C (resp. C) and (resp. ) the corresponding observed maximum, then should both be realizations of the uniform distribution. For the set of stations, this gives two uniform samples and of size each. Hypothesis testing for assessing if the uniform assumption is valid is challenging because the are not independent from site to site, due to the spatial dependence between data. Thus proposed to base comparison on the graphical analysis of cumulative distribution functions (CDFs), by inspecting how much the CDF of the ff diverge from the line, corresponding to the CDF of uniform variates on . A quantitative assessment of this divergence is provided by computing the area between both CDFs. However, we find such evaluation confusing because the value of the area depends on where, between 0 and 1, the divergence is located. An illustration of this is given in Fig. for three simulated series of length 200 (which is about the number of stations). In case 0, the ff are all drawn from (reference case). In cases 1 and 2, 80 % of the ff are drawn from and 20 % are drawn from in case 1 and from in case 2. Departure of ff from the uniform case is sometimes not easy to interpret. However case 1 corresponds usually to a tendency towards an overestimation of the largest observation, while case 2 corresponds to a tendency towards overfitting the largest observation. 
In the CDF plot (upper left), the area value is, as expected, the lowest for case 0. However, case 2 surprisingly also gives a very good score, whereas that of case 1 is 3 times as large. Therefore these criteria would falsely indicate a better performance (i.e., smaller area value) of case 2 (overfitting) as compared to case 1 (overestimation), although they both contain 20 % of data diverging from the uniform on (0, 1). As an alternative, we prefer to base evaluation on divergence between densities rather than CDFs. A reasonable estimate of the density of the ff is obtained by computing their empirical histogram with 10 equal bins between 0 and 1, which is then compared with the uniform density between 0 and 1 (which equals 1). For a more quantitative assessment, we compute the area between both densities as follows: $$\mathrm{AREA}(ff) = \frac{1}{18}\sum_{i=1}^{10}\left|\,10\,\frac{\mathrm{Card}\left\{ff \in \left[\frac{i-1}{10}, \frac{i}{10}\right)\right\}}{N} - 1\,\right|,$$ where $\mathrm{Card}$ denotes the cardinality of the set and $N$ is the number of ff values (one per station). The term inside the absolute value in Eq. () is the difference between densities in the $i$th bin. The division by 18 forces the score to lie in the range [0, 1], with lower values indicating better fits (the worst case being all values lying in the same bin). Illustration of this computation is shown in Fig. on the aforementioned simulated data (upper right and lower panels). The score for case 0 is again the lowest; however, the value is larger than when comparing CDFs due to the discretization into bins. As expected, the criteria now give similar scores for cases 1 and 2, unlike the method based on CDFs. This leads us to base comparison on the new AREA score (Eq. ), giving preference to lower scores but keeping in mind that a score of 0.1 is already a good score since this is the mean AREA value we obtain when simulating uniforms on (0, 1). Returning to ff values of cross-validation, and , this gives us two scores of model evaluation, namely AREA and AREA.
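The density-based score can be sketched as follows. The normalization constant is inferred from the description above: with 10 bins, the sum of absolute density differences is 18 when all values fall in a single bin, so dividing by 18 maps the score to the range 0 to 1. Function names are hypothetical, and `ff_value` assumes the FF definition given earlier (the estimated distribution evaluated at the observed maximum, raised to the number of values).

```python
def ff_value(cdf, observed_max, n_values):
    """FF statistic: estimated CDF at the observed maximum, to the power n."""
    return cdf(observed_max) ** n_values


def area_score(values, n_bins=10):
    """Normalized area between the histogram density and the uniform density."""
    n = len(values)
    counts = [0] * n_bins
    for v in values:
        k = min(int(v * n_bins), n_bins - 1)  # v == 1.0 goes to the last bin
        counts[k] += 1
    total = sum(abs(n_bins * c / n - 1.0) for c in counts)
    return total / (2.0 * (n_bins - 1))  # equals total / 18 for 10 bins


# One value per bin gives a perfectly flat histogram.
flat = [0.05 + 0.1 * i for i in range(10)]
worst = [0.5] * 10
```

Unlike the area between CDFs, this score does not depend on where in the unit interval the departure from uniformity occurs, only on how much mass is misplaced.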
The criterion assesses reliability of the fit, as FF, but focuses on prescribed quantiles rather than on the overall maximum. Let be a set of independent and identically distributed rainfall values with distribution , and let be the random variable equal to the number of exceedances of the -year return level, i.e., , where is the mean number of observations per year. Since every event occurs with probability , follows a binomial distribution with parameters . Let be the corresponding cumulative distribution function, i.e., such that , and . Because is not continuous, the probability-transformed indices are not uniform. Thus, propose to consider the random variable such that and show that is uniform on . Now, consider the estimates and for a given station and where is the mean number of central rainfall events per year at station . If and are exact estimates for , then (resp. ) should be realizations of a binomial with parameters (resp. ) and . Let and be the corresponding binomial cumulative distribution functions and let , , be uniform simulations between and . Then are realizations of the uniform distribution . For ranging over the set of stations, we thus obtain two vectors of size of uniform samples, so that we can write , . Scores are calculated as for FF by comparing the empirical densities of , to the theoretical uniform density, giving the two scores AREA.
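The randomization step can be illustrated with a short sketch. It assumes the usual randomized probability transform for a discrete variable: a uniform draw interpolates between the binomial CDF at the observed count minus one and at the observed count. Whether this matches the exact construction of the cited authors is an assumption, and all names are hypothetical.

```python
def binom_cdf(k, n, p):
    """P(N <= k) for N ~ Binomial(n, p), via a stable pmf recursion (p < 1)."""
    if k < 0:
        return 0.0
    pmf = (1.0 - p) ** n  # P(N = 0)
    cdf = pmf
    for j in range(min(k, n)):
        pmf *= (n - j) / (j + 1) * p / (1.0 - p)  # P(N = j + 1)
        cdf += pmf
    return min(cdf, 1.0)


def randomized_pit(n_exceed, n, p, u):
    """Map an exceedance count to (0, 1) using a uniform draw u in [0, 1]."""
    lower = binom_cdf(n_exceed - 1, n, p)
    upper = binom_cdf(n_exceed, n, p)
    return lower + u * (upper - lower)
```

In practice `u` would be drawn once per station and subsample (for example with `random.random()`), `n` would be the number of central rainfall values in the validation subsample, and `p` the per-value exceedance probability of the estimated return level.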
Application of MEWP and MGPWP in Norway
Models considered
We wish to evaluate and compare the performance of EXP, GPD, MEWP and MGPWP for estimating central rainfall values across Norway. To apply the split sample procedure described in Sect. for each station , we randomly divide years into two subsamples such that 50 % of the observed years are in sample C and the remaining 50 % are in sample C. This split sample procedure is applied to each station independently (meaning that years of C and are very unlikely to all be equal for ). This creates two new data sets, each comprising 192 stations with a maximum of 31 years of observations.
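The random division of years can be sketched as below. This is a minimal illustration with a hypothetical function name and an optional seed added for reproducibility; the study does not specify how the random draw was implemented.

```python
import random


def split_years(years, seed=None):
    """Randomly split a list of years into two halves of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(years)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# The 62 years of the 1948-2009 period give two subsamples of 31 years.
c1, c2 = split_years(range(1948, 2010), seed=1)
```

Calling the function once per station, each time with a different seed or none, gives station-specific splits, so the two subsamples rarely contain the same years at two different stations.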
As is always the case for extreme value analysis, threshold choice is uncertain. We therefore considered a large set of thresholds with between and . The evaluation scores are then used to select both the best model and the best threshold(s). Choice of as low as may at first glance appear to be very low for studying extremes, but one has to remember that the data series are already preprocessed to include only central rainfall values. Days with central rainfall will tend to have higher intensities than a randomly selected day with rainfall, as by construction, the central rainfall series excludes the previous and following days with lower rainfall intensities (see Sect. ). A threshold level of corresponds actually to a level of about for the daily (non-zero) rainfall values.
The estimation scheme can be summarized as follows. For each of the considered values, we fit six models with the exponential distribution:
EXP, which is a particular case of MEWP, where is one season and is one weather pattern;
MEWP, i.e., a combination of WP distributions, where or (see below);
MEWP, i.e., a combination of two seasonal distributions. Choice of the seasons is explained below;
MEWP, i.e., a combination of seasonal and WP distributions, with or .
For the cases involving the use of WP, we employ the weather-type (WT) classification described in , following the “bottom-up” method presented in . Details of this scheme are also reported in and can be briefly summarized as follows. An ascending hierarchical classification is first performed on the rain fields for days with rain, as described by 175 stations in Norway and the surrounding region. The average synoptic pattern (WT) associated with each rain-field class is then identified from an atmospheric pressure data set constructed from geopotential height data centered over Norway. Finally, every day of the period considered (1948–2009) is assigned to a WT using the proximity of its geopotential height data to that of each WT. In the first instance, eight distinct WTs were defined, seven corresponding to days with rain and one representing dry days. For the first application of SCHADEX in Norway, the eight weather types were grouped into four weather patterns (WP) to improve the robustness of the MEWP models (Fig. ) by increasing the number of values in the subsamples. In this paper, however, we use the term “weather patterns” (WP) to refer to both classifications, i.e., with four or eight classes, and both the full set of eight classes and the grouped set of four classes are evaluated.
Weather pattern classification with four classes (denoted WT1 to WT4 above) and eight classes (WP1 to WP8 above). This is Fig. 5 of . Case with four classes is obtained by combining the eight classes into four. The last class of each classification (respectively WT4 and WP8) represent dry days.
[Figure omitted. See PDF]
In cases where subsampling is also undertaken by season, we impose a restriction of being two seasons, representing the season-at-risk and the season-not-at-risk. Furthermore, we require the season-at-risk to be composed of 2 to 4 consecutive months (the remaining months falling in the season-not-at-risk). The optimum choice of the months composing the season-at-risk is made following the procedure of , which is applied to each station and model separately, using the whole series (i.e., without splitting into C or C). The principle is to find the season-at-risk for which the estimated model best fits the months with the highest risk (of extreme rainfall intensities). In detail, the procedure is as follows:
Step 1: compute the 12 mean monthly maxima of central rainfall.
Step 2: set the length of the season-at-risk to 2 months.
Step 3: compute the mean of these values over moving windows of this length.
Step 4: select the consecutive months corresponding to the highest of these values. These months define the season-at-risk. The remaining months define the season-not-at-risk.
Step 5: fit the considered model (e.g., MEWP) with this seasonal definition.
Step 6: compare the monthly fits to the monthly empirical distributions. This comparison is made with the KGE score (Kling–Gupta efficiency, ), which is computed for a given month, , as
where and are respectively the empirical and fitted distributions for month . It should be noted that the KGE criterion is not the only score which could be used here, and was not necessarily developed for scoring distributions. However, the final result (i.e., the seasonal split selected) is not particularly sensitive to the score used.
Step 7: compute a global KGE score as a weighted mean of these 12 KGE scores, with weights proportional to the mean monthly maxima, in order to force the model to have the best fits for the months with the highest risk.
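Step 8: set the length of the season-at-risk to 3 months and apply steps 3 to 7.
Step 9: set the length of the season-at-risk to 4 months and apply steps 3 to 7.
Step 10: compare the three global KGE scores obtained respectively for season lengths of 2, 3 and 4 months. Select the seasonal definition corresponding to the lowest of these scores.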
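Steps 1 to 4 of the procedure above can be sketched as follows for a given season length. The sketch covers only the moving-window selection, not the model fit or the KGE comparison. Treating the year as circular (so that a season may run from December into January) is an assumption, as are the function name and the example values.

```python
def season_at_risk(monthly_mean_maxima, length):
    """Return (first_month_index, window_mean) of the highest moving window.

    `monthly_mean_maxima` holds the 12 mean monthly maxima of central
    rainfall (index 0 = January); windows wrap around the end of the year.
    """
    best_start, best_mean = 0, float("-inf")
    for start in range(12):
        window = [monthly_mean_maxima[(start + j) % 12] for j in range(length)]
        mean = sum(window) / length
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


# Hypothetical mean monthly maxima (mm), peaking in late summer.
maxima = [10, 9, 8, 8, 9, 11, 14, 18, 20, 17, 13, 11]
start, mean = season_at_risk(maxima, 2)
```

Running the function for lengths 2, 3 and 4 gives the three candidate seasonal definitions that are then compared in the final step.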
Length of the season-at-risk (shapes) and first month of the season (color code in the inset) for each station, with model MEWP. The local definition of seasons is used in Sect. , while the regional definition, with four regions, is used in Sect. .
[Figure omitted. See PDF]
Results
Model evaluation and selection
The SPAN, FF and scores presented in Sect. are used to assess the quality of the estimations. We use the three scores because they give complementary answers. Taken together, they allow a global evaluation of both the reliability and the stability of the fits. Different return periods are also considered for SPAN and in order to evaluate different parts of the tail of the distribution. With large we assess the very tail of the distribution while with small we assess the bulk of the distribution.
Scores are reported in Figs. and for the 12 models, using threshold values equal to the -, - and -quantile of the central rainfalls. Keep in mind that all scores lie in the range and the closer to , the better the score. For each model and threshold, we depict three MEAN(SPAN scores for and years, the value of AREA() and the three AREA() values for and years. Values of AREA() and AREA() are not shown as they are very similar. For the SPAN scores, it may seem highly questionable to extrapolate return levels up to years given that estimation is based on about 30 years of data. This is actually the level required by engineering practices and regulatory rules (if not higher) in many countries for risk assessment associated with dam safety. For example, in France - or even 10 000-year return periods are used to design dam spillways , and the -year return period is also used as the design flood level for the higher risk classes of dams in Norway, whilst the probable maximum flood is used to assess the safety of these dams with respect to the potential for dam failure .
Scores of evaluation for MEWP models, for , and . Better scores have values closer to . Scores of SPAN, for and -year return periods, are the mean scores of Eq. (), while scores of FF and , and years, are based on the density areas (Eq. ).
[Figure omitted. See PDF]
Same as Fig. for MGPWP models.
[Figure omitted. See PDF]
Comparison of the 100-year and 1000-year return levels (in mm) estimated on C and C, for the four MEWP models (in red) and the four MGPWP models (in black), with a level (one point per station).
[Figure omitted. See PDF]
Left: estimated s on C and C for the four MGPWP models, with (one point per station). MEWP models correspond to (red points). Right: same s as a function of the sample size with WP (black points) and without WP (white points) (one point per station and period).
[Figure omitted. See PDF]
Figure shows that for the exponential models, there is a clear benefit obtained from the use of seasonal splitting (case EXP vs. ) and WP splitting (case EXP vs. and ), and the combination of both seasonal and WP splitting performs even better (see cases and ). Indeed, subsampling by season and WP creates groups of rainfall values that are more likely to be identically distributed and therefore more easily fitted than groups of rainfall values derived from different parent populations. Using eight rather than four WPs also slightly improves the scores, but the improvement is marginal when compared with the gain derived from sampling by season and WP.
Figure surprisingly shows that for MEWP distributions, scores of improve when increases, meaning that the bulk of the distribution is actually less well fitted than the tail. This may be due to the lack of flexibility of the exponential distribution. Using the more flexible GPD distribution (in the GPD and MGPWP models of Fig. ) indeed tends to improve and . However, it clearly also degrades the FF scores. Keep in mind that FF is based on the maximum observed value (see Sect. ) and, thus, permits an assessment of the quality of the fit of the very tail of the distribution. Therefore, although the bulk of the distribution tends to be better fitted with MGPWP distributions ( and ), the very tail (FF) is overfitted, usually giving poorer FF scores.
Figure also shows a clear loss in stability (indicated by the SPAN scores) when using the MGPWP distribution. Figure illustrates this issue by comparing the 100-year and 1000-year return levels estimated on C and C with the four MEWP models and the four MGPWP models, with a level . This shows a difference of up to 100 mm day⁻¹ with MGPWP models for the 100-year return level and up to 300 mm day⁻¹ for the 1000-year return level, whereas the MEWP models are much more stable. This lack of robustness is due to the difficulty in estimating the shape parameter of the GPD distribution, which has a large influence on the extrapolation to long return periods (see also page 528 of or the upper right of page 350 of ). Figure , on the left hand side, compares the values of estimated on C and C by all MGPWP models. Values between and are mainly found, but differences between the two estimates vary in a similar range. Positive values, even when not very large (typically ), lead to unrealistic return levels at extrapolation, e.g., up to 600 mm day⁻¹ for the -year return level in the MGPWP case versus 270 mm day⁻¹ in the MEWP case (see Fig. ). Figure , right, shows that estimates of based on fewer than 1000 observations are highly variable. Similar variability in the shape of the GPD is found in for a worldwide data set. Cases with fewer than 1000 observations occur more often when WP are considered, due to the additional subsampling which produces smaller data sets. However, the SPAN values of Fig. show that even for the GPD and MGPWP with , robustness is very poor. This lack of robustness is an important limitation of their value and suitability for practical applications.
Regarding the choice of threshold, MEWP distributions give relatively stable scores for between 0.5 and 0.7 (see Fig. ) but there is a loss in stability as increases over (see green curves of SPAN scores in Fig. ). For MEWP, which gives the best scores overall, the case usually seems to be slightly better. Therefore we select the model MEWP for further consideration.
It is interesting at this point to compare large return levels obtained with the selected MEWP with those obtained for the other MEWP models with the same . Figure makes this comparison for the 100-year return levels. It appears that the other MEWP models tend to give lower return levels (i.e., positive values of the difference). This underestimation is more marked for the EXP model (mean underestimation of about 5 mm of the 100-year return level), and decreases when seasons (MEWP) and WP (MEWP) are used. Therefore, the use of more WPs helps to better model the heaviness of the tail.
Box plot of the difference (in mm) between the 100-year return levels of MEWP and the three other EXP-based models, for (one point per station and period).
[Figure omitted. See PDF]
Evaluation scores for the local and regional definitions of the seasons. Better scores have values closer to . Scores of SPAN, for and years, are the mean scores of (Eq. ), while scores of FF and , and years, are based on the density areas (Eq. ).
Season definition | SPAN | SPAN | SPAN | FF | | | |
---|---|---|---|---|---|---|---|
Local seasons | 0.058 | 0.070 | 0.085 | 0.076 | 0.209 | 0.163 | 0.130 |
Regional seasons | 0.053 | 0.062 | 0.074 | 0.080 | 0.202 | 0.185 | 0.158 |
Divergence in density between and the uniform case, under random sampling (left) and temporal sampling (right), with corresponding scores AREA. The closer the bars are to 0, the better the fit. The dotted horizontal lines show the 95 % confidence interval for uniform variates.
[Figure omitted. See PDF]
Use of regional seasons
We have already mentioned in Sect. that the local definition of the seasons displays a regional pattern, with a season-at-risk in late summer in the two eastern regions and in fall in the two western ones, as illustrated in Fig. . We test here the use of this regional definition of the seasons by fitting new MEWP models and comparing the overall scores to those of the local definition of Sect. . As shown in Table , scores of the two definitions are fairly similar, particularly in light of the differences obtained between the models of Fig. . Robustness (SPAN) is slightly improved with the regional definition. However, the scores of both FF and are slightly better (i.e., smaller) when seasons are defined locally, which indicates a better fit of the far tail with the local definition, and therefore probably a better extrapolation of return levels. Therefore, if only one definition had to be selected, we would recommend the local one. However, if using MEWP at ungauged sites is of interest, the regional definition of the seasons of Fig. provides a reasonable alternative.
Evidence of trend
The split sample procedure can be used to give insight into potential change in extreme rainfall in Norway over the period represented by the rainfall time series. For this we split the observed years of each station into two subsamples: C contains all years between 1948 and 1978 and C contains all the remaining years, between 1979 and 2009. So, in contrast with the previous analysis, all stations are assigned the same C and C, and these are temporal instead of random. Recall that assesses how well the maximum of C is fitted by the distribution estimated on C, namely (see Eq. ). Therefore, comparing the density of the values of , for , under this temporal sampling with that under the random sampling of Sect. can give insight into increases or decreases in extreme rainfall in Norway between the two periods. The density of these values is shown in Fig. .
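The logic of this comparison can be sketched on synthetic data. The sketch assumes that the score is the probability, under the distribution fitted on one subsample, that the maximum of the other subsample is not exceeded. The exact definition is given by the equation referenced above and is not reproduced here. The function names, sample sizes and exponential scales are hypothetical.

```python
import math
import random

def ff_value(calib, valid):
    """Fitted probability that the maximum of len(valid) draws stays below max(valid).

    An exponential distribution is fitted to calib by maximum likelihood
    (scale = sample mean). This is an assumed, simplified form of the score.
    """
    scale = sum(calib) / len(calib)
    n = len(valid)
    return (1.0 - math.exp(-max(valid) / scale)) ** n

def simulate(scale_calib, scale_valid, n_stations=1000, n_events=300, seed=0):
    """Score values for synthetic stations with exponential excesses."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_stations):
        calib = [rng.expovariate(1.0 / scale_calib) for _ in range(n_events)]
        valid = [rng.expovariate(1.0 / scale_valid) for _ in range(n_events)]
        values.append(ff_value(calib, valid))
    return values

def fraction_below(values, level=0.2):
    return sum(v < level for v in values) / len(values)

# Same distribution in both subsamples (analogous to random splitting).
frac_same = fraction_below(simulate(10.0, 10.0))
# Calibration subsample with a 10 % larger scale than the validation subsample.
frac_shift = fraction_below(simulate(11.0, 10.0))
print(frac_same, frac_shift)
```

When both subsamples share the same distribution, the values are close to uniform, so roughly one fifth fall below 0.2. When the validation subsample has smaller extremes than the calibration subsample, small values become over-represented, which is the kind of departure from uniformity examined here.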
Box plot of the difference in 100-year return level estimated for C and C with MEWP under random sampling (left) and temporal sampling (right) (one point per station).
[Figure omitted. See PDF]
Left: map of the 100-year return level estimated on C (1979–2009) with MEWP. Right: difference in the 100-year return level estimated on C and C.
[Figure omitted. See PDF]
We see that tends to have too many small values with respect to the uniform density under the temporal sampling, whereas it was fairly uniform under the random sampling of Sect. (a complementary analysis, not shown, revealed that very similar densities are obtained with other random splitting approaches). We conclude that tends to overestimate the probability of occurrence of the maximum of C under the temporal sampling. Broadly speaking, this means that the maximum of C tends to be too small with respect to that of C. This indicates that extremes during the second half of the observed period (1979–2009) tend to be higher than those of the first half (1948–1978). This is confirmed by a comparison of return levels obtained on both periods, as shown in Fig. . For the random sampling case, return levels are almost equal for C and C, whereas in the temporal sampling case the 100-year return level is about 5 mm higher in C, with 10 % of the stations showing an increase higher than 10 mm (vs. 3 % in the random case). As shown in Fig. , these 10 % of stations lie mainly in the southwestern region, between Bergen and Stavanger, which is one of the rainiest areas in Norway, with 100-year return levels higher than 100 mm (Fig. , left). This brief analysis gives evidence for an increase in extreme rainfall intensities, which may already be evident in observations for the southwestern region of Norway. This evaluation does not replace a full, detailed trend analysis, but should rather be taken as motivation for one. Our evaluation relies in particular on a somewhat arbitrary splitting of the years in the middle of the observation period. Assessment of possible trends, including when such trends started and their consistency over time, is beyond the scope of this paper, but may be of interest in future studies.
Conclusions
This article evaluates a compound model based on weather pattern classification, seasonal splitting and exponential distributions, the so-called MEWP model, for its suitability for use in Norway. The MEWP model is the rainfall probabilistic model used within the SCHADEX method, which is currently being tested in Norway as an alternative simulation method for flood estimation. We show in particular the benefit gained by subsampling the heavy rainfall data according to season and weather pattern. Our results also indicate that models based on the exponential distribution perform better than those based on the more flexible generalized Pareto distribution, which tends to overfit the data and lacks robustness. We have also demonstrated that a regional definition of seasons in MEWP is possible. Finally, we give evidence for an increase in extreme rainfall intensities in Norway in recent years, particularly in the southwestern region.
Our analysis has also shown that the GPD better models the bulk of the distribution of extremes, but fails to robustly estimate the tail, and therefore fails in extrapolation to large return levels. The reason for this failure is twofold: firstly, the lack of data for estimating such a flexible distribution when using a local approach; secondly, the inherent nature of the GPD, which is a heavy-tailed distribution when the shape parameter is positive, and can therefore give unrealistic return levels for very long return periods. To address this issue, a regional approach allowing the use of neighboring stations to infer MEWP distributions at local sites is of interest. Finally, there are also other, more flexible, distributions which may be more robust than the GPD and could be used within the MEWP approach. This also represents an important topic for future work.
This study is the first extensive evaluation of MEWP in Norway. The model has also been applied successfully in France , Austria and western Canada . MEWP is a general model imposing no specific hypotheses on the data, so its application in other regions of the world is worth considering. The only limitation is that a classification into weather patterns suitable for evaluating extreme precipitation is needed as a precursor to such an analysis, but this is already available in several regions around the world (see ).
Acknowledgements
Deborah Lawrence acknowledges support from the project “FLOMQ – Robust framework for estimating extreme floods in Norway” supported by the ENERGIX program of the Norwegian Research Council, as well as internal research funds from NVE which have made the collaboration with LTHE possible. The authors acknowledge support from the COST Action ES0901 FloodFreq which has made the collaboration between NVE and EDF possible, including a Short Term Scientific Mission for Anne Fleig to EDF.
Edited by: P. Tarolli
Reviewed by: two anonymous referees
© 2015. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Simulation methods for design flood analyses require estimates of extreme precipitation for simulating maximum discharges. This article evaluates the multi-exponential weather pattern (MEWP) model, a compound model based on weather pattern classification, seasonal splitting and exponential distributions, for its suitability for use in Norway. The MEWP model is the probabilistic rainfall model used in the SCHADEX method for extreme flood estimation. Regional scores of evaluation are used in a split sample framework to compare the MEWP distribution with more general heavy-tailed distributions, in this case the Multi Generalized Pareto Weather Pattern (MGPWP) distribution. The analysis shows the clear benefit obtained from seasonal and weather pattern-based subsampling for extreme value estimation. The MEWP distribution is found to have an overall better performance as compared with the MGPWP, which tends to overfit the data and lacks robustness. Finally, we take advantage of the split sample framework to present evidence for an increase in extreme rainfall in the southwestern part of Norway during the period 1979–2009, relative to 1948–1978.
Details
1 Univ. Grenoble Alpes, LTHE, 38000 Grenoble, France; CNRS, LTHE, 38000 Grenoble, France
2 Norwegian Water Resources and Energy Directorate (NVE), P.O. Box 5091, Majorstua, Oslo, Norway
3 EDF – DTG, 21 Avenue de l'Europe, BP 41, 38040 Grenoble CEDEX 9, France