1 Introduction
Wind speed forecasts have many potential users that could benefit from skilful forecasts on different time scales, ranging from hourly to monthly forecasts. For example, short- and medium-range forecasts of extreme wind speeds are often utilised in early warnings for severe weather.
Forecasts for the subseasonal time frame have improved greatly in recent years.
This study concentrates on subseasonal wind forecasts in winter, as forecasts for winter in northern Europe are known to be more skilful than forecasts for other seasons.
It is well known that, especially with longer lead times, the ensemble forecasts often have a systematic bias and the spread of ensemble members can be too small.
This work was a part of the Climate services supporting public activities and safety (CLIPS, 2016–2018) project and we use the data collected during the project.
2 Data and Methods
2.1 Forecasts and reference observations
The forecasts used in this study were extended-range forecasts of 10 m wind speed, provided by the ensemble prediction system (EPS) from the ECMWF (European Centre for Medium-Range Weather Forecasts). The forecasts were issued twice a week, on Mondays and Thursdays. The horizontal resolution of the reforecasts was 0.4° and the temporal resolution 6 h. Reforecasts have been made for the same dates as operational forecasts for the past 20 years. While forecasts have an ensemble of 51 members (the control run and the 50 perturbed members), reforecasts have only 11 members (the control run and the 10 perturbed members).
Following earlier work, we concentrated on winter forecasts and the lead times up to the start of the third week. The CLIPS project covered two winters (2016–2017 and 2017–2018), but two winters of operational forecasts would have been a rather small data set to make meaningful inferences about how operational wind forecasts perform, so we decided to concentrate on the reforecasts. Thus, the reforecasts of winters 2016–2017 and 2017–2018 were analysed; the winter months included November, December and January. Starting from the beginning of each reforecast, we computed a weekly forecast every two days and tried to determine how long the forecasts remain skilful (as defined in Sect. 2.3). Weekly forecasts are the mean of seven days of forecasts (7 × 4 = 28 time steps). In all, there were about 1000 reforecasts for each lead time, that is, 20 reforecasts for each of about 50 operational forecasts (two years × three months × four weeks in a month × twice a week). The years 1987–2017 from the ERA-Interim reanalysis were used as observations and as climatological reference forecasts. These climatological forecasts are based on the distribution we get from the weekly values of the different years for the same date. These can then be used either as an ensemble or, after taking the mean, as a deterministic forecast.
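The weekly averaging described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `weekly_means` is hypothetical, the input is a plain list of 6-hourly values for one run, and the lead time is taken as the start day of each 7-day window, which the text does not specify.

```python
from statistics import fmean

STEPS_PER_DAY = 4                  # 6-hourly data
WEEK_STEPS = 7 * STEPS_PER_DAY     # 28 time steps in one weekly mean
STRIDE_STEPS = 2 * STEPS_PER_DAY   # a new weekly window every two days


def weekly_means(series):
    """Return (lead_time_in_days, weekly_mean) pairs for one forecast run.

    The lead time is the start day of each 7-day window, which is an
    assumption of this sketch.
    """
    out = []
    for start in range(0, len(series) - WEEK_STEPS + 1, STRIDE_STEPS):
        window = series[start:start + WEEK_STEPS]
        out.append((start // STEPS_PER_DAY, fmean(window)))
    return out


# Synthetic 32-day run: wind speed rising linearly from 3.0 m/s.
run = [3.0 + 0.01 * k for k in range(32 * STEPS_PER_DAY)]
windows = weekly_means(run)
```

Each element of `windows` pairs a window start day (0, 2, 4, ...) with the mean of the 28 values in that window.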
The same data cannot be used to both fit and evaluate the performance of the NGR, so we split the reforecasts and ERA-Interim data into two data sets: the training data set of winters starting on odd years and the validation data set of winters starting on even years. The training data set was used to fit the NGR, while the validation data set was used to evaluate the adjusted reforecasts. As the ERA-Interim data set included 31 years, the reference forecasts from ERA-Interim were based on 30 years, omitting the year under study.
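A minimal sketch of the split and of the leave-one-year-out climatology is given below. The helper names and the synthetic observation values are hypothetical; only the odd/even rule and the omission of the year under study come from the text.

```python
from statistics import fmean


def split_by_winter(start_years):
    """Split winters into training (odd start year) and validation (even)."""
    train = [y for y in start_years if y % 2 == 1]
    valid = [y for y in start_years if y % 2 == 0]
    return train, valid


def leave_one_out_climatology(weekly_obs_by_year, target_year):
    """Climatological ensemble for one date: all years except the target."""
    return [v for y, v in sorted(weekly_obs_by_year.items()) if y != target_year]


train_years, valid_years = split_by_winter(range(1996, 2018))

# Hypothetical weekly-mean wind speeds (m/s) for one calendar date, 1987-2017.
obs = {year: 3.0 + 0.02 * (year - 1987) for year in range(1987, 2018)}
clim_ensemble = leave_one_out_climatology(obs, target_year=2000)
clim_mean = fmean(clim_ensemble)  # deterministic reference forecast
```

With 31 years available, `clim_ensemble` has 30 members, and its mean serves as the deterministic climatological forecast.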
We evaluated both the deterministic forecasts and probabilistic forecasts. The forecasts were the weekly means of the wind speed, spatially aggregated for the area shown in Fig. 1. The area under study was chosen to be rather homogeneous while inside Finnish borders, as the coastal and more mountainous northern areas were mostly not included. Note that it would have been possible to hunt for the longest possible skilful lead time by strategically changing the area shown in Fig. 1, but we did not pursue this further. The exact area to be forecasted depends on the end-user requirements and is for future studies to determine.
Figure 1
The area used in the spatially averaged forecasts in Finland.
[Figure omitted. See PDF]
The effect of seasonality on the forecast skill was removed by subtracting the first three harmonics of the annual cycle.
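One way to estimate and subtract the first three harmonics is sketched below. The text does not give the details, so several choices here are assumptions: daily values covering whole 365-day years (which makes the sine and cosine terms orthogonal, so simple projection suffices), leap days ignored, and the overall mean retained in the deseasonalised series. All function names are hypothetical.

```python
import math

PERIOD = 365  # days per year; leap days are ignored in this sketch


def harmonic_fit(values, n_harmonics=3):
    """Cosine and sine amplitudes of the first annual harmonics, by projection.

    Assumes one value per day and a whole number of 365-day years, so the
    harmonics are orthogonal and no matrix solve is needed.
    """
    n = len(values)
    if n % PERIOD != 0:
        raise ValueError("series must cover whole years")
    coeffs = []
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k / PERIOD
        a = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(values))
        b = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(values))
        coeffs.append((a, b))
    return coeffs


def harmonic_part(t, coeffs):
    """Evaluate the fitted harmonics (without the mean) on day index t."""
    total = 0.0
    for k, (a, b) in enumerate(coeffs, start=1):
        w = 2 * math.pi * k / PERIOD
        total += a * math.cos(w * t) + b * math.sin(w * t)
    return total


# Synthetic two-year daily series: 3.4 m/s plus an annual and a semi-annual wave.
days = range(2 * PERIOD)
series = [3.4 + 0.8 * math.cos(2 * math.pi * t / PERIOD)
          + 0.3 * math.sin(4 * math.pi * t / PERIOD) for t in days]
coeffs = harmonic_fit(series)
deseasonalised = [v - harmonic_part(t, coeffs) for t, v in zip(days, series)]
```

For series that do not cover whole years, an ordinary least-squares fit of the same sine and cosine terms would be needed instead of projection.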
2.2 Non-homogeneous Gaussian regression
In this study, NGR was used to correct the mean weekly forecast, not the forecasts at each time step as in some earlier applications. This simplifies the modelling somewhat; according to the central limit theorem, the weekly mean can be expected to be closer to Gaussian than the wind speed at individual time steps.
NGR provides the Gaussian probability distribution
$$ y \sim \mathcal{N}(\mu, \sigma^2), \tag{1} $$
where $\mu$ is the mean and $\sigma$ is the standard deviation. In contrast to the regular Gaussian regression, $\sigma$ is not a constant. The mean is
$$ \mu = a + b\,\bar{x}, \tag{2} $$
where $\bar{x}$ is the ensemble mean, and the standard deviation is
$$ \log \sigma = c + d\,s, \tag{3} $$
where $s$ is the ensemble standard deviation. The constants $a$, $b$, $c$ and $d$ are then fitted (or trained) with the data. The logarithm in Eq. (3) is used to keep the estimated $\sigma$ positive and is not strictly necessary, but note that problems in the numerical optimisation can occur if it is not used.
In this study, deterministic forecasts were the mean parameter of the NGR. The probabilistic forecasts utilise both the mean and the standard deviation parameters, so both the bias and spread of forecasts are adjusted.
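The regression can be illustrated with a reduced model in which the spread depends only on a constant: the mean is an intercept `a` plus a slope `b` times the ensemble mean, and the logarithm of the standard deviation is a constant `c`. The sketch below assumes maximum-likelihood fitting, which the text does not state; under that assumption the reduced model has a closed-form solution (ordinary least squares for `a` and `b`, root-mean-square residual for the spread). The function names and training values are hypothetical, and the full model with an ensemble-spread term would need numerical optimisation instead.

```python
import math
from statistics import fmean


def fit_ngr_constant_spread(ens_means, observations):
    """Fit mu = a + b * ens_mean and log(sigma) = c by maximum likelihood.

    With a constant spread, the Gaussian likelihood is maximised by ordinary
    least squares for (a, b) and by the root-mean-square residual for sigma.
    """
    x_bar = fmean(ens_means)
    y_bar = fmean(observations)
    sxx = sum((x - x_bar) ** 2 for x in ens_means)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ens_means, observations))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(ens_means, observations)]
    sigma = math.sqrt(fmean([r * r for r in residuals]))
    return a, b, math.log(sigma)


def predict_ngr(a, b, c, ens_mean):
    """Return the predictive mean and standard deviation."""
    return a + b * ens_mean, math.exp(c)


# Hypothetical training pairs (ensemble mean, observed weekly mean), in m/s.
x_train = [2.5, 3.0, 3.5, 4.0, 4.5]
y_train = [3.0, 3.1, 3.5, 3.6, 4.0]
a, b, c = fit_ngr_constant_spread(x_train, y_train)
mu, sigma = predict_ngr(a, b, c, ens_mean=4.0)
```

Here `mu` is the deterministic forecast, and the pair `(mu, sigma)` defines the probabilistic forecast.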
2.3 Verification methods
The verification terminology follows the standard forecast verification literature. For the deterministic forecasts, the commonly used measure is the mean squared error (MSE)
$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(f_i - o_i\right)^2, \tag{4} $$
where $f_i$ is the forecast (here the mean parameter of the NGR) and $o_i$ is the observation (here the weekly mean of ERA-Interim). The MSE can be shown as the function of the mean error (ME, $\bar{f} - \bar{o}$), the standard deviation of observations and forecasts ($s_o$ and $s_f$), and their Pearson correlation ($r$)
$$ \mathrm{MSE} = \left(\bar{f} - \bar{o}\right)^2 + s_f^2 + s_o^2 - 2\,s_f\,s_o\,r. \tag{5} $$
For the perfect forecast, the MSE should be zero, implying that the ME should be zero, $s_f$ and $s_o$ should be equal, and $r$ should be one.
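The decomposition of the MSE into the mean error, the two standard deviations and the correlation can be checked numerically. The sketch below uses hypothetical forecast and observation values and population (1/n) standard deviations, for which the identity holds exactly.

```python
from statistics import fmean, pstdev


def mse(forecasts, observations):
    """Mean squared error of paired forecasts and observations."""
    return fmean([(f - o) ** 2 for f, o in zip(forecasts, observations)])


def mse_decomposition(forecasts, observations):
    """Return ME, s_f, s_o, r and the MSE rebuilt from those four terms."""
    f_bar, o_bar = fmean(forecasts), fmean(observations)
    me = f_bar - o_bar
    s_f, s_o = pstdev(forecasts), pstdev(observations)  # 1/n normalisation
    cov = fmean([(f - f_bar) * (o - o_bar)
                 for f, o in zip(forecasts, observations)])
    r = cov / (s_f * s_o)
    rebuilt = me ** 2 + s_f ** 2 + s_o ** 2 - 2 * s_f * s_o * r
    return me, s_f, s_o, r, rebuilt


# Hypothetical weekly-mean forecasts and observations (m/s).
fc = [3.1, 3.6, 2.9, 4.2, 3.4]
ob = [3.0, 3.9, 2.7, 4.0, 3.8]
me, s_f, s_o, r, rebuilt = mse_decomposition(fc, ob)
direct = mse(fc, ob)
```

The values `direct` and `rebuilt` agree to rounding error, which is a convenient sanity check before plotting the individual terms.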
The continuous ranked probability score (CRPS) is used for the probabilistic forecasts
$$ \mathrm{CRPS} = \int_{-\infty}^{\infty} \left[F_f(x) - F_o(x)\right]^2 \,\mathrm{d}x, \tag{6} $$
where $F_f$ and $F_o$ are the cumulative distribution functions of the forecast and the observation. Scores were calculated using the scoringRules package. The CRPS can also be decomposed, but published methods involve ensemble forecasts, not probability distributions as in this study, so we did not pursue decomposition further.
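For a Gaussian forecast distribution, the CRPS integral has a well-known closed form. The sketch below is not the scoringRules code (that package is written in R); it only illustrates the closed form in Python and checks it against a direct numerical integration of the squared difference between the forecast distribution function and the step function of the observation. The function names and example values are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2)."""
    z = (obs - mu) / sigma
    return sigma * (z * (2 * norm_cdf(z) - 1)
                    + 2 * norm_pdf(z) - 1 / math.sqrt(math.pi))


def crps_numeric(mu, sigma, obs, n_steps=20000):
    """Midpoint-rule integration of (F_forecast - F_observation)^2."""
    lo = min(mu, obs) - 10 * sigma
    hi = max(mu, obs) + 10 * sigma
    h = (hi - lo) / n_steps
    total = 0.0
    for k in range(n_steps):
        x = lo + (k + 0.5) * h
        step = 1.0 if x >= obs else 0.0  # distribution function of the observation
        total += (norm_cdf((x - mu) / sigma) - step) ** 2 * h
    return total


# Hypothetical forecast: mean 3.4 m/s, spread 0.5 m/s; observation 3.9 m/s.
closed = crps_gaussian(3.4, 0.5, 3.9)
numeric = crps_numeric(3.4, 0.5, 3.9)
```

The closed form is much cheaper than integration, so it is the natural choice when the score has to be evaluated for every forecast and bootstrap sample.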
For a score $S$, with the best possible score being zero, the general form of a skill score (SS) is
$$ \mathrm{SS} = 1 - \frac{S}{S_{\mathrm{ref}}}, \tag{7} $$
where $S_{\mathrm{ref}}$ is the score of the reference forecast. Here skill scores of the MSE (MSESS) and the CRPS (CRPSS) are used with the reference forecasts being climatological forecasts based on the ERA-Interim. The MSESS based on the climatological reference forecasts is comparable to the coefficient of determination ($R^2$) in linear regression.
Now we can define skilful forecasts as forecasts with a skill score higher than zero. More precisely, the forecast is skilful at a statistically significant level if zero is not within the confidence intervals (CIs). As we used both Monday and Thursday forecasts, there is autocorrelation in the data, so the effective number of forecasts is not as high as 1000 for each lead time. This must be taken into account when CIs are calculated. Therefore, the CIs of verification measures are calculated with a block bootstrap.
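A moving-block bootstrap for the CI of a skill score might look like the sketch below. The block length, the number of resamples, the percentile interval and the function name are all assumptions, since the text does not specify them. The skill score is computed as one minus the ratio of the mean forecast score to the mean reference score, and resampling whole blocks of consecutive forecasts keeps the autocorrelation within each block.

```python
import random
from statistics import fmean


def block_bootstrap_ci(forecast_scores, reference_scores, block_len,
                       n_boot=1000, alpha=0.05, seed=1):
    """Percentile CI of the skill score from a moving-block bootstrap."""
    n = len(forecast_scores)
    rng = random.Random(seed)
    n_blocks = -(-n // block_len)      # ceiling division
    starts = range(n - block_len + 1)  # admissible block start indices
    samples = []
    for _ in range(n_boot):
        idx = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            idx.extend(range(s, s + block_len))
        idx = idx[:n]                  # trim to the original sample size
        s_fc = fmean([forecast_scores[i] for i in idx])
        s_ref = fmean([reference_scores[i] for i in idx])
        samples.append(1 - s_fc / s_ref)
    samples.sort()
    low = samples[int(alpha / 2 * n_boot)]
    high = samples[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Synthetic per-forecast scores, for example squared errors or CRPS values.
rng = random.Random(0)
ref_scores = [1.0 + 0.2 * rng.random() for _ in range(200)]
fc_scores = [0.9 * r + 0.1 * rng.random() for r in ref_scores]
ci_low, ci_high = block_bootstrap_ci(fc_scores, ref_scores, block_len=4)
skilful = ci_low > 0  # zero outside the CI
```

The same forecast and reference cases are resampled together, so each bootstrap replicate compares the two forecasts on identical dates.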
The size of the reforecast ensemble (11 members) is smaller than the size of the climatological ensemble (30 members), and the CRPS values of NGR reforecasts are not readily comparable with the climatological forecasts. Therefore, we used a published ensemble-size adjustment to estimate the CRPS as if the NGR ensemble had had the same number of members as the climatological forecasts
$$ \mathrm{CRPS}_M = \frac{m\,(M+1)}{M\,(m+1)}\,\mathrm{CRPS}_m, \tag{8} $$
where $m$ is the original size and $M$ is the new size.
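To give a feeling for the size of this adjustment, the sketch below assumes that the formula takes the simple ratio form that holds for a reliable ensemble, in which the CRPS for the new size equals the CRPS for the original size multiplied by m(M + 1) / (M(m + 1)). That form, the function names and the example scores are assumptions of this sketch.

```python
def crps_size_factor(m_old, m_new):
    """Assumed ratio between the CRPS at ensemble size m_new and at m_old."""
    return m_old * (m_new + 1) / (m_new * (m_old + 1))


def adjusted_crpss(crps_forecast, crps_reference, m_old, m_new):
    """Skill score after rescaling the forecast CRPS to the new size."""
    scaled = crps_size_factor(m_old, m_new) * crps_forecast
    return 1 - scaled / crps_reference


factor = crps_size_factor(11, 30)  # 341 / 360, roughly 0.947

# Hypothetical case in which the unadjusted skill is exactly zero.
raw_skill = 1 - 0.30 / 0.30
new_skill = adjusted_crpss(0.30, 0.30, m_old=11, m_new=30)
```

Under this assumption, going from 11 to 30 members lowers the CRPS by about 5 %, so a skill score of zero becomes roughly 0.05.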
The quality of probabilistic forecasts was also evaluated using the relative operating characteristic (ROC) curves and the area under the ROC curve (AUROC).
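The area under the ROC curve can be computed without drawing the curve, because it equals the probability that a randomly chosen event case received a higher forecast probability than a randomly chosen non-event case (the Mann-Whitney statistic). The sketch below uses this equivalence; the function names and the example values are hypothetical, and the event probability is taken from a Gaussian forecast distribution.

```python
import math


def exceedance_probability(mu, sigma, threshold):
    """P(y > threshold) under a Gaussian forecast N(mu, sigma^2)."""
    z = (threshold - mu) / sigma
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))


def auroc(probabilities, events):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    pos = [p for p, e in zip(probabilities, events) if e]
    neg = [p for p, e in zip(probabilities, events) if not e]
    if not pos or not neg:
        raise ValueError("need both events and non-events")
    # Count pairs where the event case has the higher probability; ties count half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical forecast probabilities and outcomes (1 = wind above the median).
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
occurred = [1, 1, 1, 0, 0, 0]
area = auroc(probs, occurred)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination between events and non-events.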
3.1 How the NGR changes unadjusted reforecasts
The NGR is not a black box and the change of the constants of Eqs. (2) and (3) as the function of the lead time gives insight into the performance of the method. The coefficient $a$, the constant of Eq. (2), grows as the lead time increases, while $b$, the coefficient of the ensemble means, decreases (Fig. 2a). In practice, this means that as the lead time increases the NGR pushes the forecast towards the climatology. In our data set, the range of mean wind observations is roughly from 2 m/s to 5 m/s, the mean being roughly 3.4 m/s. Then, if we calculate fictive forecasts of 2–5 m/s as the function of the lead time (Fig. 2b), we see that the forecasts larger (smaller) than the mean are increasingly reduced (increased) as the lead time increases, so they tend to the climatological mean, about 3.4 m/s.
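The pull towards climatology can be illustrated with made-up coefficients. In the sketch below the slope values are hypothetical and the intercept is tied to the slope so that a forecast equal to the climatological mean is left unchanged; the fitted coefficients of the study are not constrained in this way.

```python
CLIM_MEAN = 3.4  # m/s, approximate mean of the observations


def adjusted_mean(raw, slope, clim=CLIM_MEAN):
    """Linear adjustment whose intercept keeps the climatological mean fixed."""
    intercept = (1 - slope) * clim  # assumption of this sketch
    return intercept + slope * raw


# Hypothetical slopes for short, medium and long lead times.
slopes = {"short lead": 0.9, "medium lead": 0.5, "long lead": 0.1}
table = {label: [round(adjusted_mean(x, s), 2) for x in (2.0, 3.4, 5.0)]
         for label, s in slopes.items()}
```

As the slope shrinks, the fictive forecasts of 2 m/s and 5 m/s move from about 2.14 and 4.84 m/s to about 3.26 and 3.56 m/s, while a forecast of 3.4 m/s stays where it is.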
Figure 2
The fitted coefficients for NGR as the function of lead time: (a) the mean (Eq. 2), and (b) how the ensemble mean of the unadjusted or raw forecast is then modified by the NGR. (c) The standard deviation (Eq. 3, using only the intercept term) shown along with the ensemble standard deviation of the unadjusted forecasts and the standard deviation of observations (based on ERA-Interim). To help interpret the standard deviation values, the logarithm was not used in Eq. (3). The training set, odd winters during 1997–2017, was used.
[Figure omitted. See PDF]
For the data set here, $d$ of Eq. (3), the coefficient of the ensemble standard deviations, was rather noisy and often not statistically significant, and, without any notable change in the results, only the constant $c$ could be used (Fig. 2c). For all lead times, the NGR standard deviation is slightly larger than the unadjusted ensemble spread, and for longer lead times both tend to the standard deviation of the observations (Fig. 2c).
3.2 Verification results
The ME of both NGR adjusted reforecasts and climatological forecasts is nearly zero (Fig. 3a), the zero being inside the CIs. This is encouraging, as the ME of climatological forecasts should be zero, and the very small ME of reforecasts suggests that the NGR succeeds in bias correction.
Figure 3
The verification measures of reforecasts (using the validation set, even winters 1996–2016) for the averaged area (Fig. 1). Reforecasts are the mean of one week of 6-hourly reforecasts and start every two days. For reforecasts and climatological forecasts, the parts of the MSE decomposition (Eq. 5) are: (a) the mean error, (b) correlation, and (c) standard deviation.
[Figure omitted. See PDF]
The correlation of NGR reforecasts decreases as the lead time increases but remains positive at the last lead time calculated (Fig. 3b). The correlation of climatological forecasts is zero or slightly negative, but the zero remains encouragingly inside the CIs because climatological forecasts should not have any skill. The standard deviation of NGR reforecasts is almost equal to that of the observations in the first lead time (Fig. 3c), but decreases steadily, even though it remains slightly larger than the standard deviation of climatological forecasts. It is not surprising that the standard deviation of NGR reforecasts decreases from larger values (similar to the observations) to smaller values (similar to climatology), as we have shown how the means of NGR reforecasts tend to climatology (Fig. 2b).
The MSESS remains statistically significantly positive until the lead time of 21 d, when the CI includes zero (Fig. 4a). Skilful weekly forecasts cover almost all of the third week. The original CRPSS (with 11 members in the ensemble) remains statistically significantly positive until the lead time of 19 d (Fig. 4b), while its value is smaller than the MSESS for all lead times. After estimating the CRPS for 30 members by using Eq. (8) (Fig. 4c), the CRPSS remains positive for all lead times. This might not be a sensible result. However, the CRPSS values stabilise after the lead time of 21 d. Therefore, not accounting for the smaller ensemble size might give us too pessimistic a result, but using Eq. (8) might give us too optimistic a result.
Figure 4
The verification measures of reforecasts (using the validation set, even winters 1996–2016) for the averaged area (Fig. 1). Reforecasts are the mean of one week of 6-hourly reforecasts and start every two days. The skill score for (a) the MSE, (b) the CRPS without adjustment, (c) the CRPS adjusted with Eq. (8). (d) The area under ROC for the forecasts of mean winds greater than the 50th percentile.
[Figure omitted. See PDF]
The AUROC (Fig. 4d) remains higher than 0.5 for all lead times considered here. This is not realistic, and it clearly shows that we were not able to remove the effect of seasonality just by subtracting the first three harmonics. The AUROC stabilises around a lead time of 21 d, so the result is consistent with the MSESS and the CRPSS.
4 Discussion
Our results are comparable to those of an earlier study, which concluded that there is statistically significant skill in predicting weekly mean wind speeds over areas of Europe at lead times of at least 14–20 d. That study used five years of operational forecasts; its CIs were narrower than ours, and it could make better inference using operational forecasts.
Prior to analysis, we anticipated that the CRPSS (Fig. 4b and c) would have remained skilful longer than the MSESS (Fig. 4a), and the CRPSS would have been higher, because the probabilistic forecasts contain more information than deterministic forecasts. While the MSE and the CRPS values cannot be directly compared, deterministic and probabilistic forecasts can be directly compared using the mean absolute error (MAE) of the deterministic forecasts, because the CRPS reduces to the MAE when the forecast is deterministic. Therefore we calculated the MAE (or the CRPS, as they are equal) of the deterministic forecasts and compared it to the CRPS of the probabilistic climatological forecasts, and the MAE was smaller than the climatological CRPS (that is, the CRPSS was greater than zero) only for the first two lead times (not shown). So, not surprisingly, the probabilistic forecasts do contain more information than the deterministic forecasts, but as the skill scores show, deterministic forecasts are better compared with their reference forecasts. How, then, should we interpret this result? Maybe the deterministic reference forecasts could be improved; therefore, perhaps their skill scores presented here are spuriously high and we should concentrate on probabilistic results for a more realistic skill assessment? Or maybe our probabilistic forecasts could be improved, and meanwhile, the deterministic forecasts show what kind of skill can be achieved in the future? A prudent answer is always to choose the less skilful results, and to not be overconfident.
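The statement that the CRPS reduces to the absolute error for a deterministic forecast can be verified with the standard ensemble form of the CRPS, in which a deterministic forecast is an ensemble with a single member. The function name and the numbers below are hypothetical.

```python
def crps_ensemble(members, obs):
    """CRPS of an ensemble treated as an empirical distribution."""
    m = len(members)
    # Mean absolute difference between the members and the observation.
    term1 = sum(abs(x - obs) for x in members) / m
    # Half the mean absolute difference between all pairs of members.
    term2 = sum(abs(xi - xj) for xi in members for xj in members) / (2 * m * m)
    return term1 - term2


observation = 3.9                                       # m/s
deterministic = crps_ensemble([3.2], observation)       # equals |3.2 - 3.9|
two_members = crps_ensemble([3.0, 3.4], observation)    # same mean, some spread
```

With one member the pairwise term vanishes and the score is the absolute error, 0.7 m/s here. The two-member ensemble has the same mean but scores 0.6 m/s, which shows how a forecast with spread can be rewarded relative to its own mean used deterministically.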
The use of Eq. (8) should be carefully considered: it was developed for discrete ensemble forecasts, and it is unclear how applicable it is when the NGR is used for forecasts. The smaller ensemble size still makes it harder to estimate $\mu$ and $\sigma$, but is Eq. (8) an applicable estimator for that?
It is also important to further investigate the impact of the seasonal cycle on the verification results, as an uncritical reading of figures might suggest unrealistic trust in the forecasts.
4.1 The usability of wind speed forecasts
The forecasts might be skilful even for the third week, but the skill is still very low, even if the skill scores are positive. For example, an MSESS of around 0.1 can be interpreted as 10 % of the variance explained, which is very little for most applications. So it is not straightforward to see who the potential user is that could benefit from the third-week forecasts. Using a published categorisation of forecast users, we can assume that a casual, low-stakes user (“Should I wear a sweater or a short-sleeved shirt?”) might not benefit much from these forecasts, but a user who understands how to use the probabilities in a decision theory framework should be able to utilise the forecasts and benefit from them if they “play the game” long enough. For the wind forecasts, such a user might be an energy company using renewable energy sources.
In general, the utility of forecasts is defined by the users, so close co-operation and co-development of forecasts with the users is useful, if not essential. Moreover, the mean weekly wind itself might not be useful for most end users. For example, warnings of extreme wind would need percentiles higher than 50 %.
4.2 Future research
It seems reasonable to assume that different reanalyses generate somewhat different climatologies and observations, implying somewhat different skill scores based on different reanalyses. This is especially relevant for a variable such as wind, which is not so straightforward to measure. So, the use of more than one reanalysis might be useful in future studies. In this study, we used the ERA-Interim as our reference, but more recent reanalyses, such as MERRA-2 and ERA5 (which became available after this project ended), would be natural candidates to be used in further studies.
The bias-adjustment methods used here are only rudimentary and could be improved. For example, it has been noted that explicit spatiotemporal statistical models are largely unexplored in subseasonal studies. For weather forecasts, comparisons of NGR with machine learning methods have shown that auxiliary information is needed to improve forecasts. For a subseasonal range in North America, one source of such information might be indices such as the MJO or ENSO. For Northern Europe, suitable indices might be the QBO.
5 Conclusions
We evaluated the weekly mean wind forecasts for Finland based on the ECMWF forecasts. The NGR was used to correct the reforecasts. The skill of forecasts appears to be positive for the third week, but the longest skilful lead time depends on the reference data sets, the scores used, and the correction methods. Also, two winters would have been a rather short time span to make meaningful inferences on how operational wind forecasts perform, so reforecasts with longer time span are essential for comparison. Even then some uncertainty remains. The needs and the competence of the end users determine whether the forecasts are useful or not. The forecasts would be most beneficial for users applying the probabilities in the decision theory framework.
Data availability
The ECMWF reforecasts and ERA-Interim are available for the national meteorological services of ECMWF member and co-operating states and holders of suitable licences.
Author contributions
OH did most of the analysis and wrote most of the article. TKL wrote parts of the Introduction and contributed to the preparation of the paper. OR contributed to the statistical analysis. NK and VA contributed to the preparation of the paper. GH supervised the project.
Competing interests
The authors declare that they have no conflict of interest.
Disclaimer
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Special issue statement
This article is part of the special issue “Applied Meteorology and Climatology Proceedings 2020: contributions in the pandemic year”.
Acknowledgements
We thank Jussi Ylhäisi for the helpful comments on the early draft of this paper.
Financial support
This research has been supported by the Academy of Finland (grant nos. 303951 and 321890).
Review statement
This paper was edited by Andrea Montani and reviewed by two anonymous referees.
© 2021. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The subseasonal forecasts from the ECMWF (European Centre for Medium-Range Weather Forecasts) were used to construct weekly mean wind speed forecasts for the spatially aggregated area in Finland. Reforecasts for the winters (November, December and January) of 2016–2017 and 2017–2018 were analysed. The ERA-Interim reanalysis was used as observations and climatological forecasts. We evaluated two types of forecasts, the deterministic forecasts and the probabilistic forecasts. Non-homogeneous Gaussian regression was used to bias-adjust both types of forecasts. The forecasts proved to be skilful until the third week, but the longest skilful lead time depends on the reference data sets and the verification scores used.