1 Introduction
Tornadoes are rapidly rotating columns of air, extending vertically from the surface to the base of a cumuliform cloud, and represent one of the most severe weather phenomena in terms of victims and damage. Considering only the USA, every year about 500 tornadoes of intensity EF1 (enhanced Fujita scale) or stronger occur, producing an average of 125 victims and huge devastation. Numerical simulations of the very fine spatial and temporal scale of tornadoes (typically with a diameter of less than 2 km and a duration of less than 1000 s) require resolutions that are orders of magnitude finer than those currently available in operational weather prediction and climate models. Further, the chaotic dynamics of these vortices limit their deterministic prediction. Consequently, climatological studies have focused on identifying the environmental conditions favourable to tornado-spawning severe convective storms. Several thermodynamic and kinematic meteorological parameters have been analysed, either individually or combined into instability indices, to identify the conditions most favourable to the genesis of tornadoes. This approach is consistent with the basic idea that tornadoes result from a multi-stage process, in which the tilting of the horizontal vorticity near the ground by a violent updraught plays a basic role. Such a conceptual model is used here as a background framework for introducing an analytical formula for the probability of tornado occurrence. A previous study defined a tornado index limited to the USA based on a Poisson regression between the observed US climatology of tornadoes and monthly averaged environmental parameters from reanalysis. Other studies limited their conclusions to the identification of the conditions that are associated with mesoscale convective hazards.
The expression that we propose in this study is meant to provide a tool for supporting tornado warning in operational weather predictions and estimating changes in the frequency of tornado occurrence in climate projections.
2 Data and methods
Our analysis is based on tornadoes that occurred in the USA (dataset provided by the Storm Prediction Center (SPC),
The univariate analysis of the (conditional) probability of tornado occurrence is carried out by partitioning the observed range spanned by each variable into 17 equiprobable sub-intervals (bins). This number has been chosen as a compromise between the need for enough bins to support robust regressions and the need for enough observations in each bin to support a robust statistical analysis. An empirical estimate of the probability of tornado occurrence, conditional on the value of the variable lying in a given bin, is computed as the relative frequency of tornadoes in the bin. Its uncertainty is estimated via a suitable bootstrap (Monte Carlo) procedure. An analytical expression of this probability is found by a simple linear regression for WS, SRH, and LCL, and by a non-linear regression for WMAX (see the Supplement). Note that the climatology of the variable of interest is first calculated via the partition mentioned above and then compared with the tornadic cases (an approach similar to one adopted in previous work).
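The binning, relative-frequency, and bootstrap steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the synthetic predictor, the assumed exponential law, and the number of bootstrap replicates are all hypothetical and chosen only for illustration.

```python
import numpy as np

def equiprobable_edges(x, n_bins=17):
    """Bin edges chosen as quantiles, so each bin holds about the same number of samples."""
    return np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))

def conditional_probability(x, is_tornado, edges):
    """Relative frequency of tornadoes among the samples falling in each bin."""
    # Using only the interior edges maps every sample to an index in 0..n_bins-1.
    idx = np.digitize(x, edges[1:-1])
    n_bins = len(edges) - 1
    counts = np.bincount(idx, minlength=n_bins)
    hits = np.bincount(idx, weights=is_tornado.astype(float), minlength=n_bins)
    return hits / np.maximum(counts, 1)

def bootstrap_interval(x, is_tornado, edges, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap band for the binned probabilities (resampling records)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    samples = np.empty((n_boot, len(edges) - 1))
    for b in range(n_boot):
        pick = rng.integers(0, n, size=n)  # sample records with replacement
        samples[b] = conditional_probability(x[pick], is_tornado[pick], edges)
    alpha = (1.0 - level) / 2.0
    return np.quantile(samples, [alpha, 1.0 - alpha], axis=0)

# Synthetic demonstration: every number below is made up, not taken from the paper.
rng = np.random.default_rng(42)
n = 100_000
x = rng.uniform(0.0, 30.0, size=n)      # hypothetical predictor values
true_p = 10.0 ** (-3.0 + 0.08 * x)      # assumed exponential dependence
tornado = rng.random(n) < true_p

edges = equiprobable_edges(x)
p_hat = conditional_probability(x, tornado, edges)
lower, upper = bootstrap_interval(x, tornado, edges, n_boot=100)

# Linear regression of log10(P) on the bin centres, skipping empty-hit bins.
centres = 0.5 * (edges[:-1] + edges[1:])
ok = p_hat > 0
slope, intercept = np.polyfit(centres[ok], np.log10(p_hat[ok]), 1)
print(f"fitted log10 P = {intercept:.2f} + {slope:.3f} * x")
```

Quantile-based edges are what make the bins equiprobable; fitting the logarithm of the probability is an assumption made here so that an exponential dependence becomes a straight line.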
3 Results
The univariate analysis shows that all four variables considered in our study (i.e. WMAX, WS, SRH, LCL) are significantly linked to the formation of tornadoes. However, the formulas involving WS and WMAX describe a range of probabilities wider than that spanned by SRH and LCL. In the case of WS, the probability increases exponentially over the whole range. Instead, the behaviour of the probability as a function of WMAX is non-linear: it shows a hyper-exponential increase for low values of WMAX, when the probability is small; in the intermediate range the growth gradually slows down; and it becomes quasi-constant for large values of WMAX. For LCL and SRH, the exponential decrease and increase, respectively, only describe a narrow range of probability. In other words, variations in these two variables do not allow us to discriminate between low and high probability of tornado occurrence as effectively as WS and WMAX do (see Fig. 1).
Figure 1
Univariate probability distribution for WMAX, WS, SRH, and LCL. Markers and whiskers denote the empirical probabilities with uncertainty range. Lines denote the empirical estimates (continuous) with uncertainty ranges (dashed). Different colours represent values based on the full dataset (USA and EU, black), the USA data only (red), and the European data only (EU, blue). Uncertainty ranges correspond to a 95 % confidence level.
[Figure omitted. See PDF]
Concerning the bivariate analysis (i.e. considering the joint behaviour of pairs of predictors), in analogy with the univariate case, a grid is constructed to partition the whole two-dimensional domain into cells. The (conditional) probability of tornado occurrence, given that the pair of variables lies in a given cell, is estimated empirically as above, via the relative frequency of occurrence. Six bivariate analyses are carried out, covering all possible pairs of WMAX, WS, SRH, and LCL. For the bivariate probability, non-linear expressions have been adopted for all the pairs of variables involving WMAX and a multiple linear expression for the remaining pairs (see the Supplement). The parameters of the bivariate probability functions have been estimated by regressing the proposed expressions on the empirical probabilities.
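A two-dimensional version of the same idea can be sketched as follows. The grid size, the use of marginal quantiles as cell edges, the synthetic data, and the regression of the logarithm of the probability are assumptions made for this illustration; the text above does not specify them.

```python
import numpy as np

def bivariate_probability(x, y, is_tornado, n_bins=8):
    """Relative frequency of tornadoes in each cell of an n_bins x n_bins grid."""
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    x_edges = np.quantile(x, qs)   # assumption: marginal quantiles as cell edges
    y_edges = np.quantile(y, qs)
    ix = np.digitize(x, x_edges[1:-1])
    iy = np.digitize(y, y_edges[1:-1])
    flat = ix * n_bins + iy        # row-major cell index
    shape = (n_bins, n_bins)
    counts = np.bincount(flat, minlength=n_bins**2).reshape(shape)
    hits = np.bincount(flat, weights=is_tornado.astype(float),
                       minlength=n_bins**2).reshape(shape)
    with np.errstate(invalid="ignore", divide="ignore"):
        prob = np.where(counts > 0, hits / counts, np.nan)  # NaN marks empty cells
    return x_edges, y_edges, counts, prob

# Synthetic demonstration with made-up coefficients.
rng = np.random.default_rng(7)
n = 200_000
x = rng.uniform(0.0, 30.0, size=n)   # hypothetical first predictor
y = rng.uniform(0.0, 20.0, size=n)   # hypothetical second predictor
tornado = rng.random(n) < 10.0 ** (-3.0 + 0.05 * x + 0.03 * y)

x_edges, y_edges, counts, prob = bivariate_probability(x, y, tornado)

# Multiple linear regression of log10(P) on the cell centres.
xc = 0.5 * (x_edges[:-1] + x_edges[1:])
yc = 0.5 * (y_edges[:-1] + y_edges[1:])
xg, yg = np.meshgrid(xc, yc, indexing="ij")   # xg[i, j] matches prob[i, j]
mask = np.nan_to_num(prob) > 0                # drop empty or zero-hit cells
design = np.column_stack([np.ones(mask.sum()), xg[mask], yg[mask]])
coef, *_ = np.linalg.lstsq(design, np.log10(prob[mask]), rcond=None)
print("intercept, x coefficient, y coefficient:", np.round(coef, 3))
```

A non-linear expression, as used in the text for the pairs involving WMAX, would need a non-linear least-squares solver in place of `np.linalg.lstsq`.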
Considering the bivariate expression of the probability as a function of the pairs (WMAX, LCL) and (WS, SRH), the second variable lacks significance, meaning that it provides information analogous to the first one of the pair (in fact, they are fairly correlated), while the first variable is more informative in the univariate sense, in that it spans a wider range of probability. Considering the pairs (WMAX, SRH), (WS, LCL), and (SRH, LCL), the probability of tornadoes significantly depends on both variables, but they describe its variation only over 2–3 orders of magnitude, whereas the pair (WMAX, WS) shown in Fig. 2 discriminates between conditions over a wider range of probability (see the Supplement for the figures regarding all the other pairs). In conclusion, a valuable fit of the probability of occurrence of tornadoes over this range is given by Eq. (3).
[Equation omitted. See PDF]
Figure 2
Bivariate probability distribution for the pair (WMAX, WS). The coloured surface shows the empirical fit of the probability. Upward and downward triangles represent empirical estimates located above and below the fitted surface, respectively. All values are reported according to the colour bar.
[Figure omitted. See PDF]
All parameters of the univariate fits in Fig. 1 and of the bivariate ones in Fig. 2 are significantly different from zero, since the p values of the corresponding tests are (much) smaller than 1 %. For all univariate linear regressions, the adjusted R² is larger than 90 %, and, in general, the goodness of the fits is visually confirmed by the overwhelming fraction (from 90 % to 100 %) of probability values within the 95 % confidence bands. In the bivariate case, considering the multiple linear regressions of the pairs (WS, SRH), (WS, LCL), and (SRH, LCL), R² is, respectively, 70 %, 72 %, and 54 %: in general, these values are smaller than in the single-variable case, which is explained by residual variances about 3 times larger than those estimated in the univariate case. For the three pairs involving WMAX, R² cannot be used to assess the goodness of fit because the regression is non-linear. However, a slice analysis of the fits (see the Supplement for details) shows that the proposed models provide valuable fits over the whole domain of interest.
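For reference, the adjusted R² quoted for the linear regressions can be computed as in the short sketch below, which uses the textbook definition; the function name and the toy data are illustrative only. The statistic relies on the split of the total variance into explained and residual parts, which holds for linear least squares but not in general for non-linear fits.

```python
import numpy as np

def adjusted_r2(y, y_fit, n_predictors):
    """Adjusted coefficient of determination for a linear least-squares fit."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    n = len(y)
    ss_res = np.sum((y - y_fit) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # Penalise the number of predictors relative to the sample size.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Toy example: a noisy straight line fitted with one predictor.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0 + np.array([0.1, -0.2, 0.05, 0.0, 0.15, -0.1, 0.2, -0.05, 0.0, -0.15])
slope, intercept = np.polyfit(x, y, 1)
print(adjusted_r2(y, slope * x + intercept, n_predictors=1))
```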
4 Discussion
Further investigations are required to ensure the validity of the expressions in Eqs. (), (), and () in different environmental and geomorphological conditions. The similarity of the populations of tornado probabilities obtained using only EU and only USA data has been tested with a Kolmogorov–Smirnov-like (KS) approach. The significance level of the difference is assessed by computing the fraction of statistics exceeding the observed value using a Monte Carlo permutation procedure. Considering the univariate models, the null hypothesis that the EU and USA probabilities, as a function of WMAX and WS, are statistically compatible cannot be rejected at the 95 % and 99 % levels (suggesting that the corresponding univariate expressions are acceptable in different geographical domains), whereas it is rejected at a level larger than 99 % for the probabilities as a function of SRH and LCL. Considering the bivariate conditional probabilities, the null hypothesis that the EU and USA probabilities are statistically compatible could not be rejected (at a 90 % level) only for the pair (WMAX, SRH). In this case, the overall conditional probability (combining USA and EU data) is given by Eq. (4).
[Equation omitted. See PDF]
For all other pairs the null hypothesis could be rejected at the 99 % level.
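A Monte Carlo permutation test of this kind can be sketched as follows. The exact metric used in the study is not recoverable from the text above, so the statistic below (the maximum absolute difference between the two binned probability curves) is an assumption, as are the function names, the bin count, and the synthetic data.

```python
import numpy as np

def binned_probability(x, is_tornado, edges):
    """Relative frequency of tornadoes in each bin defined by `edges`."""
    idx = np.digitize(x, edges[1:-1])
    n_bins = len(edges) - 1
    counts = np.bincount(idx, minlength=n_bins)
    hits = np.bincount(idx, weights=is_tornado.astype(float), minlength=n_bins)
    return hits / np.maximum(counts, 1)

def ks_like_statistic(x, is_tornado, region, edges):
    """Assumed metric: largest gap between the two regional probability curves."""
    p_a = binned_probability(x[region], is_tornado[region], edges)
    p_b = binned_probability(x[~region], is_tornado[~region], edges)
    return np.max(np.abs(p_a - p_b))

def permutation_p_value(x, is_tornado, region, n_bins=10, n_perm=200, seed=0):
    """Fraction of label-shuffled statistics that reach the observed one."""
    rng = np.random.default_rng(seed)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    observed = ks_like_statistic(x, is_tornado, region, edges)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(region)   # break any link between region and outcome
        if ks_like_statistic(x, is_tornado, shuffled, edges) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling itself, so p is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic demonstration: `region` is True for one domain and False for the other.
rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(0.0, 30.0, size=n)
region = rng.random(n) < 0.5

same_law = rng.random(n) < 0.05                          # identical probability in both domains
diff_law = rng.random(n) < np.where(region, 0.02, 0.20)  # clearly different probabilities

stat_same, p_same = permutation_p_value(x, same_law, region)
stat_diff, p_diff = permutation_p_value(x, diff_law, region)
print(f"same law: p = {p_same:.3f}; different laws: p = {p_diff:.3f}")
```

A small p value means that few shuffled labellings produce a gap as large as the observed one, so the hypothesis that the two domains share the same conditional probability is rejected.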
Possible explanations of the lack of compatibility between conditional probabilities obtained using the EU and USA datasets alone could be different tornado damage-reporting practices (leading to different counting and attributions of tornadoes to the EF/F scale) and different meteorological and/or morphological conditions in the two domains. In spite of these limitations, as well as the need for further investigations, the proposed statistical models suitably fit the conditional probabilities of tornado occurrence. In particular, Eq. () has the merit of fitting the bulk of all available data and Eqs. (), (), and () of being robust with respect to the considered geographical domains.
The formulas of Eqs. ()–(), and particularly the bivariate expressions of Eqs. () and (), outline a new statistical tool that can be used for diagnosing the likelihood of tornadoes, with potential applications to short–medium-range weather predictions and to estimating future changes in tornado frequency in climate projections. Former results considered monthly average probability or provided a modest fit to the data and were based on a smaller dataset. The closest analogue to our approach is a previously published formula of tornado probability based on two parameters: one describing vertical changes in temperature and a composite parameter merging CAPE and wind shear. Our results differ from that work in the methodology adopted for estimating the probability of occurrence of tornadoes: it proposes a linear regression of the logistic function, whereas we propose a non-linear bivariate fit of the logarithm of the probability. In addition, our study shows that the relationship of CAPE to the probability of tornado occurrence departs significantly from a linear dependence and that the interaction between CAPE and wind shear in the lower troposphere cannot be adequately represented by their additive combination, further expanding the outcomes of earlier work. Finally, that formula was used for estimating past rates of tornado occurrence, while, to the best of our knowledge, this is the first time that analytical expressions in the form of Eqs. () and () are proposed in the scientific literature with the general aim of describing the probability of tornadoes at high temporal and spatial resolution, with applications in weather forecasting and climate projections.
Data availability
The list of tornadoes in the USA can be freely downloaded at
The supplement related to this article is available online at:
Author contributions
RI has been responsible for data collecting, processing, and plotting; PL for the coordination of the study; MMM for the meteorological analysis; and GS for the statistical analysis and the computation of the probability of occurrence of tornadoes. All the authors wrote and contributed to the final manuscript.
Competing interests
At least one of the (co-)authors is a member of the editorial board of Natural Hazards and Earth System Sciences. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.
Disclaimer
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
The authors gratefully acknowledge useful discussions and suggestions by Fabrizio Durante (University of Salento, Lecce, Italy). The work of Piero Lionello has been carried out with partial financial support from ICSC – Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union – NextGenerationEU (CUP F83C22000740001). Moreover, we acknowledge the support of the European COST Action CA17109 “DAMOCLES” (Understanding and Modeling Compound Climate and Weather Events) and the support of the Italian PRIN 2017 (Research Projects of National Interest) “Stochastic Models of Complex Systems” (2017JFFHSH). ESSL is acknowledged for providing European data, ECMWF for ERA5 reanalyses, and the Storm Prediction Center for US reports.
Financial support
This work has been partially financially supported by ICSC – Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union – NextGenerationEU (CUP F83C22000740001).
Review statement
This paper was edited by Maria-Carmen Llasat and reviewed by two anonymous referees.
© 2023. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
A methodological approach is proposed to provide an analytical (exponential-like) expression for the probability of occurrence of tornadoes as a function of the convective available potential energy and the wind shear (or, alternatively, the storm relative helicity). The resulting expression allows the probability of tornado occurrence to be calculated using variables that are computed by weather prediction and climate models, thus compensating for the lack of resolution needed to resolve these phenomena in numerical simulations.
Details



1 Department of Earth and Atmospheric Sciences, University of Quebec in Montréal, 201 av. du President Kennedy, Montréal, H3C 3P8, Canada
2 Dipartimento di Scienze e Tecnologie Biologiche ed Ambientali, Università del Salento, via per Monteroni 165, Lecce, 73100, Italy
3 ISAC-CNR, Istituto di Scienze dell'Atmosfera e del Clima, Consiglio Nazionale delle Ricerche, corso Stati Uniti 4, Padua, 35127, Italy
4 Dipartimento di Matematica e Fisica, Università del Salento, Provinciale Lecce-Arnesano, P.O. Box 193, Lecce, 73100, Italy