Extreme value theory (EVT) is applied as a univariate extreme value analysis (EVA) to analyze and model temperature, relative humidity (RH) and the thermal comfort index (humidex) from a 38-year dataset recorded in Tunis, a South Mediterranean area known as a hotspot for climate change. The data are reduced considerably by taking annual block maxima of the monthly mean data. These maxima converge to a generalized extreme value distribution, which is used to estimate the return levels of the studied parameters. The stationarity of the series is checked by the augmented Dickey‐Fuller test. The modeling of the three parameters shows a Weibull distribution pattern. The maximum monthly mean temperature of 30.2°C and humidex of 39.4 have a common return period between 300 and 350 years. The highest monthly mean RH of 86.0% is expected to be exceeded every 50 years. For the next 38 years, the maxima of monthly mean temperature are expected to be stable, and the maxima of monthly mean RH and of monthly mean humidex are expected to decrease. The percentile-based air temperature indices for hot days (TX90p) and hot nights (TN90p) show overall upward linear trends, and those for cold days (TX10p) and cold nights (TN10p) show a downward trend. The yearly diurnal temperature range shows an almost flat trend over the years of study.
INTRODUCTION
Climate extremes such as excessive temperatures or rain can cause droughts or floods that have a very strong impact on human lives. High air temperatures affect humans directly, by generating health problems or at least substantial discomfort. Indirectly, they limit water resources, as evaporation accelerates surface water depletion. The situation is worsened by the fact that high temperatures also increase water consumption (Baouab & Cherif, 2015). High temperatures also decrease agricultural yields and thus food availability: on land, each degree Celsius of temperature rise would lead to a potential wheat yield reduction of 7.5% (Magrin et al., 2009); in the sea, 1°C of warming through the upper ocean would result in an increase of hypoxic areas (areas low in oxygen) by 10 (Deutsch et al., 2015), thus decreasing fish body size by 20% to 30% per 1°C increase (Pauly & Cheung, 2018). Naturally, the increase in air temperature affects all components of natural ecosystems. Additionally, a humid environment combined with high temperature exacerbates human discomfort due to heat. This discomfort can be measured by an index called humidex (Masterton & Richardson, 1979).
Some regional studies have pointed out that heat waves in the Mediterranean now occur more frequently, and that the frequency and intensity of droughts have increased since 1950 (Babaousmail et al., 2022; Christidis et al., 2023; Kelley et al., 2015; Pandžić et al., 2022; Vicente-Serrano et al., 2014). Heat waves are a function of the intensity, duration and rate of the increase in temperature (Wahid et al., 2007). A heat wave occurs when the temperature rises beyond a threshold level for enough time to cause permanent damage to plants (Lipiec et al., 2013). In any case, the south Mediterranean region has historically suffered from major water scarcity: the majority of the countries of this region suffer from absolute scarcity, with 0–500 m³ per capita of renewable water resources (FAO, 2011; WWAP, 2015), exacerbated by hot summers and their extreme temperatures. Nevertheless, available data and local studies for these regions are very scarce, even though both are essential to feed the models.
The impacts of climate variability, now and in the future, are mainly driven by extreme events. It is essential to take them into account during any impact assessment. Different extremes have different drivers: a warming mean climate governs the increased occurrence of monthly high-temperature events, whereas changes in monthly heavy-precipitation events depend on trends in climate variability (van der Wiel & Bintanja, 2021). Extreme events are more difficult to model than average climates. They are complex and uncertain because they occur in a chaotic manner. They are rare events by definition; thus, in order to characterize their occurrence, large data series and long time scales are needed. Extreme events need to be considered in terms of probabilities or risks of occurrence rather than predictions, adding another source of uncertainty (Smith et al., 2001). The tails of the empirical distributions represent the rare extreme events of a climatic factor. Extreme value theory (EVT) approximates these tails by a theoretical distribution that helps to make estimations for the future. This can be done either by the analysis of maxima over fixed time intervals (block maxima), using the generalized extreme value (GEV) distribution function, or by the analysis of values above a threshold according to the peak over threshold method, using the generalized Pareto distribution (GPD) (García et al., 2002; Lazoglou et al., 2019; Singirankabo et al., 2023). Three parameters characterize the behavior of the distribution: location, shape, and scale.
They can be estimated by various methods, such as maximum likelihood estimation (MLE), which maximizes a likelihood function so that the observed data are most probable under the assumed statistical model; least squares estimation, which minimizes the sum of the squared deviations between the observations and the model; L-moments, which are linear combinations of order statistics analogous to conventional moments; or Bayesian Markov chain Monte Carlo. MLE is easy to use and is the most widely used estimation method (e.g., Zaninović et al., 2008). It is not applicable to small samples of fewer than 50 observations, but it is reliable when used on large datasets (Coles & Dixon, 1999; Nerantzaki & Papalexiou, 2022). Although the Mediterranean region is recognized, along with northeastern Europe, as the primary hot-spot for regional climate change (Giorgi, 2006), there are very few studies of the climatic trends and extremes of its southern side. This article characterizes extreme temperatures, relative humidity and humidex, as well as their return periods, in a southern Mediterranean city using the GEV distribution. It should allow the development of climate change coping and adaptation strategies, offer science-based information, and support decision-making mechanisms for policy-makers.
MATERIALS AND METHODS
Stationarity is an important factor in time series studies. The most common statistical tests for stationarity are the augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test. The ADF and the PP tests check the constancy over time of, respectively, the mean and the variance of the time series. These tests have a null hypothesis that the data have a unit root, which means that the series is not stationary. The alternative hypothesis is that the data are stationary. The tests give a test statistic, a p-value and critical values at the significance level (5%). The more negative the statistic is, when compared to the critical values in the tables, the stronger the evidence against the null hypothesis and the stronger the evidence for stationarity. A p-value less than the significance level means that the null hypothesis is rejected and the data are likely stationary.
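The logic of the unit-root statistic can be illustrated with a minimal sketch. The code below computes only the simple (non-augmented) Dickey-Fuller t-statistic with an intercept, on a synthetic stationary series; it is not the ADF routine of the tseries package, the function name `dickey_fuller_stat` and the simulated AR(1) series are assumptions made for illustration, and the resulting statistic would have to be compared with Dickey-Fuller critical values rather than Student t tables.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic of rho in the regression  dy_t = a + rho * y_{t-1} + e_t.

    Simple Dickey-Fuller form (no lagged differences, no trend term).
    A strongly negative value is evidence against a unit root.
    """
    x = y[:-1]                                   # lagged levels y_{t-1}
    d = [b - a for a, b in zip(y[:-1], y[1:])]   # first differences dy_t
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    rho = sxd / sxx                              # OLS slope
    a = mean_d - rho * mean_x                    # OLS intercept
    rss = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, d))
    se_rho = math.sqrt(rss / (n - 2) / sxx)      # standard error of the slope
    return rho / se_rho


# Hypothetical stationary AR(1) series with 456 values (38 years x 12 months).
random.seed(0)
series = [0.0]
for _ in range(455):
    series.append(0.5 * series[-1] + random.gauss(0.0, 1.0))

stat = dickey_fuller_stat(series)
print(f"Dickey-Fuller statistic: {stat:.2f}")
```

The augmented version adds lagged differences (and optionally a trend) to the same regression, which is why the simple form is shown here only to make the meaning of "more negative statistic" concrete.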
EVT provides a mathematical and probabilistic basis for the construction of models to predict the size and frequency of the rare extreme events that would be represented in the tail area of the data distribution of the meteorological data and otherwise not easily predictable. When the observation period (38 years or 456 months) is divided into non-overlapping periods of equal size (1 year or 12 months) and the attention is restricted to the maximum for each period, in EVT this is called the block maxima approach.
If the block maxima are taken from a dataset (here, the monthly means of temperature, relative humidity and humidex), they can converge to a generalized extreme value (GEV) distribution (Wilks, 2006).
The well-known normal distribution is completely defined by two parameters: the mean and the standard deviation. When the mean of the normal distribution changes, the distribution shifts toward higher or lower values, while varying the standard deviation makes the distribution narrower or wider. The normal distribution is symmetrical about its mean; there is no parameter that influences its skewness. GEV is defined by three parameters, estimated here by MLE: the location parameter μ, the scale parameter δ, and the shape parameter ξ. The location and scale parameters behave similarly to the mean and standard deviation of the normal distribution. The shape parameter influences the tails of the distribution, so changing the shape impacts the skewness and kurtosis of the data. GEV G(x, μ, δ, ξ) is:
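In its standard textbook form, with the notation above (location μ, scale δ > 0, shape ξ ≠ 0), the GEV cumulative distribution function and the log-likelihood maximized by MLE for n block maxima x_1, ..., x_n read as follows. This is the usual parameterization, in which ξ < 0 corresponds to the Weibull type; it is assumed, not confirmed, to be the one used in the fit.

```latex
% GEV cumulative distribution function (xi != 0)
G(x;\mu,\delta,\xi)
  = \exp\left\{-\left[1+\xi\,\frac{x-\mu}{\delta}\right]^{-1/\xi}\right\},
  \qquad 1+\xi\,\frac{x-\mu}{\delta} > 0 .

% Gumbel limit (xi -> 0)
G(x;\mu,\delta,0)
  = \exp\left\{-\exp\left[-\frac{x-\mu}{\delta}\right]\right\} .

% Log-likelihood for n independent block maxima (xi != 0)
\ell(\mu,\delta,\xi)
  = -n\log\delta
    -\left(1+\frac{1}{\xi}\right)\sum_{i=1}^{n}
       \log\left[1+\xi\,\frac{x_i-\mu}{\delta}\right]
    -\sum_{i=1}^{n}\left[1+\xi\,\frac{x_i-\mu}{\delta}\right]^{-1/\xi} .
```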
Once the best model is chosen, the return level can be evaluated. The return level y is the value expected to be exceeded, on average, once every T years. The return period T is the mean waiting time, expressed in years, before the level y is exceeded again. It is then possible to calculate the risk ratio (RR), which quantifies the change in the probability of occurrence of extreme climate events (Kirchmeier-Young et al., 2019). The RR is defined as the ratio of the probability of an event exceeding a threshold in a future climate scenario to its probability in the current climate. The probabilities are calculated as the integral of the probability density function of the fitted GEV distribution above a certain percentile. The threshold is chosen, for all the parameters, as the return level corresponding to 38 years, the length of the time series. An RR > 1 indicates an increasing probability in the future climate, an RR < 1 indicates a decreasing future probability, and an RR of 1 indicates no change in likelihood between the two climates (Slater et al., 2021). In this study, changes in the 38-year return values of extreme temperature, relative humidity, and humidex were examined. Such events have a 1-in-38 chance of occurring in each year of the reference period.
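The relation between return period and return level can be sketched by inverting the standard GEV distribution function at the non-exceedance probability 1 − 1/T. The parameter values below are hypothetical and chosen only for illustration (they are not the fitted estimates of this study), and the function names are likewise assumptions.

```python
import math


def gev_cdf(x, mu, delta, xi):
    """Standard GEV cumulative distribution function G(x; mu, delta, xi)."""
    if abs(xi) < 1e-12:                      # Gumbel limit
        return math.exp(-math.exp(-(x - mu) / delta))
    t = 1.0 + xi * (x - mu) / delta
    if t <= 0.0:                             # outside the support
        return 1.0 if xi < 0 else 0.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(T, mu, delta, xi):
    """Level exceeded on average once every T blocks (here, years)."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:                      # Gumbel limit
        return mu - delta * math.log(y)
    return mu - (delta / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters, for illustration only.
MU, DELTA, XI = 27.0, 1.0, -0.4

for period in (5, 20, 38, 100):
    level = gev_return_level(period, MU, DELTA, XI)
    print(f"T = {period:>3} years -> return level {level:.2f}")
```

With a negative shape parameter the return levels flatten out as T grows, because the distribution has a finite upper endpoint at μ − δ/ξ.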
The GEV model with the MLE estimation method has been chosen over other models, for instance GPD, as the GEV distribution is very widely used in meteorology and hydrology to describe extreme values (Goubanova & Li, 2007). For modeling maximum temperatures, this model has shown results similar to GPD in terms of the fit of the distribution to the examined data [Quantile–Quantile (Q–Q) plot], the shape parameters of the distribution and the return levels, unlike GEV with the L-moment estimation method, which is less appropriate for temperature and precipitation extremes (Lazoglou et al., 2019).
All the calculations were performed using the RStudio integrated development environment for R, the programming language for statistical computing and graphics (R Core Team, 2023). The ADF test of stationarity used the tseries package (Trapletti & Hornik, 2018). The calculations related to EVT and GEV were performed using the extRemes package (Gilleland, 2016; Gilleland & Katz, 2011).
AREA OF STUDY AND DATA DESCRIPTION
The study area is situated in Tunis, the capital of Tunisia, at latitude 36.84° N, longitude 10.23° E and 3 m above mean sea level. The climatological data were collected from the station of the national air quality monitoring network at Tunis-Carthage airport. This station was put into service in 1940, but data were consistent only from 1984. The climate data consist of hourly measurements of temperature and relative humidity from 1984 to 2021. There were no missing data.
Humidex is a measure of the human discomfort caused by an excessively hot and humid environment (Masterton & Richardson, 1979). They defined it by the formula:
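A commonly used form of the humidex is H = T + (5/9)(e − 10), where T is the air temperature in °C and e is the water vapour pressure in hPa. The sketch below assumes that e is derived from temperature and relative humidity through a Magnus-type approximation of the saturation vapour pressure; that approximation and the function names are assumptions made for illustration and may differ from the exact expression used in the study.

```python
def vapour_pressure_hpa(temp_c, rh_percent):
    """Water vapour pressure (hPa) from air temperature (deg C) and RH (%).

    Assumption: Magnus-type approximation of the saturation vapour pressure.
    """
    e_sat = 6.112 * 10.0 ** (7.5 * temp_c / (237.7 + temp_c))
    return e_sat * rh_percent / 100.0


def humidex(temp_c, rh_percent):
    """Humidex H = T + (5/9) * (e - 10), with e in hPa."""
    return temp_c + (5.0 / 9.0) * (vapour_pressure_hpa(temp_c, rh_percent) - 10.0)


# Illustrative inputs, not values taken from the dataset.
for t, rh in ((25.0, 50.0), (30.0, 70.0), (35.0, 60.0)):
    print(f"T = {t:.1f} C, RH = {rh:.0f}% -> humidex {humidex(t, rh):.1f}")
```

For a fixed temperature the index increases with relative humidity, which is the behaviour the comfort interpretation relies on.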
RESULTS AND DISCUSSION
EVT is applied to 999,288 hourly measurements of temperature and relative humidity from 1984 to 2021 at Tunis (Tunisia), aggregated to a set of 1368 monthly mean values of temperature, relative humidity and calculated humidex.
The data should first be checked for stationarity using the ADF and PP tests. The ADF statistics are −5.78 and −11.37 for temperature and relative humidity, respectively, and the PP statistics are −38.84 and −46.14, with p-values of 0.01 for both tests and both variables (Table 1). All four statistics are more negative than the tabulated (Fuller, 2009; Hamilton, 1994) critical value of −3.41 for this sample size (>500), considering that there is a seasonal trend. For both the ADF and PP tests, the null hypothesis of a unit root is therefore rejected at the 95% confidence level. The calculated p-values are also below the 5% significance level; at p = 0.01, the null hypothesis is rejected with a 1% risk of error. The alternative hypothesis of stationarity is adopted for both temperature and humidity data.
TABLE 1 Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) test results at a significance level of 5%.
| Variable | ADF statistics | p | Alternative hypothesis | PP statistics | p | Alternative hypothesis |
| Temperature | −5.78 | 0.01 | Stationarity | −38.84 | 0.01 | Stationarity |
| Relative humidity | −11.37 | 0.01 | Stationarity | −45.14 | 0.01 | Stationarity |
Before undertaking a detailed statistical analysis, the temperature and relative humidity data should be examined for normality through exploratory analysis, by drawing frequency histograms and Q–Q plots. The shape of the distribution of the data for both variables is also checked by measuring its symmetry/asymmetry through the skewness and the sharpness of its peak through the kurtosis. The histograms and Q–Q plots, overlaid respectively with the normal distribution and the identity line (1:1), show a rather good fit for temperature and relative humidity, which can nevertheless be improved, particularly for extreme values, that is, the upper and lower quantiles. Another distribution should therefore be applied to improve the fit. The skewness was 0.15 for temperature and −0.27 for relative humidity. As these values are close to zero, the distribution of the data for both parameters should be symmetric and equally distributed around the mean. As for kurtosis, it is 1.95 for temperature and 3.48 for relative humidity. As the values do not exceed 7, there is no substantial departure from normality (West et al., 1995). As the normal distribution has a kurtosis of 3, the distribution of temperature is platykurtic (less than 3), meaning that it has a relatively flat-topped curve. The kurtosis of relative humidity is very close to 3, so its distribution is mesokurtic, very close to the normal distribution. These measurements are consistent with the graphical observations from the histograms and Q–Q plots. The distributions of temperature and relative humidity are very close to perfect normality, with some flatness in the peak of temperature.
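The two shape measures discussed above can be sketched from their moment definitions. The code below assumes the plain (population) moment estimators, with kurtosis reported on the non-excess scale so that the normal distribution gives 3; statistical packages may apply small-sample corrections, so their values can differ slightly. The sample is synthetic.

```python
def central_moment(values, order):
    """k-th central moment of a sample (population form, divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment skewness: 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment kurtosis (non-excess): 3 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Small synthetic sample, symmetric and flat-topped (platykurtic).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(f"skewness = {skewness(sample):.2f}, kurtosis = {kurtosis(sample):.2f}")
```

A kurtosis below 3, as in this flat sample, corresponds to the platykurtic case described for temperature.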
EVT is applied through a univariate extreme value analysis using the generalized extreme value (GEV) model. For GEV estimation, the block maxima method is applied to temperature, relative humidity, and the comfort index (humidex). The monthly mean observations (N = 456 for each parameter) are separated into equal blocks (n = 38, which is also the number of years of the study); each block is a year and contains 12 months (k = 12). The maximum of each block (i.e., of each year) is extracted. The resulting set of maxima, called the monthly mean annual maxima, is fitted to a GEV distribution. The highest and lowest values of this set for temperature are 30.2°C (July 2003) and 24.9°C (May 2009), respectively. For relative humidity, they are 86.0% (January 1986) and 71.3% (January 2008 and December 2019). For humidex, the highest level was 39.4 (August 1994) and the lowest was 28.9 (May 2009).
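The block maxima extraction described above can be sketched as follows. The monthly series is synthetic (a seasonal cycle with a small year-to-year drift, generated only to have 456 values), and the function name is an assumption; the real input would be the monthly means of the station.

```python
import math


def annual_block_maxima(monthly_means, block_size=12):
    """Split a monthly series into equal blocks and return each block maximum."""
    if len(monthly_means) % block_size != 0:
        raise ValueError("series length must be a multiple of the block size")
    return [
        max(monthly_means[start:start + block_size])
        for start in range(0, len(monthly_means), block_size)
    ]


# Synthetic monthly means: 38 years x 12 months = 456 values (illustration only).
monthly = [
    19.0 + 8.0 * math.sin(2.0 * math.pi * (month - 4) / 12.0) + 0.02 * year
    for year in range(38)
    for month in range(12)
]

maxima = annual_block_maxima(monthly)
print(len(monthly), "monthly values ->", len(maxima), "annual maxima")
```

The list of 38 maxima is the sample to which the GEV distribution is then fitted.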
The GEV distribution is fitted and the diagnostic plots of the fit are presented (Figures 1, 2 and 3). The fit of the model to the data is assessed by comparing the distribution expected under the model with the one derived empirically.
For Tunis monthly mean temperatures (in degrees Celsius), the probability, quantile, and density plots (Figure 1a,b,d) show that the theoretical distribution provides a good fit to the empirical data (dots or plain distribution). This is also the case, to a lesser extent, for the monthly mean relative humidity (Figure 2a,b,d) and for humidex (Figure 3a,b,d). The estimates of the location parameter μ, the scale parameter δ, and the shape parameter ξ are given in Table 2. For all three variables, the shape ξ is negative: −0.41 (±0.08) for temperature, −0.14 (±0.07) for relative humidity and −0.4 (±0.1) for humidex. This means that the annual maxima of the monthly mean temperature, relative humidity, and humidex converge toward the Weibull extreme value distribution.
TABLE 2 The estimated parameters of the GEV model and their standard errors, for temperature, relative humidity and humidex data.
| Variable | Parameter | Estimate | Standard error |
| Temperature | Location μ | 23.6 | 0.3 |
| | Scale δ | 1.1 | 0.2 |
| | Shape ξ | −0.41 | 0.08 |
| Relative humidity | Location μ | 69.6 | 0.4 |
| | Scale δ | 3.1 | 0.2 |
| | Shape ξ | −0.14 | 0.07 |
| Humidex | Location μ | 32.5 | 0.4 |
| | Scale δ | 2.3 | 0.3 |
| | Shape ξ | −0.4 | 0.1 |
Maximizing the likelihood under the hypothesis of a GEV law gives the maximum likelihood estimator for all distributions of the class of GEV probability distributions. In our case, all ξ are higher than −0.5, so, according to Smith (1985), an asymptotically normal and efficient maximum likelihood estimator exists and the classical asymptotic properties of maximum likelihood estimators hold throughout this range.
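The two checks made on the shape parameter (its sign, which selects the limiting family, and the condition ξ > −0.5 for regular maximum likelihood asymptotics) can be written as a short sketch. The shape estimates are those reported in Table 2; the function names and the tolerance used to treat ξ as zero are assumptions for illustration.

```python
def gev_family(xi, tol=1e-8):
    """Limiting extreme value family implied by the sign of the shape xi."""
    if xi > tol:
        return "Frechet"
    if xi < -tol:
        return "Weibull"
    return "Gumbel"


def mle_is_regular(xi):
    """Smith (1985) condition: usual MLE asymptotics hold when xi > -0.5."""
    return xi > -0.5


# Shape estimates as reported in Table 2.
shape_estimates = {
    "temperature": -0.41,
    "relative humidity": -0.14,
    "humidex": -0.4,
}

for variable, xi in shape_estimates.items():
    print(f"{variable}: xi = {xi} -> {gev_family(xi)}, "
          f"regular MLE: {mle_is_regular(xi)}")
```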
Once the best model for the data is found, the interest is to derive the return levels of maximum temperature, relative humidity, and humidex. For all three parameters (panel c of Figures 1, 2, and 3), the higher the maximum mean, the longer the associated return period. For each return period, a return level can be associated according to the GEV model (Table 3). For return periods varying from 5 to 350 years, the monthly mean maxima vary from 28.2°C to 30.3°C for temperature, from 81.4% to 88.7% for relative humidity and from 37.0 to 39.4 for humidex. The highest monthly mean temperature of 30.2°C and humidex of 39.4 of our period of study are expected to be exceeded only every 300–350 years. According to these results, the highest monthly mean temperature and humidex will seldom be reached or exceeded in the future; they can thus be considered rare extreme events. On the other hand, the highest monthly mean relative humidity of 86.0% of our period of study is expected to be exceeded every 50 years or a little earlier. So, according to this model, high humidity events will be more frequent than high temperature or humidex events. It has already been suggested that, when using the GEV model, time series of a length comparable to the return period of the model should be used for better estimates (Huang et al., 2016). It would therefore be wise to stay within the length of the observed time series when calculating the risk ratio.
TABLE 3 Return periods and associated return levels for the annual monthly mean maxima of temperature, relative humidity and humidex.
| Return period (years) | Temperature (°C) | Relative humidity (%) | Humidex |
| 5 | 28.15 | 81.35 | 37.01 |
| 20 | 29.26 | 84.17 | 38.32 |
| 50 | 29.58 | 86.15 | 39.02 |
| 70 | 29.89 | 86.68 | 39.13 |
| 100 | 29.94 | 87.85 | 39.28 |
| 200 | 29.99 | 88.04 | 39.36 |
| 250 | 30.02 | 88.26 | 39.37 |
| 300 | 30.10 | 88.58 | 39.39 |
| 350 | 30.30 | 88.73 | 39.42 |
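The return periods quoted for the observed maxima can be cross-checked directly against the temperature and relative humidity columns of Table 3. The sketch below interpolates between tabulated rows, linearly in the level and logarithmically in the return period; this interpolation scheme is an assumption chosen for illustration, not the procedure of the study, which reads the values from the fitted model.

```python
import math

# Temperature and relative humidity columns of Table 3.
RETURN_PERIODS = [5, 20, 50, 70, 100, 200, 250, 300, 350]
TEMPERATURE = [28.15, 29.26, 29.58, 29.89, 29.94, 29.99, 30.02, 30.10, 30.30]
HUMIDITY = [81.35, 84.17, 86.15, 86.68, 87.85, 88.04, 88.26, 88.58, 88.73]


def interpolated_return_period(level, periods, levels):
    """Return period for a level lying between two tabulated return levels."""
    for i in range(len(periods) - 1):
        z0, z1 = levels[i], levels[i + 1]
        if z0 <= level <= z1:
            weight = (level - z0) / (z1 - z0)
            log_t = math.log(periods[i]) + weight * (
                math.log(periods[i + 1]) - math.log(periods[i])
            )
            return math.exp(log_t)
    raise ValueError("level outside the tabulated range")


t_temp = interpolated_return_period(30.2, RETURN_PERIODS, TEMPERATURE)
t_rh = interpolated_return_period(86.0, RETURN_PERIODS, HUMIDITY)
print(f"30.2 C    -> about {t_temp:.0f} years")
print(f"86.0 % RH -> about {t_rh:.0f} years")
```

The interpolated values fall between 300 and 350 years for the temperature maximum and between 20 and 50 years for the relative humidity maximum, in line with the statements made in the text.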
If we consider a return period of 38 years, the risk ratio is equal to 1 for temperature but lower than 1 for relative humidity and humidex. This indicates that maximum temperature values will probably neither increase nor decrease in the future for this region, compared to the past 38 years. Maximum relative humidity and humidex values are expected to decrease in the future in this region, as their risk ratio was 0.5, which is less than 1.
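The risk ratio computation can be sketched as the ratio of two GEV exceedance probabilities evaluated at the 38-year return level of the current climate. Both parameter sets below are hypothetical and serve only to illustrate the mechanics; how the future-climate distribution is obtained is not specified here, and the function names are assumptions.

```python
import math


def gev_exceedance_prob(x, mu, delta, xi):
    """P(X > x) under a GEV(mu, delta, xi) distribution."""
    if abs(xi) < 1e-12:                       # Gumbel limit
        return 1.0 - math.exp(-math.exp(-(x - mu) / delta))
    t = 1.0 + xi * (x - mu) / delta
    if t <= 0.0:                              # outside the support
        return 0.0 if xi < 0 else 1.0
    return 1.0 - math.exp(-t ** (-1.0 / xi))


def gev_return_level(T, mu, delta, xi):
    """Level exceeded on average once every T years."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        return mu - delta * math.log(y)
    return mu - (delta / xi) * (1.0 - y ** (-xi))


def risk_ratio(threshold, current, future):
    """RR = P_future(X > threshold) / P_current(X > threshold)."""
    return (gev_exceedance_prob(threshold, *future)
            / gev_exceedance_prob(threshold, *current))


# Hypothetical (mu, delta, xi) triples, for illustration only.
current_climate = (27.0, 1.0, -0.4)
future_climate = (26.7, 1.0, -0.4)

threshold = gev_return_level(38, *current_climate)   # 1-in-38 event today
rr = risk_ratio(threshold, current_climate, future_climate)
print(f"threshold = {threshold:.2f}, risk ratio = {rr:.2f}")
```

In this illustration the future location parameter is lower, so the 38-year event becomes less likely and the ratio falls below 1; identical distributions give a ratio of exactly 1.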
According to this study, the highest monthly mean temperatures are expected to be stable for the next 38 years compared with the past 38 years, and the maximum mean temperature reached during the study is a very rare event. The highest monthly mean relative humidity values are expected to decrease during the next 38 years, even if the maximum observation reached in the study will not be so rarely exceeded. The humidex monthly mean maxima tend to decrease, and the maximum value of the study will very rarely be exceeded in the future. According to these results, even if the maximum temperatures do not vary significantly, the perceived maximum temperature, which takes humidity into account, will be more bearable. The weather will feel a little more comfortable during the next 38 years. These results are obtained from monthly means. Over the same period, the percentile-based indices show that the percentages of hot days (TX90p) and hot nights (TN90p) fluctuate from year to year, but overall their linear trends are upward (slopes of 0.165 and 0.217, respectively). The percentages of cold days (TX10p) and cold nights (TN10p) have a downward trend (slopes of −0.271 and −0.368, respectively). Additionally, the diurnal temperature range (DTR), defined as the difference between the mean daily maximum and minimum temperature for each year, shows an almost flat trend over the years of study. These results are obtained from mean daily values.
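The idea behind a percentile index and its linear trend can be sketched in a simplified form. The code below uses a single 90th-percentile threshold computed over the whole sample, whereas operational TX90p definitions use calendar-day percentiles from a base period; the daily data are synthetic and the function names are assumptions for illustration.

```python
import math
import random


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def percent_above(values, threshold):
    """Percentage of values strictly above a threshold."""
    return 100.0 * sum(v > threshold for v in values) / len(values)


def ols_slope(x, y):
    """Slope of the least-squares line of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Synthetic daily maxima for 38 years with a slight warming drift (illustration).
random.seed(1)
years = list(range(1984, 2022))
daily_by_year = [
    [30.0 + 0.03 * (year - 1984) + random.gauss(0.0, 3.0) for _ in range(365)]
    for year in years
]

threshold = percentile([v for year in daily_by_year for v in year], 90)
hot_day_percent = [percent_above(year, threshold) for year in daily_by_year]
print(f"trend of hot-day percentage: {ols_slope(years, hot_day_percent):+.3f} per year")
```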
In applying the GEV model, we considered monthly means when looking for maxima. These statistical means, by definition, include the daily maximum but also the daily minimum and all the other values throughout the month. Only after this step do we consider the maxima of each year by using the block maxima technique. Accordingly, there is a compensation due to all the lower daily values that could bias the results. The percentile indices and DTR results indicate that maximum temperatures are increasing, as are minimum temperatures, by approximately the same amount. This does not help explain the stability of temperature maxima found in our study with the GEV model. Consequently, it would perhaps be more accurate to use the daily maxima per se, or the monthly mean of daily maxima, in the extreme value analysis, at least to compare the different uses of the data. The description of the extremes could be improved and the accuracy of the projections increased if daily minima and maxima were considered, along with the shortest possible time interval between measurements during the collection of climatic data.
AUTHOR CONTRIBUTIONS
Semia Cherif: Conceptualization; investigation; writing – original draft; methodology; validation; visualization; writing – review and editing; software; supervision; data curation; project administration; formal analysis; resources.
ACKNOWLEDGEMENTS
The author thanks Oussema Klai, Seifeddine Zouari, and Khouloud Dhifalli for their preliminary technical assistance; special thanks also go to Nacer Boukhris for initiating the database.
FUNDING INFORMATION
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
CONFLICT OF INTEREST STATEMENT
The author declares no conflicts of interest.
DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
PERMISSION TO REPRODUCE MATERIAL FROM OTHER SOURCES
No material is reproduced from other sources.
Babaousmail, H., Ayugi, B., Rajasekar, A., Zhu, H., Oduro, C., Mumo, R. et al. (2022) Projection of extreme temperature events over the Mediterranean and Sahara using bias‐corrected CMIP6 models. Atmosphere, 13(5), 741.
Baouab, M.H. & Cherif, S. (2015) Climate change and water resources: trends, fluctuations and projections for a case study of potable water in Tunisia. Houille Blanche‐Revue Internationale de l’Eau, 5, 99–107.
Christidis, N., Mitchell, D. & Stott, P.A. (2023) Rapidly increasing likelihood of exceeding 50°C in parts of the Mediterranean and the Middle East due to human influence. npj Climate and Atmospheric Science, 6(1), 45.
Coles, S.G. & Dixon, M.J. (1999) Likelihood‐based inference for extreme value models. Extremes, 2, 5–23.
Deutsch, C., Ferrel, A., Seibel, B., Pörtner, H.‐O. & Huey, R.B. (2015) Climate change tightens a metabolic constraint on marine habitats. Science, 348(6239), 1132–1135. Available from: [DOI: https://dx.doi.org/10.1126/science.aaa1605]
West, S.G., Finch, J.F. & Curran, P.J. (1995) Structural equation models with non normal variables: problems and remedies. In: Hoyle, R.H. (Ed.) Structural equation modeling: concepts, issues, and applications. Thousand Oaks, CA: Sage, pp. 56–75.
Wilks, D.S. (2006) Statistical methods in the atmospheric sciences. Amsterdam, The Netherlands: Elsevier.
WWAP. (2015) The United Nations World Water Development report 2015: water for a sustainable world. Paris, France: UNESCO.
Zaninović, K., Gajić‐Čapka, M., Perčec‐Tadić, M., Vučetić, M., Milković, J., Bajič, A. et al. (2008) Climate atlas of Croatia: 1961–1990, 1971–2000. Državni hidrometeorološki zavod: Zagreb, Croatia.
© 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.