A Weighted Skew-Logistic Distribution with


1. Introduction

The skew-logistic distribution is a continuous probability distribution used in the modeling of skewed unimodal data. Considering a logistic baseline distribution, the skew-logistic distribution can be understood as a member of the skewed distributions class proposed by Azzalini [1]. Specifically, the random variable X follows a skew-logistic distribution if its probability density function (PDF) is given by

(1) $\begin{matrix} f_{X} (x) & = & \frac{2 e^{- x}}{{(1 + e^{- x})}^{2} (1 + e^{- λ x})}, x \in R, \end{matrix}$

where $λ \in R$ is a skewness parameter. This is usually denoted as $X \sim SLOG (λ)$.

The PDF in (1) has a bell shape that can be symmetric or asymmetric, depending on $λ$. If $λ < 0$, then a left-skewed PDF is achieved. If $λ > 0$, then a right-skewed PDF is achieved. If $λ = 0$, then the PDF is symmetric and reduces to the PDF of the classic logistic distribution.

The r-th moment of $X \sim SLOG (λ)$ can be written as $E (X^{r}) = 2 I_{r} (λ)$ , where

(2) $\begin{matrix} I_{r} (λ) & = & \int_{0}^{1} {[- \log (\frac{1 - u}{u})]}^{r} \frac{u^{λ}}{u^{λ} + {(1 - u)}^{λ}} d u, r = 1, 2, \dots \end{matrix}$
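As a numerical illustration of Equations (1) and (2), the following Python sketch approximates $I_{r} (λ)$ and the first two raw moments of $X \sim SLOG (λ)$. It is only a sketch under stated assumptions: the function names (`sigmoid`, `slog_pdf`, `simpson`, `I`), the truncation bound, the grid size, and the value of $λ$ are illustrative choices and are not taken from the article. For numerical convenience, the integral is evaluated on the $x$ scale, as $E (X^{r}) / 2$, rather than on the $u$ scale of Equation (2).

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))  # the logistic factor is symmetric in x
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def I(r, lam, bound=50.0):
    """Approximate I_r(lambda) as E(X^r) / 2; the bound is an assumed cut-off."""
    return 0.5 * simpson(lambda x: x ** r * slog_pdf(x, lam), -bound, bound)


lam = 1.5  # illustrative skewness value, not from the article
print("E(X)   =", 2 * I(1, lam))
print("E(X^2) =", 2 * I(2, lam))
```

For $λ = 0$ the sketch should return a mean close to 0 and a second moment close to $π^{2} / 3$, the values of the classic logistic distribution.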

The skew-logistic distribution is considered a natural alternative to generalized logistic distributions discussed in Johnson et al. [2]. A comprehensive description of the mathematical properties of the skew-logistic distribution can be found in Nadarajah [3] and Gupta and Kundu [4].

Although the skew-logistic distribution can perform appropriately in a wide variety of settings where the data exhibit unimodality, it performs poorly in the presence of multimodality; that is, when there are multiple modes or peaks in the empirical distribution. The presence of multimodality can be explained by different reasons, including the existence of multiple groups or sub-populations with unique characteristics, or the existence of latent variables that significantly influence the distribution of the population. In such cases, a mixture distribution is one of the first alternatives considered for modeling; however, its use involves dealing with the non-identifiability problem. An extensive discussion of the logistic mixture model can be found in Rost [5]. Details of the non-identifiability problem in mixture models can be consulted in Aitkin and Rubin [6] and McLachlan et al. [7].

Various methods to introduce new flexible probability distributions can be found in the statistical literature. There are many examples that we could mention, but the approaches proposed in Elal-Olivero [8], Gómez et al. [9], Venegas et al. [10] and Bolfarine et al. [11] are especially attractive when trying to propose a new bimodal distribution.

A very popular methodology in the literature is that of weighted distributions, proposed by Fisher [12] and Rao [13]. Suppose that X is a random variable with PDF $f (x)$. The weighted random variable $X_{w}$ has PDF

(3) $\begin{matrix} f_{X_{w}} (x) = \frac{w (x) f (x)}{μ_{w}}, \end{matrix}$

where $w (\cdot)$ is a nonnegative weight function and $μ_{w} = E [w (X)] < \infty$.

A particularly notable case of weighted distributions is obtained by considering $w (x) = x$, which defines a size-biased (or length-biased) distribution. Size-biased distributions arise naturally in applied fields, such as reliability and survival analysis, when individuals or mechanical units are sampled with unequal probability, either because of the design of the experiment or because of unequal probabilities of detection.

In this paper, we propose a weighted version of the skew-logistic distribution that can present asymmetric shapes with up to three modes. The new distribution arises from Equation (3), considering that $f (\cdot)$ is the PDF of the skew-logistic distribution and $w (\cdot)$ is a parametric function that we will describe in Section 2. We provide evidence that the new distribution, being flexible both in skewness and in the number of modes, can outperform important distributions in the literature, including the logistic mixture model.

The rest of this article is organized as follows. In Section 2, fundamental properties of the new distribution, such as the PDF, cumulative distribution function, raw moments, and Fisher’s skewness coefficient, are derived. In Section 3, parameter estimation via the maximum likelihood method is discussed. In Section 4, an application example that considers two environmental datasets illustrates the usefulness of the proposal. Finally, Section 5 reports some concluding remarks.

2. The New Distribution and Its Properties

This section proposes the new probability distribution and studies some of its fundamental properties, such as the PDF, the cumulative distribution function (CDF), and the raw moments. In addition, a detailed description of the PDF shapes is developed.

2.1. Weighted Skew-Logistic Distribution

The following proposition presents the PDF of the new distribution.

Proposition 1.

Let $X \sim SLOG (λ)$ and $w (\cdot)$ be a parametric function given by

$w (x) = \frac{1}{1 + α κ_{λ}} [1 + α {(\frac{x - μ_{λ}}{σ_{λ}})}^{4}], x, λ \in R, α > 0,$

where

(4) $\begin{matrix} μ_{λ} = E (X) & = & 2 I_{1}, \\ σ_{λ}^{2} = V a r (X) & = & 2 (I_{2} - 2 I_{1}^{2}), \\ κ_{λ} = K (X) & = & \frac{I_{4} - 8 I_{1} I_{3} + 24 I_{1}^{2} I_{2} - 24 I_{1}^{4}}{2 {(I_{2} - 2 I_{1}^{2})}^{2}}, \end{matrix}$

such that $I_{r} (λ)$ , with $r = 1, 2, 3, 4$ , is as in Equation (2). Then, the PDF of the weighted random variable $X_{w}$ is

(5) $\begin{matrix} f_{X_{w}} (x; λ, α) = \frac{2}{1 + α κ_{λ}} [1 + α {(\frac{x - μ_{λ}}{σ_{λ}})}^{4}] \frac{e^{- x}}{{(1 + e^{- x})}^{2} (1 + e^{- λ x})}, \end{matrix}$

where $x \in R$ , $λ \in R$ and $α > 0$ .

Proof.

If $w (x) = {1 + α {[(x - μ_{λ}) / σ_{λ}]}^{4}} / (1 + α κ_{λ})$ , the result in (5) is obtained directly by substituting (1) into (3). □

Remark 1.

1.
Considering that $κ_{λ} = K (X) > 4$ , see [3], we note that $w (x) = \frac{1}{1 + α κ_{λ}} [1 + α {(\frac{x - μ_{λ}}{σ_{λ}})}^{4}]$ is a positive function for all $α > 0$ and $x, λ \in R$ .
2.
From Equation (5), considering $C_{0} = 1 / (1 + α κ_{λ})$ and denoting by $f_{X} (x; λ)$ the SLOG PDF in (1), it is verified that
$\begin{matrix} \int_{- \infty}^{\infty} f_{X_{w}} (x; λ, α) d x & = & \int_{- \infty}^{\infty} C_{0} [1 + α {(\frac{x - μ_{λ}}{σ_{λ}})}^{4}] f_{X} (x; λ) d x \\ & = & C_{0} [\int_{- \infty}^{\infty} f_{X} (x; λ) d x + α \int_{- \infty}^{\infty} {(\frac{x - μ_{λ}}{σ_{λ}})}^{4} f_{X} (x; λ) d x] \\ & = & C_{0} \{1 + α E [{(\frac{X - μ_{λ}}{σ_{λ}})}^{4}]\} \\ & = & C_{0} (1 + α κ_{λ}) \\ & = & 1 . \end{matrix}$
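The normalization argument above can also be checked numerically. The Python sketch below computes $μ_{λ}$, $σ_{λ}$ and $κ_{λ}$ from Equation (4), evaluates the PDF in Equation (5), and integrates it over a wide interval. The helper names (`slog_summaries`, `wslog_pdf`), the integration bounds, and the parameter values are assumptions made for illustration only; the moments are obtained by Simpson quadrature on the $x$ scale.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slog_summaries(lam, bound=60.0):
    """Return (mu, sigma, kappa) of SLOG(lam) as written in Equation (4)."""
    I1, I2, I3, I4 = (
        0.5 * simpson(lambda x, r=r: x ** r * slog_pdf(x, lam), -bound, bound)
        for r in (1, 2, 3, 4)
    )
    var = 2.0 * (I2 - 2.0 * I1 ** 2)
    kappa = (I4 - 8 * I1 * I3 + 24 * I1 ** 2 * I2 - 24 * I1 ** 4) / (
        2.0 * (I2 - 2.0 * I1 ** 2) ** 2
    )
    return 2.0 * I1, math.sqrt(var), kappa


def wslog_pdf(x, lam, alpha, summaries):
    """WSLOG PDF of Equation (5); summaries = slog_summaries(lam)."""
    mu, sd, kappa = summaries
    weight = (1.0 + alpha * ((x - mu) / sd) ** 4) / (1.0 + alpha * kappa)
    return weight * slog_pdf(x, lam)


lam, alpha = -3.0, 5.0  # illustrative parameter values
summ = slog_summaries(lam)
total = simpson(lambda x: wslog_pdf(x, lam, alpha, summ), -60.0, 60.0, 6000)
print("kappa =", summ[2], " integral of the PDF =", total)
```

The printed integral should be close to 1 for any admissible choice of $λ$ and $α$, up to quadrature error.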

Definition 1.

Let $X_{w}$ be a random variable with PDF given in (5), then we say that $X_{w}$ follows a weighted skew-logistic distribution. We will denote this as $X_{w} \sim WSLOG (λ, α)$ .

Corollary 1.

If $X_{w} \sim WSLOG (λ, α)$ , then the random variable $Y = μ + σ X_{w}$ , with $μ \in R$ and $σ > 0$ , follows the weighted skew-logistic distribution with location parameter μ and scale parameter σ. The PDF of Y is given by

(6) $\begin{matrix} f_{Y} (y; μ, σ, λ, α) & = & \frac{2}{σ (1 + α κ_{λ})} [1 + α {(\frac{y - μ_{1}^{'}}{σ_{1}})}^{4}] \\ \times \frac{e^{- z}}{{(1 + e^{- z})}^{2} (1 + e^{- λ z})}, \end{matrix}$

where $y \in R$ , $z = \frac{y - μ}{σ}$ , $λ \in R$ , $α > 0$ , $μ_{1}^{'} = 2 σ I_{1} + μ$ and $σ_{1}^{2} = 2 σ^{2} (I_{2} - 2 I_{1}^{2})$ , such that $I_{j}$ , $j = 1, 2$ is as in (2).

We denote this as $Y \sim WSLOG (μ, σ, λ, α)$ .

Figure 1 shows some PDF curves of the WSLOG distribution for different values of its parameters. In the figure, it can be seen that the WSLOG PDF can display shapes with up to three modes. These shapes will be described in more detail in Section 2.4.

2.2. Special Cases

The following proposition details the relationship between the WSLOG distribution and the logistic and SLOG distributions.

Proposition 2.

Let $X_{w} \sim WSLOG (λ, α)$ . Then,

1.
$\lim_{α \to 0} f_{X_{w}} (x, 0, α) = \frac{e^{- x}}{{(1 + e^{- x})}^{2}}$ .
2.
$\lim_{α \to 0} f_{X_{w}} (x, λ, α) = \frac{2 e^{- x}}{{(1 + e^{- x})}^{2} (1 + e^{- λ x})}$ .
3.
$f_{X_{w}} (x, 0, α) = \frac{1}{1 + α κ_{0}} [1 + α {(\frac{x - μ_{0}}{σ_{0}})}^{4}] \frac{e^{- x}}{{(1 + e^{- x})}^{2}}$.

Proof.

These results are the direct consequence of analyzing the limit case $α \to 0$ of the WSLOG distribution, together with the special case $λ = 0$ of the SLOG distribution. □

Part (1) of Proposition 2 shows that the WSLOG PDF tends to the logistic PDF as $α \to 0$ when $λ = 0$. Part (2) shows that the WSLOG PDF tends to the SLOG PDF as $α \to 0$. Part (3) presents a new PDF that can be understood as a multimodal extension of the logistic distribution. This new PDF is capable of displaying shapes with more than one mode while inheriting the symmetry of the logistic PDF.

2.3. Distribution Function and Related

In this section, we derive the cumulative distribution function (CDF) of the WSLOG distribution. This result will be used to compute goodness-of-fit tests in the application example of Section 4.

Proposition 3.

Let $Y \sim WSLOG (μ, σ, λ, α)$ . Then, the CDF of Y is

(7) $\begin{matrix} F_{Y} (y; μ, σ, λ, α) = \frac{σ_{1}^{4} F_{X} (y; μ, σ, λ) + 2 α J (y; μ, σ, λ)}{σ_{1}^{4} (1 + α κ_{λ})}, \end{matrix}$

where $F_{X} (\cdot; \cdot, \cdot, \cdot)$ is the CDF of the location and scale version of the SLOG distribution, $μ_{1}^{'}$ and $σ_{1}$ are as in Equation (6), $κ_{λ}$ is as in Equation (4), and $J (\cdot; \cdot, \cdot, \cdot)$ is given by

$\begin{matrix} J (y; μ, σ, λ) = \int_{- \infty}^{\frac{y - μ}{σ}} \frac{{(t σ + μ - μ_{1}^{'})}^{4}}{(1 + e^{- λ t})} \frac{e^{- t}}{{(1 + e^{- t})}^{2}} d t . \end{matrix}$

Proof.

Denoting as $f_{X} (x; μ, σ, λ)$ and $F_{X} (x; μ, σ, λ)$ the PDF and the CDF of the SLOG distribution, respectively, from Equation (6), it is obtained that

$\begin{matrix} F_{Y} (y; μ, σ, λ, α) & = & \int_{- \infty}^{y} f_{Y} (u; μ, σ, λ, α) d u \\ & = & \frac{1}{1 + α κ_{λ}} \int_{- \infty}^{y} [1 + α {(\frac{u - μ_{1}^{'}}{σ_{1}})}^{4}] f_{X} (u; μ, σ, λ) d u \\ & = & \frac{1}{σ_{1}^{4} (1 + α κ_{λ})} [σ_{1}^{4} F_{X} (y; μ, σ, λ) + 2 α \int_{- \infty}^{\frac{y - μ}{σ}} \frac{{(t σ + μ - μ_{1}^{'})}^{4}}{(1 + e^{- λ t})} \frac{e^{- t}}{{(1 + e^{- t})}^{2}} d t], \end{matrix}$

and denoting the above integral as $J (y; μ, σ, λ)$, the result in (7) is obtained. □

The following results are a direct consequence of Proposition 3 and Corollary 1.

Corollary 2.

If $Y \sim WSLOG (μ, σ, λ, α)$ , the survival function (SF) of Y is

$\begin{matrix} S_{Y} (y; μ, σ, λ, α) & = & 1 - F_{Y} (y; μ, σ, λ, α) \\ = & 1 - \frac{σ_{1}^{4} F_{X} (y; μ, σ, λ) + 2 α J (y; μ, σ, λ)}{σ_{1}^{4} (1 + α κ_{λ})} . \end{matrix}$

Corollary 3.

If $Y \sim WSLOG (μ, σ, λ, α)$ , the hazard rate function (HRF) of Y is

$\begin{matrix} h_{Y} (y; μ, σ, λ, α) & = & \frac{f_{Y} (y; μ, σ, λ, α)}{1 - F_{Y} (y; μ, σ, λ, α)} \\ & = & \frac{[σ_{1}^{4} + α {(y - μ_{1}^{'})}^{4}] f_{X} (y; μ, σ, λ)}{σ_{1}^{4} [1 + α κ_{λ} - F_{X} (y; μ, σ, λ)] - 2 α J (y; μ, σ, λ)} . \end{matrix}$
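The CDF, SF and HRF have no closed form, so in practice they must be evaluated numerically. The Python sketch below does this by direct quadrature of the PDF, using the fact that $Y = μ + σ X_{w}$ (Corollary 1), so that $F_{Y} (y) = F_{X_{w}} ((y - μ) / σ)$. It does not implement the decomposition in Equation (7); the function names, the lower integration limit, and the parameter values are illustrative assumptions.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slog_summaries(lam, bound=60.0):
    """Return (mu, sigma, kappa) of SLOG(lam) as written in Equation (4)."""
    I1, I2, I3, I4 = (
        0.5 * simpson(lambda x, r=r: x ** r * slog_pdf(x, lam), -bound, bound)
        for r in (1, 2, 3, 4)
    )
    base = I2 - 2.0 * I1 ** 2
    kappa = (I4 - 8 * I1 * I3 + 24 * I1 ** 2 * I2 - 24 * I1 ** 4) / (2.0 * base ** 2)
    return 2.0 * I1, math.sqrt(2.0 * base), kappa


def wslog_pdf(x, lam, alpha, summ):
    """Standard WSLOG PDF of Equation (5)."""
    mu, sd, kappa = summ
    return (1.0 + alpha * ((x - mu) / sd) ** 4) / (1.0 + alpha * kappa) * slog_pdf(x, lam)


def wslog_cdf(y, mu, sigma, lam, alpha, summ, lower=-60.0):
    """CDF of Y = mu + sigma * X_w by quadrature on the standardized scale."""
    z = (y - mu) / sigma
    if z <= lower:
        return 0.0
    return simpson(lambda x: wslog_pdf(x, lam, alpha, summ), lower, z)


def wslog_hazard(y, mu, sigma, lam, alpha, summ):
    """Hazard rate f_Y(y) / (1 - F_Y(y))."""
    pdf_y = wslog_pdf((y - mu) / sigma, lam, alpha, summ) / sigma
    return pdf_y / (1.0 - wslog_cdf(y, mu, sigma, lam, alpha, summ))


mu, sigma, lam, alpha = 10.0, 2.0, -0.3, 0.2  # illustrative values
summ = slog_summaries(lam)
for y in (5.0, 10.0, 15.0):
    F = wslog_cdf(y, mu, sigma, lam, alpha, summ)
    print(y, F, 1.0 - F, wslog_hazard(y, mu, sigma, lam, alpha, summ))
```

Far in the upper tail, $1 - F_{Y} (y)$ is subject to cancellation error, so a production implementation would integrate the upper tail directly.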

Figure 2 presents some curves for the HRF of the WSLOG distribution for different parameter values. In the figure, it is possible to see that this function can present a roller coaster shape (increasing–decreasing–increasing).

2.4. Shapes

In order to provide more details regarding the shapes of the WSLOG distribution, this section provides analytical expressions that allow the identification of local maximum (minimum) values and inflection points for the WSLOG PDF. The behavior of these expressions is illustrated by means of some graphical representations.

If $X_{w} \sim WSLOG (λ, α)$ , to obtain the partial derivatives of the PDF of $X_{w}$ , we rewrite Equation (5) as

$f_{X_{w}} (x; λ, α) = C x_{1} x_{2} x_{3},$

where

$C = \frac{2}{1 + α κ_{λ}}, x_{1} = \frac{e^{- x}}{{(1 + e^{- x})}^{2}}, x_{2} = \frac{1}{1 + e^{- λ x}}, x_{3} = 1 + α {(\frac{x - μ_{λ}}{σ_{λ}})}^{4} .$

So, the first and second partial derivatives of the WSLOG PDF with respect to $x$ are

(8) $\begin{matrix} \frac{\partial f_{X_{w}} (x; λ, α)}{\partial x} & = & C λ x_{1} x_{3} x_{4} + \frac{4 α C {(x - μ_{λ})}^{3}}{σ_{λ}^{4}} x_{1} x_{2} + C x_{1}^{'} x_{2} x_{3}, \end{matrix}$

(9) $\begin{matrix} \frac{\partial^{2} f_{X_{w}} (x; λ, α)}{\partial x^{2}} & = & \frac{8 C α λ}{σ_{λ}^{4}} {(x - μ_{λ})}^{3} x_{1} x_{4} + \frac{12 C α}{σ_{λ}^{4}} {(x - μ_{λ})}^{2} x_{1} x_{2} + 2 C λ x_{3} x_{4} x_{1}^{'} \\ & & + \frac{8 C α}{σ_{λ}^{4}} {(x - μ_{λ})}^{3} x_{1}^{'} x_{2} + C λ x_{1} x_{3} x_{4}^{'} + C x_{1}^{″} x_{2} x_{3}, \end{matrix}$

where

$x_{1}^{'} = \frac{e^{- x} (e^{- x} - 1)}{{(1 + e^{- x})}^{3}}, x_{1}^{″} = \frac{e^{- x} (1 - 4 e^{- x} + e^{- 2 x})}{{(1 + e^{- x})}^{4}}, x_{4} = \frac{e^{- λ x}}{{(1 + e^{- λ x})}^{2}}, x_{4}^{'} = \frac{λ e^{- λ x} (e^{- λ x} - 1)}{{(1 + e^{- λ x})}^{3}},$

so that $\partial x_{2} / \partial x = λ x_{4}$ and $\partial^{2} x_{2} / \partial x^{2} = λ x_{4}^{'}$.

The modes, antimodes and abscissas of the inflection points of the WSLOG PDF can be obtained by calculating the roots of Equations (8) and (9). The analytical complexity of these equations makes it difficult to obtain closed analytical expressions for these quantities. However, once $λ$ and $α$ are known, it is possible to obtain approximations of the roots using numerical procedures such as the Newton–Raphson method.
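As a rough numerical counterpart to the root-finding described above, the Python sketch below locates modes and antimodes of the WSLOG PDF for given $λ$ and $α$. Note that it swaps in a simpler technique than Newton–Raphson: it scans the PDF on a fine grid and reports grid points that are strict local maxima or minima, so locations are accurate only up to the grid step. All function names, the grid limits, and the step size are assumptions made for illustration.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slog_summaries(lam, bound=60.0):
    """Return (mu, sigma, kappa) of SLOG(lam) as written in Equation (4)."""
    I1, I2, I3, I4 = (
        0.5 * simpson(lambda x, r=r: x ** r * slog_pdf(x, lam), -bound, bound)
        for r in (1, 2, 3, 4)
    )
    base = I2 - 2.0 * I1 ** 2
    kappa = (I4 - 8 * I1 * I3 + 24 * I1 ** 2 * I2 - 24 * I1 ** 4) / (2.0 * base ** 2)
    return 2.0 * I1, math.sqrt(2.0 * base), kappa


def wslog_pdf(x, lam, alpha, summ):
    """Standard WSLOG PDF of Equation (5)."""
    mu, sd, kappa = summ
    return (1.0 + alpha * ((x - mu) / sd) ** 4) / (1.0 + alpha * kappa) * slog_pdf(x, lam)


def local_extrema(lam, alpha, lo=-30.0, hi=30.0, step=0.01):
    """Grid scan: return (modes, antimodes) as lists of grid abscissas."""
    summ = slog_summaries(lam)
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]
    fs = [wslog_pdf(x, lam, alpha, summ) for x in xs]
    modes = [xs[i] for i in range(1, n) if fs[i] > fs[i - 1] and fs[i] > fs[i + 1]]
    antimodes = [xs[i] for i in range(1, n) if fs[i] < fs[i - 1] and fs[i] < fs[i + 1]]
    return modes, antimodes


for lam, alpha in [(0.0, 0.001), (-3.0, 5.0), (0.0, 5.0)]:  # illustrative values
    modes, antimodes = local_extrema(lam, alpha)
    print(lam, alpha, "modes:", modes, "antimodes:", antimodes)
```

The grid points returned by the scan can serve as starting values for a Newton–Raphson iteration on Equation (8) when higher accuracy is needed.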

Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 present profiles of the critical points and inflection points equations (Equations (8) and (9)) and the WSLOG PDF curves considering different choices of $λ$ and $α$ . In the figures, it can be seen that the WSLOG PDF can present a unimodal shape with two, four, or six inflection points, a bimodal shape with four inflection points and a trimodal shape with six inflection points.

Considering $λ \in [- 5, 5]$ and $α \in [0, 5]$, Figure 8 illustrates the behavior of the WSLOG PDF in terms of the number of modes it presents. The figure reveals three distinct regions: a unimodality region, a bimodality region, and a region where the PDF has three modes. Taking into account that $λ$ is a skewness parameter inherited from the SLOG distribution, we observe that when $λ = 0$ (symmetric case), the WSLOG PDF is unimodal or trimodal. On the other hand, bimodal shapes are obtained when $α > 0$ and $λ \neq 0$.

2.5. Moments and Skewness Behavior

In this section, the moments of the WSLOG distribution are derived and, from these, the behavior of the skewness is described.

Proposition 4.

Let $Y \sim WSLOG (μ, σ, λ, α)$ . Hence, for $r = 1, 2, 3, \dots$ , we have

(10) $\begin{matrix} E [Y^{r}] & = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{r}^{'} \\ + \frac{α}{σ_{1}^{4}} (μ_{r + 4}^{'} - 4 μ_{1}^{'} μ_{r + 3}^{'} + 6 μ_{1}^{' 2} μ_{r + 2}^{'} - 4 μ_{1}^{' 3} μ_{r + 1}^{'})], \end{matrix}$

where $κ_{λ}$ is as in Equation (4), $σ_{1}^{2} = 2 σ^{2} (I_{2} - 2 I_{1}^{2})$ and $μ_{r}^{'} = μ^{r} + 2 \sum_{j = 1}^{r} \binom{r}{j} σ^{j} I_{j} μ^{r - j}$ is the r-th moment of the location and scale version of the SLOG distribution, with $I_{j}$, $j = 1, \dots, r$, as in Equation (2).

Proof.

From Equation (6), we see that

$\begin{matrix} E [Y^{r}] & = & \int_{- \infty}^{\infty} \frac{y^{r}}{1 + α κ_{λ}} [1 + α {(\frac{y - μ_{1}^{'}}{σ_{1}})}^{4}] \frac{2 e^{- z}}{σ {(1 + e^{- z})}^{2} (1 + e^{- λ z})} d y \\ = & \int_{- \infty}^{\infty} \frac{1}{1 + α κ_{λ}} [y^{r} + \frac{α}{σ_{1}^{4}} (y^{r + 4} - 4 y^{r + 3} μ_{1}^{'} + 6 y^{r + 2} μ_{1}^{' 2} - 4 y^{r + 1} μ_{1}^{' 3} + y^{r} μ_{1}^{' 4})] \\ \times \frac{2 e^{- z}}{σ {(1 + e^{- z})}^{2} (1 + e^{- λ z})} d y . \end{matrix}$

Recognizing in the previous integral the moments of the location and scale version of the SLOG distribution, $μ_{r}^{'} = μ^{r} + 2 \sum_{j = 1}^{r} \binom{r}{j} σ^{j} I_{j} μ^{r - j}$, such that $I_{j}$ is as in Equation (2), we obtain that

$\begin{matrix} E [Y^{r}] & = & \frac{1}{1 + α κ_{λ}} [μ_{r}^{'} + \frac{α}{σ_{1}^{4}} (μ_{r + 4}^{'} - 4 μ_{r + 3}^{'} μ_{1}^{'} + 6 μ_{r + 2}^{'} μ_{1}^{' 2} - 4 μ_{r + 1}^{'} μ_{1}^{' 3} + μ_{r}^{'} μ_{1}^{' 4})] \\ = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{r}^{'} + \frac{α}{σ_{1}^{4}} (μ_{r + 4}^{'} - 4 μ_{1}^{'} μ_{r + 3}^{'} + 6 μ_{1}^{' 2} μ_{r + 2}^{'} - 4 μ_{1}^{' 3} μ_{r + 1}^{'})] . \end{matrix}$

This is the result given in Equation (10). □

Corollary 4.

Let $Y \sim WSLOG (μ, σ, λ, α)$ . Then, the mean ( $E (Y)$ ), variance ( $V a r (Y)$ ), and Fisher’s skewness (S) and kurtosis (K) coefficients of Y are

$\begin{matrix} E (Y) & = & μ_{1}, \\ V a r (Y) & = & μ_{2} - μ_{1}^{2}, \\ S & = & \frac{μ_{3} - 3 μ_{1} μ_{2} + 2 μ_{1}^{3}}{{(μ_{2} - μ_{1}^{2})}^{3 / 2}} \\ and \\ K & = & \frac{μ_{4} - 4 μ_{1} μ_{3} + 6 μ_{1}^{2} μ_{2} - 3 μ_{1}^{4}}{{(μ_{2} - μ_{1}^{2})}^{2}}, \end{matrix}$

where

$\begin{matrix} μ_{1} & = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{1}^{'} + \frac{α}{σ_{1}^{4}} (μ_{5}^{'} - 4 μ_{1}^{'} μ_{4}^{'} + 6 μ_{1}^{' 2} μ_{3}^{'} - 4 μ_{1}^{' 3} μ_{2}^{'})], \end{matrix}$

$\begin{matrix} μ_{2} & = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{2}^{'} + \frac{α}{σ_{1}^{4}} (μ_{6}^{'} - 4 μ_{1}^{'} μ_{5}^{'} + 6 μ_{1}^{' 2} μ_{4}^{'} - 4 μ_{1}^{' 3} μ_{3}^{'})], \\ μ_{3} & = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{3}^{'} + \frac{α}{σ_{1}^{4}} (μ_{7}^{'} - 4 μ_{1}^{'} μ_{6}^{'} + 6 μ_{1}^{' 2} μ_{5}^{'} - 4 μ_{1}^{' 3} μ_{4}^{'})], \\ μ_{4} & = & \frac{1}{1 + α κ_{λ}} [(1 + \frac{α μ_{1}^{' 4}}{σ_{1}^{4}}) μ_{4}^{'} + \frac{α}{σ_{1}^{4}} (μ_{8}^{'} - 4 μ_{1}^{'} μ_{7}^{'} + 6 μ_{1}^{' 2} μ_{6}^{'} - 4 μ_{1}^{' 3} μ_{5}^{'})], \end{matrix}$

such that $μ_{j}^{'}$, with $j = 1, \dots, 8$, is as defined after Equation (10).

Figure 9 presents 3-D and 2-D perspectives for the Fisher’s skewness coefficient of the WSLOG distribution. From the figure, we note the following: (1) From the 2-D perspective, it can be seen that if $λ = 0$, then the WSLOG PDF is symmetric. (2) From the 3-D perspective, it can be seen that the highest skewness values are obtained for small values of $α$ (close to 0) and high values of $λ$, which, according to Figure 8, corresponds to the unimodal case. (3) In both perspectives, it can be seen that for a fixed $λ$, the skewness can increase or decrease as $α$ grows; however, the effect of $α$ on the skewness is mild.

2.6. Possible Application Scenarios

Focusing on the PDF shapes described in Section 2.4, we next consider some possible application scenarios for the WSLOG distribution.

For the unimodal case, there are many possible application scenarios. We consider the following: (1) Hydrology: to model hydrological variables such as river flows and water levels. These variables often exhibit unimodality and asymmetry. (2) Economics and Finance: to model asset returns, volatility, and other financial data that may exhibit unimodal and asymmetric behavior. (3) Environment: to model data related to air pollution, concentrations of chemical substances, and other types of environmental data. (4) Demography and Social Sciences: to model data such as the age distribution of a population, migration rates, and other demographic variables.

In general, the WSLOG distribution, in its unimodal shape, can be considered an alternative distribution for data modeling in scenarios where the skew-logistic distribution is used. The fact that the WSLOG PDF can have more than two inflection points (in its unimodal shape) can lead to better fits of datasets whose frequency distributions are unimodal but have irregular shapes, such as shoulders or heavy flanks.

Focusing on the bimodal shapes, we believe that the WSLOG distribution can be considered a viable alternative for modeling data associated with environmental variables. When analyzing this type of variable, it is common to find data that exhibit bimodality due to factors such as special geographic characteristics and seasonality. For example, the wind speed in the Canary Islands may exhibit bimodality due to the interaction of the trade winds with the mountainous and volcanic topography of this region. This can cause local disturbances in certain areas, which increases the wind speed in those specific areas, see [14]. Another example that we can mention is related to the behavior of the distribution of rainfall in Central America. In this region, precipitation shows a bimodal behavior related to the influence of the rainy seasons in the Caribbean and the Pacific, see [15].

Both in the bimodal and trimodal case, the WSLOG distribution can be used to analyze data in scenarios where it is possible to identify sub-populations with differentiated behavior with respect to the characteristic of interest studied in the population. In this case, the WSLOG distribution can be considered an alternative to the popular mixture models, especially the logistic mixture model.

3. Parameter Estimation

The problem of estimating parameters for the WSLOG distribution using the maximum likelihood method is discussed in this section. In addition, simulation experiments are carried out to assess the performance of the estimators.

3.1. Maximum Likelihood Estimators

Let $Y_{1}, \dots, Y_{n}$ be a random sample from the $WSLOG (θ)$ distribution, with $θ = (μ, σ, λ, α)$. Then, the log-likelihood function associated with $θ$ is written as

(11) $\begin{matrix} ℓ (θ; y) & = & n \log (2) - n \log (σ) - n \log (1 + α κ_{λ}) + \sum_{i = 1}^{n} \log (1 + α z_{1 i}^{4}) \\ - \sum_{i = 1}^{n} z_{i} - 2 \sum_{i = 1}^{n} \log (1 + e^{- z_{i}}) - \sum_{i = 1}^{n} \log (1 + e^{- λ z_{i}}), \end{matrix}$

where $y = {(y_{1}, \dots, y_{n})}^{⊤}$ is the vector of observed values, $z_{i} = (y_{i} - μ) / σ$ and $z_{1 i} = (y_{i} - μ_{1}^{'}) / σ_{1}$, for $i = 1, \dots, n$. Thus, the elements of the score vector associated with the log-likelihood function (11) are given by

$\begin{matrix} \frac{\partial ℓ (θ; y)}{\partial μ} & = & - \frac{4 α}{\sqrt{2 σ^{2} (I_{2} - 2 I_{1}^{2})}} \sum_{i = 1}^{n} \frac{z_{1 i}^{3}}{(1 + α z_{1 i}^{4})} - \frac{2}{σ} \sum_{i = 1}^{n} \frac{\exp (- z_{i})}{(1 + \exp (- z_{i}))} \\ - \frac{λ}{σ} \sum_{i = 1}^{n} \frac{\exp (- λ z_{i})}{(1 + \exp (- λ z_{i}))} + \frac{n}{σ}, \\ \frac{\partial ℓ (θ; y)}{\partial σ} & = & - \frac{n}{σ} + \sum_{i = 1}^{n} \frac{z_{i}}{σ} - \frac{2}{σ} \sum_{i = 1}^{n} \frac{z_{i} \exp (- z_{i})}{(1 + \exp (- z_{i}))} \\ - \frac{λ}{σ} \sum_{i = 1}^{n} \frac{z_{i} \exp (- λ z_{i})}{(1 + \exp (- λ z_{i}))} - \frac{4 α}{\sqrt{2 σ^{2} (I_{2} - 2 I_{1}^{2})}} \sum_{i = 1}^{n} \frac{z_{1 i}^{3} z_{i}}{(1 + α z_{1 i}^{4})}, \\ \frac{\partial ℓ (θ; y)}{\partial λ} & = & \sum_{i = 1}^{n} \frac{z_{i} \exp (- λ z_{i})}{(1 + \exp (- λ z_{i}))}, \\ \frac{\partial ℓ (θ; y)}{\partial α} & = & \sum_{i = 1}^{n} \frac{z_{1 i}^{4}}{(1 + α z_{1 i}^{4})} - \frac{n κ_{λ}}{(1 + α κ_{λ})} . \end{matrix}$

Since the ML estimators of $μ$ , $σ$ , $α$ , and $λ$ do not have closed expressions, it is necessary to use numerical procedures such as Newton–Raphson.

The standard errors of the ML estimators can be obtained as the square roots of the elements of the diagonal of the matrix

$\begin{matrix} K^{- 1} (\hat{θ}) & = & {\{- \frac{\partial^{2} ℓ (θ; y)}{\partial θ \partial θ^{⊤}} |_{\begin{matrix} θ = \hat{θ} \end{matrix}}\}}^{- 1}, \end{matrix}$

where

\partial^{2} ℓ (θ; y) / \partial θ \partial θ^{⊤}

is the Hessian matrix.

Alternatively, it is possible to maximize Equation (11) with the help of a numerical optimization routine such as the function stats::optim() of the R programming language [16]. In this case, by minimizing the negative log-likelihood, this function returns the ML estimates and the numerical Hessian matrix. The R code used is available at https://github.com/isaaccortes1989/WSLOG-ARTICLE, accessed on 20 March 2024.
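To make the objective function concrete, the following Python sketch codes the negative of the log-likelihood in Equation (11). It is not the authors' R implementation: the function names, the quadrature used for $I_{r} (λ)$, and the toy data are assumptions introduced for illustration. Invalid parameter values ($σ \leq 0$ or $α \leq 0$) return infinity so that a generic minimizer can reject them.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slog_summaries(lam, bound=60.0):
    """Return (mu, sigma, kappa) of SLOG(lam) as written in Equation (4)."""
    I1, I2, I3, I4 = (
        0.5 * simpson(lambda x, r=r: x ** r * slog_pdf(x, lam), -bound, bound)
        for r in (1, 2, 3, 4)
    )
    base = I2 - 2.0 * I1 ** 2
    kappa = (I4 - 8 * I1 * I3 + 24 * I1 ** 2 * I2 - 24 * I1 ** 4) / (2.0 * base ** 2)
    return 2.0 * I1, math.sqrt(2.0 * base), kappa


def log1pexp(t):
    """Stable evaluation of log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def neg_loglik(theta, y):
    """Negative of Equation (11) for theta = (mu, sigma, lam, alpha)."""
    mu, sigma, lam, alpha = theta
    if sigma <= 0 or alpha <= 0:
        return math.inf
    m, s, kappa = slog_summaries(lam)
    mu1, sigma1 = mu + sigma * m, sigma * s  # mu_1' and sigma_1 of Equation (6)
    n = len(y)
    ll = n * math.log(2.0) - n * math.log(sigma) - n * math.log(1.0 + alpha * kappa)
    for yi in y:
        z = (yi - mu) / sigma
        z1 = (yi - mu1) / sigma1
        ll += math.log(1.0 + alpha * z1 ** 4) - z
        ll -= 2.0 * log1pexp(-z) + log1pexp(-lam * z)
    return -ll


toy_data = [3.2, 8.1, 10.4, 12.9, 17.5]  # hypothetical observations
print(neg_loglik((10.0, 2.0, -0.3, 0.2), toy_data))
```

A function of this form can be handed to any general-purpose minimizer; in Python, for example, `scipy.optimize.minimize` with a derivative-free method would play the role that `optim()` plays in R.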

3.2. Simulation Study

In this section, we carry out a simulation study aimed at evaluating the behavior of the ML estimators of the parameters of the WSLOG distribution.

3.2.1. Simulation Algorithm

We used the accept-reject method to generate pseudo-random numbers from the WSLOG distribution. The resulting sequence of n pseudo-random numbers was stored in an array that we called $n_{vector}$. We considered the envelope function $g (x) = c f (x; m, s)$, where $f (x; m, s) = s^{- 1} e^{- (x - m) / s} {[1 + e^{- (x - m) / s}]}^{- 2}$, $x, m \in R$, $s > 0$, is the PDF of the logistic distribution. Thus, from Equation (6), to determine the value of the constant c, which must bound the ratio between the WSLOG and logistic PDFs, we calculated the root of the equation

(12) $\begin{matrix} 0 & = & \frac{\partial}{\partial x} \frac{[1 + α {(\frac{x - μ_{1}^{'}}{σ_{1}})}^{4}] e^{- \frac{x - μ}{σ}} {(1 + e^{- \frac{x - m}{s}})}^{2}}{{(1 + e^{- \frac{x - μ}{σ}})}^{2} (1 + e^{- λ \frac{x - μ}{σ}}) e^{- \frac{x - m}{s}}} . \end{matrix}$

In addition to the parameters $μ$ , $σ$ , $λ$ and $α$ of the WSLOG distribution, to build the algorithm (Algorithm 1), we needed to define:

n: The length of the $n_{vector}$.
$μ$, $σ$, $λ$ and $α$: The parameters of the WSLOG distribution.
m and s: The location and scale parameters of the logistic distribution.
$f_{Y} (\cdot; μ, σ, λ, α)$: The WSLOG PDF with $μ \in R$, $σ > 0$, $λ \in R$ and $α > 0$.
$f (\cdot; m, s)$ : The logistic PDF with $m \in R$ and $s > 0$ .
$U_{1}$ : A random variable with a uniform $(0, 1)$ distribution.
$U_{2}$ : A random variable with a uniform $(0, 1)$ distribution.

Algorithm 1: Accept-Reject Algorithm to Generate Pseudo-Random Numbers from the $WSLOG (μ, σ, λ, α)$ Distribution.
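The steps of an accept-reject sampler of this kind can be sketched in Python as follows. This is a generic illustration, not a transcription of Algorithm 1: the function names are hypothetical, the logistic proposal is drawn by inversion, and the constant $c$ is approximated by the maximum of the ratio $f_{Y} / f$ on a grid (inflated by an assumed 1% safety margin) instead of by solving Equation (12). The sketch assumes $s > σ$, so that the logistic envelope has heavier tails than the WSLOG PDF and the ratio is bounded.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def slog_pdf(x, lam):
    """Skew-logistic PDF of Equation (1)."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e) ** 2 * sigmoid(lam * x)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def slog_summaries(lam, bound=60.0):
    """Return (mu, sigma, kappa) of SLOG(lam) as written in Equation (4)."""
    I1, I2, I3, I4 = (
        0.5 * simpson(lambda x, r=r: x ** r * slog_pdf(x, lam), -bound, bound)
        for r in (1, 2, 3, 4)
    )
    base = I2 - 2.0 * I1 ** 2
    kappa = (I4 - 8 * I1 * I3 + 24 * I1 ** 2 * I2 - 24 * I1 ** 4) / (2.0 * base ** 2)
    return 2.0 * I1, math.sqrt(2.0 * base), kappa


def wslog_pdf_y(y, theta, summ):
    """WSLOG PDF of Equation (6) for theta = (mu, sigma, lam, alpha)."""
    mu, sigma, lam, alpha = theta
    m, s, kappa = summ
    z = (y - mu) / sigma
    weight = (1.0 + alpha * ((z - m) / s) ** 4) / (1.0 + alpha * kappa)
    return weight * slog_pdf(z, lam) / sigma


def logistic_pdf(x, m, s):
    """Logistic PDF with location m and scale s."""
    e = math.exp(-abs(x - m) / s)
    return e / (s * (1.0 + e) ** 2)


def envelope_constant(theta, summ, m, s, step=0.01, width=40.0, margin=1.01):
    """Grid approximation of c = max f_Y / f, inflated by a safety margin."""
    lo, hi = theta[0] - width * theta[1], theta[0] + width * theta[1]
    n = int((hi - lo) / step)
    grid = (lo + i * step for i in range(n + 1))
    return margin * max(wslog_pdf_y(x, theta, summ) / logistic_pdf(x, m, s) for x in grid)


def rwslog(n, theta, m, s, rng):
    """Draw n values from WSLOG(theta) by accept-reject with a logistic envelope."""
    summ = slog_summaries(theta[2])
    c = envelope_constant(theta, summ, m, s)
    out = []
    while len(out) < n:
        u1 = rng.random()
        if u1 <= 0.0:
            continue  # avoid log(0) in the inversion step
        x = m + s * math.log(u1 / (1.0 - u1))  # logistic proposal
        u2 = rng.random()
        if u2 * c * logistic_pdf(x, m, s) <= wslog_pdf_y(x, theta, summ):
            out.append(x)  # accept the proposal
    return out


rng = random.Random(123)  # fixed seed, chosen arbitrarily
print(rwslog(5, (10.0, 2.0, -0.3, 0.2), 6.0, 10.0, rng))
```

On average, one proposal in $c$ is accepted, so an envelope that hugs the target more closely (smaller $c$) yields a faster sampler.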

3.2.2. Simulation Scenarios

We focused on two different choices for the parameter vector $θ = (μ, σ, λ, α)$ of the WSLOG distribution, thus defining the following simulation scenarios:

Scenario A, $θ = (10, 2, - 0.3, 0.2)$ : this choice of $θ$ leads to a PDF with an asymmetric unimodal shape, with six inflection points. To generate pseudo-random numbers from the WSLOG distribution, we solved Equation (12) considering $m = 6$ and $s = 10$ , obtaining $c = 3.284$ .
Scenario B, $θ = (30, 3, - 3, 5)$: this choice of $θ$ leads to a PDF with an asymmetric bimodal shape. In this case, we solved Equation (12) considering $m = 18$ and $s = 5.4$, obtaining $c = 2.0436$.

The values of the parameters that give rise to Scenarios A and B were selected through a graphical inspection of the WSLOG PDF, so that the simulation considers pseudo-random numbers that come from an asymmetric unimodal population (Scenario A) and an asymmetric bimodal population (Scenario B).

Figure 10 presents the WSLOG PDF curves with the envelope functions considered in both simulation scenarios.

3.2.3. Results

We generated 1000 pseudo-random samples from the WSLOG distribution for each of the scenarios described above, under the sample sizes n = 100, 150, 200, …, 750. Subsequently, we obtained the ML estimates with their corresponding standard errors and squared errors.

Figure 11 and Figure 12 illustrate the behavior of the mean estimate (Mean), standard deviation (SD), root mean square error (RMSE), asymptotic standard error (SE), and coverage probability (CP) of the 95% asymptotic confidence intervals that were obtained under the different sample sizes considered in the two simulation scenarios. In the figures, it can be seen that, as the sample size increases, the mean estimates get closer to the true values of the parameters and the SDs, SEs, and RMSEs decrease and tend to be close to each other, which suggests the consistency of the estimators. In addition, we observe that the CPs approach the nominal level as the sample size increases.

4. Data Analysis

Environmental data allow us to monitor the state of our natural environment, including air quality, water quality, biodiversity, and temperature, among other factors. Continuous probability distributions have played an important role in the analysis of problems involving environmental variables. For example, Kassem et al. [17] investigated the wind characteristics and available wind energy for three urban regions in Northern Cyprus using the Weibull distribution; Rad et al. [18] proposed a mixture of multimodal skewed von Mises distributions to model the wind direction in different geographic regions of interest under a Bayesian approach; Suleiman et al. [19] proposed the odd beta prime-logistic distribution to evaluate magnesium concentrations for groundwater quality.

This section illustrates the usefulness of the WSLOG distribution by fitting two real datasets associated with environmental variables.

4.1. Fitted Distributions

We compared the performance of the WSLOG distribution with that of the logistic mixture model and other bimodal distributions popular in the literature. Below is the PDF of each distribution considered:

Logistic mixture (MLOG) PDF [5]
$\begin{matrix} f (x; μ, σ, μ_{2}, σ_{2}, α) = \frac{α}{σ} \frac{e^{\frac{x - μ}{σ}}}{{(1 + e^{\frac{x - μ}{σ}})}^{2}} + \frac{1 - α}{σ_{2}} \frac{e^{\frac{x - μ_{2}}{σ_{2}}}}{{(1 + e^{\frac{x - μ_{2}}{σ_{2}}})}^{2}}, x \in R, \end{matrix}$
where $μ, μ_{2} \in R$ are the locations of the mixture components, $σ, σ_{2} > 0$ are the scales of the mixture components, and $α \in (0, 1)$ is the mixing parameter.
Skew bimodal-logistic (SBLOG) PDF [20]
$\begin{matrix} f (x; μ, σ, λ, α) = \frac{6}{σ^{3}} (\frac{σ^{2} + α {(x - μ)}^{2}}{3 + π^{2} α}) \frac{e^{\frac{x - μ}{σ}}}{{(1 + e^{\frac{x - μ}{σ}})}^{2}} \frac{e^{λ \frac{x - μ}{σ}}}{(1 + e^{λ \frac{x - μ}{σ}})}, x \in R, \end{matrix}$
where $μ \in R$ is a location parameter, $σ > 0$ is a scale parameter, $λ \in R$ , and $α > 0$ are shape parameters that control skewness and uni/bi-modality.
Skew flexible normal (SFN) PDF [9]
$\begin{matrix} f (x; μ, σ, λ, α) = \frac{c_{α}}{σ} ϕ (|\frac{x - μ}{σ}| + α) Φ (λ \frac{x - μ}{σ}), x \in R, \end{matrix}$
where $μ \in R$ is a location parameter, $σ > 0$ is a scale parameter, $α, λ \in R$ are shape parameters that control skewness and uni/bi-modality, $c_{α}^{- 1} = [1 - Φ (α)]$ is a normalizing constant, and $ϕ (\cdot)$ and $Φ (\cdot)$ are the PDF and CDF of the standard normal distribution, respectively.
Asymmetric bimodal power normal (ABPN) PDF [11]
$\begin{matrix} f (x; μ, σ, λ, α) = \frac{2 α c_{α}}{σ} ϕ (\frac{x - μ}{σ}) Φ {(|\frac{x - μ}{σ}|)}^{α - 1} Φ (λ \frac{x - μ}{σ}), x \in R, \end{matrix}$
where $μ \in R$ is a location parameter, $σ > 0$ is a scale parameter, $λ \in R$ and $α > 0$ are shape parameters that control skewness and uni/bi-modality, $c_{α} = 2^{α - 1} / (2^{α} - 1)$ is a normalizing constant, and $ϕ (\cdot)$ and $Φ (\cdot)$ are the PDF and CDF of the standard normal distribution, respectively.

The PDFs described above exhibit uni/bimodal asymmetric shapes depending on the values assumed by their shape parameters. Note, however, that the MLOG distribution has a larger parameter dimension, which translates into more shape flexibility for the PDF, but at the cost of investing more effort in parameter estimation. On the other hand, the SBLOG, SFN, and ABPN distributions have the same parameter dimension as the WSLOG distribution, so the WSLOG distribution can be considered a natural alternative for data modeling in scenarios where the SBLOG, SFN, and ABPN distributions are employed.

4.2. Fit Measurements

We used the Anderson–Darling (AD) and Cramér–von Mises (CvM) goodness-of-fit tests to assess the quality of fit of the WSLOG distribution. For this, we used the functions goftest::ad.test() and goftest::cvm.test() in the R programming language, see [21]. To assess comparative performance, we considered the Akaike Information Criterion (AIC) [22], the Consistent Akaike Information Criterion (CAIC) [23], and the Bayesian Information Criterion (BIC) [24]. These criteria are defined as $AIC = 2 k - 2 ℓ$, $CAIC = k [\log (n) + 1] - 2 ℓ$, $BIC = k \log (n) - 2 ℓ$, respectively, where k is the number of parameters to be estimated, n is the sample size, and ℓ is the maximum log-likelihood value. For all three criteria, smaller values indicate a better trade-off between fit and model complexity.
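The three criteria defined above are simple functions of the maximized log-likelihood, as the following Python sketch shows. The function name `information_criteria` and the numerical inputs are hypothetical and serve only to illustrate the formulas; they are not results from the article.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, CAIC and BIC from the maximized log-likelihood.

    loglik: maximized log-likelihood value
    k:      number of estimated parameters
    n:      sample size
    """
    aic = 2 * k - 2 * loglik
    caic = k * (math.log(n) + 1) - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    return {"AIC": aic, "CAIC": caic, "BIC": bic}


# Hypothetical example: a four-parameter model fitted to 481 observations.
print(information_criteria(loglik=-1500.0, k=4, n=481))
```

Because CAIC and BIC differ only by the additive term $k$, they rank models with the same number of parameters identically.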

4.3. Unimodality Test

To better understand the behavior of the empirical frequency distributions, and to justify postulating the WSLOG distribution as a viable candidate for data modeling, we tested the hypothesis $H_{0}$: the data have exactly one mode, versus the hypothesis $H_{1}$: the data have at least two modes. For this, we considered the excess mass test using the function multimode::modetest() in the R programming language, see [25,26].

4.4. Data Description

Details of the analyzed datasets are provided below.

4.4.1. Geometric Features of Pollen Grains

The pollen grain is a fundamental structure in the reproduction of flowering plants. Its cellular and structural composition allows it to transport male genetic material to the female reproductive organs of the flower. Geometric characteristics of the pollen grain, such as its shape, size, and presence of surface structures, play an important role in studies on the growth of flowering plants. For this reason, we studied 481 observations on the measurement of a surface feature (ridge) along a direction in a certain type of pollen grain. These data are available on the website http://lib.stat.cmu.edu/datasets/pollen.data, accessed on 20 March 2024. Specifically, the data correspond to the observations of the variable “Ridge” in the data file POLLEN1.DAT.

Computing some descriptive measures, we found that the minimum and maximum observations were −19.4687 and 21.4066, respectively, and that Fisher’s skewness coefficient was 0.0246, indicating only slight positive skewness. The excess mass test gave an observed statistic of 0.219 with a p-value of 0.164, so at the 5% significance level we did not reject the hypothesis that the frequency distribution of the data is unimodal. Taking these results into account, we explored the performance of the WSLOG distribution in fitting this dataset.
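For readers who wish to reproduce this kind of descriptive summary, the sketch below computes a moment-based skewness coefficient, $g_{1} = m_{3} / m_{2}^{3 / 2}$ , where $m_{2}$ and $m_{3}$ are the second and third sample central moments with divisor n. This is an assumption: the text does not state which sample estimator of Fisher’s coefficient was used. The function name and the small samples are hypothetical and serve only as an illustration.

```python
def fisher_skewness(sample):
    """Moment-based skewness g1 = m3 / m2**1.5, using central
    moments computed with divisor n (no bias correction)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    return m3 / m2 ** 1.5

# Hypothetical toy samples, not taken from the datasets analyzed here.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
left_tailed = [-10.0, 0.0, 1.0, 2.0, 3.0]

print(fisher_skewness(symmetric))        # 0.0
print(fisher_skewness(left_tailed) < 0)  # True
```

A value close to zero, such as the 0.0246 reported for the Ridge variable, corresponds to a nearly symmetric sample, whereas a clearly negative value corresponds to a longer left tail.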

4.4.2. Temperature of Administrative Regions with Prevalence of Dengue

Dengue is a viral disease transmitted mainly by mosquitoes of the Aedes genus, endemic in many tropical and subtropical areas of the world. Ambient temperature plays a crucial role in the spread and incidence of dengue, since it affects both the vector mosquitoes and the virus; high temperatures increase the rate of development and reproduction of mosquitoes and favor the replication of the virus in them. We analyzed a dataset containing 1998 observations on the average temperature in administrative areas where occurrences of dengue virus have been documented. For more information regarding this dataset, please consult [27], or access it within the “dengue” database under the identifier “temp” in the R software, version 4.3.1, as indicated in [28].

In the excess mass test, the observed statistic was 0.048, with a p-value less than 0.01, which led us to reject unimodality at the 1% significance level and conclude that the temperature data are at least bimodal. Given also that Fisher’s skewness coefficient was −0.728, indicating negative skewness, we consider the WSLOG distribution a viable alternative for fitting the temperature data.

4.5. Results

Table 1 reports the ML estimates of the parameters of the distributions fitted to the data described in Section 4.4, together with the corresponding AIC, CAIC, and BIC values. The table shows that the WSLOG distribution attains the lowest values of AIC, CAIC, and BIC for both datasets, which suggests that this distribution should be selected to fit them.

Table 2 shows the results of the AD and CvM goodness-of-fit tests for the WSLOG distribution fitted to the data. All four p-values are well above 0.01, so at the 1% significance level we do not reject the hypothesis that the data come from a WSLOG population with the parameters specified in Table 1. Figure 13 shows the histograms of the datasets along with the fitted PDFs; the observed relative frequencies closely align with the density values of the WSLOG distribution.
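To make the statistics in Table 2 more concrete, the following Python sketch evaluates the standard computing formulas for the CvM statistic, $W^{2} = \frac{1}{12 n} + \sum_{i = 1}^{n} [u_{(i)} - \frac{2 i - 1}{2 n}]^{2}$ , and the AD statistic, $A^{2} = - n - \frac{1}{n} \sum_{i = 1}^{n} (2 i - 1) [\log u_{(i)} + \log (1 - u_{(n + 1 - i)})]$ , where $u_{(i)}$ is the hypothesized CDF evaluated at the i-th order statistic. This is only an illustration under stated assumptions: a standard logistic CDF is used as a stand-in for the fitted WSLOG CDF, the samples are synthetic, all function names are hypothetical, and the calculation of p-values (handled by the goftest functions in R) is not covered.

```python
import math

def logistic_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the logistic distribution (stand-in for a fitted model CDF)."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / sigma))

def logistic_quantile(p, mu=0.0, sigma=1.0):
    """Inverse of logistic_cdf, used only to build illustrative samples."""
    return mu + sigma * math.log(p / (1.0 - p))

def cvm_statistic(sample, cdf):
    """Cramér-von Mises statistic W^2 for a fully specified CDF."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    # Index i runs from 0, so (2 * i + 1) plays the role of (2i - 1).
    return 1.0 / (12 * n) + sum(
        (u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)
    )

def ad_statistic(sample, cdf):
    """Anderson-Darling statistic A^2 for a fully specified CDF."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n

# Synthetic samples: one placed at evenly spaced logistic quantiles,
# and a copy shifted by 2 units that the standard logistic fits poorly.
n = 20
matched = [logistic_quantile((2 * i + 1) / (2 * n)) for i in range(n)]
shifted = [x + 2.0 for x in matched]

print(cvm_statistic(matched, logistic_cdf), cvm_statistic(shifted, logistic_cdf))
print(ad_statistic(matched, logistic_cdf), ad_statistic(shifted, logistic_cdf))
```

Both statistics are small for the sample that matches the hypothesized CDF and considerably larger for the shifted sample, which is the behavior the tests exploit: large values are evidence against the fitted model.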

The datasets and results presented in this section can be accessed at https://github.com/isaaccortes1989/WSLOG-ARTICLE, accessed on 20 March 2024.

5. Final Comments

We have proposed an extension of the skew-logistic distribution capable of presenting asymmetric shapes with up to three modes for its PDF. The new distribution corresponds to a weighted version of the skew-logistic distribution, defined by a weight function that induces an extra shape parameter. Thus, the new distribution, which we call the weighted skew-logistic (WSLOG) distribution, has two shape parameters that lead to a more flexible PDF than the skew-logistic PDF. This flexibility overcomes the restriction of the skew-logistic distribution to unimodal data, since the WSLOG distribution is capable of modeling data that exhibit bimodality and even trimodality.

We described some possible application scenarios of the WSLOG distribution, of which we highlight its possible use in the modeling of bimodal data associated with environmental variables. We also derived fundamental properties such as the cumulative distribution function and raw moments. In addition, we described the behavior of the skewness of the distribution and determined the approximate regions of uni-, bi-, and trimodality in the two-dimensional parameter space associated with the shape parameters of the distribution. Furthermore, we observed that the shape parameter inherited from the skew-logistic distribution had an important effect on the skewness of the distribution, while the shape parameter induced by the weight function had an important effect on the number of modes that the distribution presents.

We explored the issue of parameter estimation for the WSLOG distribution using the maximum likelihood approach. We carried out a simulation study to empirically evaluate the performance of the estimators. Overall, we found that the maximum likelihood method yields satisfactory results.

Finally, we presented two applications to real data associated with environmental variables: ambient temperature in administrative regions in which dengue virus cases have been recorded, and the measurement of a surface characteristic of a certain type of pollen grain. The performance of the WSLOG distribution was compared to that of some bimodal skewed distributions in the literature, including a logistic mixture model. The results suggest that the WSLOG distribution performs better in fitting both datasets, illustrating the utility of the new distribution in modeling environmental data.

Author Contributions

Conceptualization, J.R.; methodology, I.C., J.R. and Y.A.I.; software, I.C. and Y.A.I.; validation, I.C., J.R. and Y.A.I.; formal analysis, I.C., J.R. and Y.A.I.; investigation, I.C., J.R. and Y.A.I.; writing—original draft preparation, I.C., J.R. and Y.A.I.; writing—review and editing, Y.A.I. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: https://github.com/isaaccortes1989/WSLOG-ARTICLE/blob/main/Application%201.R, accessed on 20 March 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

WSLOG	Weighted skew-logistic
MLOG	Mixture logistic
SBLOG	Skew bimodal-logistic
SFN	Skew flexible normal
ABPN	Asymmetric bimodal power normal
AIC	Akaike information criterion
CAIC	Corrected Akaike information criterion
BIC	Bayesian information criterion

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables

Figure 1. PDF curves of the WSLOG distribution with [Forumla omitted. See PDF.], [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] (solid line), [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] (dashed line), and [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] (dotted line).

Figure 2. Plot of the HRF of the WSLOG distribution with [Forumla omitted. See PDF.], [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] in the left panel and [Forumla omitted. See PDF.], [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] in the right panel.

Figure 3. Profiles of the critical points and inflection points equations and the WSLOG PDF curve for [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 4. Profiles of the critical points and inflection points equations and the WSLOG PDF curve for [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 5. Profiles of the critical points and inflection points equations and the WSLOG PDF curve for [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 6. Profiles of the critical points and inflection points equations and the WSLOG PDF curve for [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 7. Profiles of the critical points and inflection points equations and the WSLOG PDF curve for [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.].

Figure 8. Shape description for WSLOG PDF with different values of [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.]; unimodal (⋄), bimodal (·) and trimodal (+).

Figure 9. Left panel: A 3-D perspective of Fisher’s skewness coefficient of the WSLOG distribution. Right panel: A 2-D perspective of Fisher’s skewness coefficient of the WSLOG distribution with [Forumla omitted. See PDF.] (solid line), [Forumla omitted. See PDF.] (black dashed line), [Forumla omitted. See PDF.] (black dotted line), [Forumla omitted. See PDF.] (grey dashed line), and [Forumla omitted. See PDF.] (grey dotted line).

Figure 10. The WSLOG PDFs and envelope functions considered in scenarios A and B.

Figure 11. Mean, SD, SE, RMSE, and CP obtained in scenario A under the different sample sizes considered.

Figure 12. Mean, SD, SE, RMSE, and CP obtained in scenario B under the different sample sizes considered.

Figure 13. Histogram of the data described in Section 4.4 and the fitted PDFs.

Table 1

ML estimates for the parameters of the distributions fitted to the data described in Section 4.4.

Ridge Variable
Estimate	WSLOG	MLOG	SBLOG	SFN	ABPN
$\hat{μ}$	−1.458 (0.396)	−1.207 (1.050)	2.612 (0.796)	2.464 (0.725)	−1.294 (0.156)
$\hat{σ}$	1.558 (0.065)	3.231 (0.307)	2.241 (0.150)	6.176 (0.543)	7.528 (0.831)
${\hat{μ}}_{2}$	-	7.279 (1.702)	-	-	-
${\hat{σ}}_{2}$	-	1.996 (0.801)	-	-	-
$\hat{λ}$	0.182 (0.067)	-	−0.264 (0.107)	−0.417 (0.158)	0.353 (0.088)
$\hat{α}$	0.255 (0.034)	0.831 (0.135)	0.376 (0.092)	−0.214 (0.189)	0.363 (0.270)
AIC	3141.8	3151.9	3148.4	3150.8	3149.5
CAIC	3162.5	3177.8	3169.1	3171.5	3170.2
BIC	3158.5	3172.8	3165.1	3167.5	3166.2
Temp Variable
$\hat{μ}$	27.604 (0.082)	12.529 (0.395)	15.736 (0.206)	28.177 (0.092)	28.120 (0.093)
$\hat{σ}$	2.746 (0.046)	3.842 (0.141)	2.553 (0.040)	16.202 (1.081)	14.010 (0.375)
${\hat{μ}}_{2}$	-	25.230 (0.131)	-	-	-
${\hat{σ}}_{2}$	-	1.248 (0.071)	-	-	-
$\hat{λ}$	−5.948 (0.474)	-	0.202 (0.022)	−22.942 (2.461)	−19.224 (1.856)
$\hat{α}$	0.119 (0.007)	0.530 (0.021)	1.671 (0.187)	0.644 (0.178)	$6.967 \times 10^{- 4}$ (0.189)
AIC	13,031.1	13,182.3	13,650.2	13,220.1	13,205.2
CAIC	13,057.5	13,215.3	13,676.6	13,246.5	13,231.6
BIC	13,053.5	13,210.3	13,672.6	13,242.5	13,227.6

Table 2

Goodness-of-fit tests for the WSLOG distribution fitted to the data described in Section 4.4.

Data	AD statistic	AD $p$-value	CvM statistic	CvM $p$-value
Ridge variable	3.578	0.272	0.703	0.228
Temp variable	3.076	0.683	0.530	0.778

References

1. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat.; 1985; 12, pp. 171-178.

2. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions; John Wiley & Sons: Hoboken, NJ, USA, 1995; Volume 2.

3. Nadarajah, S. The skew logistic distribution. AStA Adv. Stat. Anal.; 2009; 93, pp. 187-203. [DOI: https://dx.doi.org/10.1007/s10182-009-0105-6]

4. Gupta, R.D.; Kundu, D. Generalized logistic distributions. J. Appl. Stat. Sci.; 2010; 18, 51.

5. Rost, J. Logistic Mixture Models; Springer: New York, NY, USA, 1997; pp. 449-463.

6. Aitkin, M.; Rubin, D.B. Estimation and hypothesis testing in finite mixture models. J. R. Stat. Soc. Ser. B; 1985; 47, pp. 67-75. [DOI: https://dx.doi.org/10.1111/j.2517-6161.1985.tb01331.x]

7. McLachlan, G.J.; Lee, S.X.; Rathnayake, S.I. Finite mixture models. Annu. Rev. Stat. Its Appl.; 2019; 6, pp. 355-378. [DOI: https://dx.doi.org/10.1146/annurev-statistics-031017-100325]

8. Elal-Olivero, D. Alpha-skew-normal distribution. Proyecciones (Antofagasta); 2010; 29, pp. 224-240. [DOI: https://dx.doi.org/10.4067/S0716-09172010000300006]

9. Gómez, H.W.; Elal-Olivero, D.; Salinas, H.S.; Bolfarine, H. Bimodal extension based on the skew-normal distribution with application to pollen data. Environmetrics; 2011; 22, pp. 50-62. [DOI: https://dx.doi.org/10.1002/env.1026]

10. Venegas, O.; Salinas, H.S.; Gallardo, D.I.; Bolfarine, H.; Gómez, H.W. Bimodality based on the generalized skew-normal distribution. J. Stat. Comput. Simul.; 2018; 88, pp. 156-181. [DOI: https://dx.doi.org/10.1080/00949655.2017.1381698]

11. Bolfarine, H.; Martínez-Flórez, G.; Salinas, H.S. Bimodal symmetric-asymmetric power-normal families. Commun. Stat. Theory Methods; 2018; 47, pp. 259-276. [DOI: https://dx.doi.org/10.1080/03610926.2013.765475]

12. Fisher, R.A. The effect of methods of ascertainment upon the estimation of frequencies. Ann. Eugen.; 1934; 6, pp. 13-25. [DOI: https://dx.doi.org/10.1111/j.1469-1809.1934.tb02105.x]

13. Rao, C.R. On discrete distributions arising out of methods of ascertainment. Sankhyā Indian J. Stat. Ser. A; 1965; 27, pp. 311-324.

14. Carta, J.A.; Ramírez, P. Use of finite mixture distribution models in the analysis of wind energy in the Canarian Archipelago. Energy Convers. Manag.; 2007; 48, pp. 281-291. [DOI: https://dx.doi.org/10.1016/j.enconman.2006.04.004]

15. Zhao, Z.; Zhang, X. Evaluation of methods to detect and quantify the bimodal precipitation over Central America and Mexico. Int. J. Climatol.; 2021; 41, pp. E897-E911. [DOI: https://dx.doi.org/10.1002/joc.6736]

16. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.

17. Kassem, Y.; Al Zoubi, R.; Gökçekuş, H. The possibility of generating electricity using small-scale wind turbines and solar photovoltaic systems for households in Northern Cyprus: A comparative study. Environments; 2019; 6, 47. [DOI: https://dx.doi.org/10.3390/environments6040047]

18. Rad, N.N.; Bekker, A.; Arashi, M. Enhancing wind direction prediction of South Africa wind energy hotspots with Bayesian mixture modeling. Sci. Rep.; 2022; 12, 11442. [DOI: https://dx.doi.org/10.1038/s41598-022-14383-8] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35794177]

19. Suleiman, A.A.; Daud, H.; Singh, N.S.S.; Othman, M.; Ishaq, A.I.; Sokkalingam, R. A novel odd beta prime-logistic distribution: Desirable mathematical properties and applications to engineering and environmental data. Sustainability; 2023; 15, 10239. [DOI: https://dx.doi.org/10.3390/su151310239]

20. Cortés, I.E.; Venegas, O.; Gómez, H.W. A Symmetric/Asymmetric Bimodal Extension Based on the Logistic Distribution: Properties, Simulation and Applications. Mathematics; 2022; 10, 1968. [DOI: https://dx.doi.org/10.3390/math10121968]

21. Faraway, J.; Marsaglia, G.; Marsaglia, J.; Baddeley, A. goftest: Classical Goodness-of-Fit Tests for Univariate Distributions; R Package Version 1.2-3; R Foundation for Statistical Computing: Vienna, Austria, 2021.

22. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control; 1974; 19, pp. 716-723. [DOI: https://dx.doi.org/10.1109/TAC.1974.1100705]

23. Bozdogan, H. Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika; 1987; 52, pp. 345-370. [DOI: https://dx.doi.org/10.1007/BF02294361]

24. Schwarz, G. Estimating the dimension of a model. Ann. Stat.; 1978; 6, pp. 461-464. [DOI: https://dx.doi.org/10.1214/aos/1176344136]

25. Ameijeiras-Alonso, J.; Crujeiras, R.M.; Rodríguez-Casal, A. Mode testing, critical bandwidth and excess mass. Test; 2019; 28, pp. 900-919. [DOI: https://dx.doi.org/10.1007/s11749-018-0611-5]

26. Ameijeiras-Alonso, J.; Crujeiras, R.M.; Rodríguez-Casal, A. Multimode: An R package for mode assessment. arXiv; 2018; arXiv:1803.00472. [DOI: https://dx.doi.org/10.18637/jss.v097.i09]

27. Hales, S.; De Wet, N.; Maindonald, J.; Woodward, A. Potential effect of population and climate changes on global distribution of dengue fever: An empirical model. Lancet; 2002; 360, pp. 830-834. [DOI: https://dx.doi.org/10.1016/S0140-6736(02)09964-6] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/12243917]

28. Maindonald, J.H.; Braun, W.J. Data Analysis and Graphics Using R: An Example-Based Approach; 3rd ed.; Cambridge University Press: Cambridge, UK, 2011.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract

Skewness and bimodality properties are frequently observed when analyzing environmental data such as wind speeds, precipitation levels, and ambient temperatures. As an alternative to modeling data exhibiting these properties, we propose a flexible extension of the skew-logistic distribution. The proposal corresponds to a weighted version of the skewed logistic distribution, defined by a parametric weight function that allows shapes with up to three modes for the resulting density. Parameter estimation via the maximum likelihood approach is discussed. Simulation experiments are carried out to evaluate the performance of the estimators. Applications to environmental data illustrating the utility of the proposal are presented.

Details

Title

A Weighted Skew-Logistic Distribution with Applications to Environmental Data

Author

Cortés, Isaac¹; Reyes, Jimmy²; Iriarte, Yuri A.²

¹ Facultad de Ciencias Básicas, Universidad Arturo Prat, Avenida Arturo Prat 2120, Iquique 1110939, Chile
² Departamento de Estadística y Ciencia de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile

First page

1287

Publication year

2024

Publication date

2024

Publisher

MDPI AG

e-ISSN

22277390

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/math12091287
