1. Introduction
For a continuous random variable with density f, the classical differential (Shannon) entropy is defined by

$$h(f) = -\int_{-\infty}^{\infty} f(x)\log f(x)\,dx. \qquad (1)$$
Maximizing (1) with respect to f under constraints on observed power moments or L-moments gives maximum entropy (ME) densities (see, e.g., [1,2]). Such an ME solution represents the distributional model that is compatible with exactly the information contained in the fixed constraints. Deriving ME densities is an important task: they are the only reasonable distributions to use for estimation, since choosing a lower-entropy distribution would mean assuming information that we do not possess. However, (1) has some shortcomings as a measure of information; for example, it can be negative for certain densities [3]. Nevertheless, much of the literature has been concerned with the ME task, not only for the classical differential entropy, but also for the cumulative residual entropy, the cumulative (direct) entropy, the cumulative paired entropy, and further entropies obtained by modifying the generating function. A short literature review follows.
Substituting the density in (1) by the survival function leads to the cumulative residual (Shannon) entropy. Rao et al. [4] were the first to discuss this new entropy. The discussion arose in the context of reliability theory, where only random variables with non-negative support are of interest and the survival function is the natural distributional concept. The solution of the maximum entropy task under power moment constraints was already discussed by [4,5], who used the log-sum inequality to derive the ME solution instead of the usual approach based on the Euler–Lagrange equations. The exponential and, more generally, the Weibull distribution are solutions to these special ME tasks. In the following years, several authors [6,7,8,9,10,11,12,13,14,15,16,17] also focused on cumulative residual entropies. In particular, Drissi et al. [10] were concerned with the ME problem. They considered random variables with support on the whole real line and derived the logistic distribution as the ME solution under the additional constraint that the ME solution has to be symmetric. Di Crescenzo [18] applied (1) to the distribution function and called the result 'cumulative entropy'. Based on early results of De Luca and Pal [19,20] in fuzzy set theory concerning membership functions, Li et al. [21] defined a further entropy concept for so-called uncertainty variables, which are similar but not identical to random variables. The main idea here is to consider, in (1), the distribution function as well as the survival function. The corresponding ME task was discussed by [22,23,24]. Many studies in the literature are concerned with generalizations of (1). Some authors [25,26,27] modified the entropy generating function. General generating functions were considered by [28,29,30] for the definition of the related concept of f-divergence [31]; f-divergences are sometimes also called φ-divergences [32]. Zografos [33] generalized the cumulative residual Shannon entropy in a similar way. Klein et al. [34] combined the ME task known from uncertainty theory with the use of general entropy generating functions. They derived the Tukey distribution as an ME distribution when the entropy generating function of [25] is applied together with the distribution and the survival function. They introduced the term 'cumulative paired entropy', analogous to the paired entropy introduced by [35]. Recent publications [36,37] applied the Havrda-Charvát approach to the survival function under the name 'cumulative Tsallis entropy' of a given order. This term refers to the famous paper [38], which gives a physical foundation for the Havrda-Charvát approach.
In this paper, the first research question is to clarify what kind of information cumulative entropies really measure. To this end, we introduce the concept of 'contradictory information' in contrast to 'no information'. The second research question is to unify the diverse approaches to cumulative entropies and their maximization. For this purpose, general cumulative entropies will be introduced; all known variants of cumulative entropies are special cases of this general class. After deriving two general formulas for ME quantile functions under moment restrictions, we apply these formulas to derive ME distributions for new cumulative entropies (like the cumulative Mielke(r) entropy) as well as to identify the cumulative entropy for some flexible families of distributions that allow for skewness (like the generalized Tukey or the generalized logistic distribution). As a byproduct, we find some new families of distributions (like a special skewed Tukey distribution and a generalized Weibull distribution). The results are summarized in a table and discussed in detail in Appendix A. The last research question starts with observed data and tries to estimate the cumulative entropy in such a way that the data come from the corresponding ME distribution. This gives an alternative to non-parametric estimation of density functions or distribution functions.
This paper is organized in line with these research questions. Section 2 starts with the discussion of contradictory information and cumulative entropies in principle. In Section 3, we introduce general cumulative entropies and prove general results for ME distributions for cumulative entropies under different constraints. In Section 4, we propose an estimator for the ME generating function. Finally, we apply this estimator to real datasets. In Appendix A, we apply the theoretical results to seven families of cumulative entropies (MaxEnt) or families of distributions (GMaxEnt).
2. What does Maximizing Cumulative Direct, Residual, and Paired Shannon Entropies Mean?
In this section, we first discuss the concept of ‘contradictory information’ in contrast to no or minimum information and determine that contradictory information corresponds with U-shaped/bipolar distributions. Then, we learn that maximizing cumulative paired entropies best reflects this situation by comparing the results with those of maximizing differential, cumulative residual, and cumulative direct entropies. Next, we see that the cumulative residual and cumulative direct entropies do not correspond to a U-shaped distribution if the support of a random variable is only non-negative. Overall, in this section, the focus is on Shannon entropies. However, all insights can be transferred to arbitrary cumulative entropies immediately.
The traditional ME approach starts with the result that the uniform distribution has minimum information (= maximum entropy) under the sole constraint that the area under the density sums up to one. However, there is another concept of maximum entropy in fuzzy set theory [19] and uncertainty theory [24,39]. Transferring this concept to probability theory, maximum uncertainty represents the situation in which an event A with probability P(A) and its complementary event with probability 1 − P(A) are equally likely. This means that P(A) = 1 − P(A) = 1/2. Since the Shannon entropy

$$-P(A)\log P(A) - (1-P(A))\log(1-P(A))$$

is maximized for P(A) = 1/2, this kind of entropy could serve as the basis for an uncertainty measure. For a continuous random variable X, the ensemble of events {X ≤ x}, x ∈ ℝ, can be considered. It is natural to measure the amount of uncertainty of X by integrating the above expression with P(A) = P(X ≤ x) = F(x) over the whole real line. We set 0 log 0 = 0. Let F be the cumulative distribution function of X; then the cumulative paired Shannon entropy is defined by

$$\mathrm{CPE}(X) = -\int_{-\infty}^{\infty}\left[F(x)\log F(x) + (1-F(x))\log(1-F(x))\right]dx = -\int_0^1 \left[u\log u + (1-u)\log(1-u)\right] q(u)\,du \qquad (2)$$
with the probability integral transformation u = F(x), quantile function Q(u) = F⁻¹(u), and quantile density q(u) = Q′(u) = 1/f(Q(u)) for 0 < u < 1, where f denotes the density of X. If X has a compact support [a, b], CPE(X) attains its maximum for F(x) = 1/2, a ≤ x < b. This corresponds to a so-called bipolar distribution with P(X = a) = P(X = b) = 1/2. For this bipolar distribution, CPE(X) = (b − a) log 2 holds. Therefore, the cumulative paired Shannon entropy increases with the distance b − a. In contrast to this, the classical differential Shannon entropy (1) takes the value −∞ for all bipolar distributions, independently of how large the distance between the two mass points is. Rao [5] identified this property as an important advantage of cumulative entropies over the differential entropy. The different behaviors of the differential (Shannon) entropy and the cumulative paired Shannon entropy are illustrated in Example 1.

Example 1. We consider the symmetric beta distribution with density

$$f(x) = \frac{x^{\alpha-1}(1-x)^{\alpha-1}}{B(\alpha,\alpha)}, \qquad 0 < x < 1,$$
with parameter α > 0. This range allows almost bipolar distributions (α close to 0), the uniform distribution (α = 1), and bell-shaped distributions (α > 1). Figure 1 compares the values of the differential entropy and the cumulative paired Shannon entropy for this range of the parameter α. We see that the differential entropy is non-positive everywhere and attains its maximum for the uniform distribution (α = 1). In contrast, the cumulative paired Shannon entropy starts with its maximum value for an (almost) bipolar distribution and decreases monotonically as α increases.
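The comparison in Example 1 is easy to reproduce numerically. The following sketch (assuming NumPy and SciPy are available; the helper name cpe is ours) evaluates the differential entropy and the cumulative paired Shannon entropy (2) for the symmetric beta distribution over a range of α values.

```python
import numpy as np
from scipy import stats, integrate

def cpe(cdf, a, b):
    """Cumulative paired Shannon entropy (2): integral of
    -[F log F + (1-F) log(1-F)] over the support [a, b]."""
    def integrand(x):
        u = cdf(x)
        t1 = u * np.log(u) if u > 0 else 0.0           # convention 0 log 0 = 0
        t2 = (1 - u) * np.log(1 - u) if u < 1 else 0.0
        return -(t1 + t2)
    return integrate.quad(integrand, a, b)[0]

for alpha in (0.05, 0.5, 1.0, 2.0, 5.0):
    d = stats.beta(alpha, alpha)
    print(f"alpha={alpha:4.2f}  differential={float(d.entropy()):+.4f}  "
          f"CPE={cpe(d.cdf, 0.0, 1.0):.4f}")
```

For α close to 0, the CPE values approach log 2 ≈ 0.6931, the value of the bipolar distribution on [0, 1], while the differential entropy diverges to −∞.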
Since we do not want to assume information that we do not possess, we perform the ME task and search for the density with maximum entropy under the given constraints; only densities based on minimum information can reasonably be used. In this paper, we additionally single out bipolar distributions: they represent contradictory information, which is even less useful for prediction than minimum or no information.
The following examples intend to explain where bipolar distributions appear in real situations and how this bipolarity affects the predictability of a random variable X.
Example 2. In an opinion poll survey, individuals are asked to judge their political belief on a continuous left–right scale, where 0 (100) symbolizes an extremely left (right) political view. The survey's result maximizes the cumulative paired Shannon entropy if half of the people state that they are extremely left (0) and the other half state that they are extremely right (100). This is a situation of maximum uncertainty regarding the prediction of the political view of an individual person.
Example 3. If the task is to judge a product on a Likert scale with five ordered categories, the uniform distribution means that no category is favored by a majority of the voters. However, a voting result could be even more confusing than the uniform distribution. What can we learn from the extreme situation that half of the voters choose the best and the other half the worst category? What does this mean for a new customer considering buying the product? Buying would mean receiving either an excellent or a very bad product. This is the situation in which it is most complicated to predict the customer's decision.
Both situations of Examples 2 and 3 can be characterized by the term 'contradictory information', in contrast to minimum or no information. In general, information is able to reduce uncertainty. Contradictory information, in contrast, is implicitly defined by the fact that it increases uncertainty and provides a high chance of a wrong decision. (Anti-information is a related, but less formal concept introduced by the information scientist J. Verhoeff [40].) Therefore, as bipolar distributions lead to contradictory information, it is an important task to consider entropies that are maximized by a bipolar distribution if there are no constraints. Thus, we propose to use cumulative paired entropies to cover contradictory information. Example 1 already showed that the differential entropy does not capture contradictory information. In the following, we compare the information provided by the cumulative residual and cumulative direct entropies with that provided by the cumulative paired entropies. Rao et al. [4] introduced the cumulative residual entropy as a Shannon entropy in which the density is substituted by the survival function. Subsequently, [5,6,7,8,9,10,11,12,13,14,15,16,17] discussed this cumulative residual Shannon entropy:
$$\mathrm{CRE}(X) = -\int_{-\infty}^{\infty} (1-F(x))\log(1-F(x))\,dx. \qquad (3)$$
Di Crescenzo and Longobardi [41] applied the Shannon entropy to the distribution function and called the result cumulative entropy. This gives the formula

$$\mathrm{CE}(X) = -\int_{-\infty}^{\infty} F(x)\log F(x)\,dx. \qquad (4)$$
We will call (4) the cumulative direct Shannon entropy (CE) for a better distinction from the cumulative residual Shannon entropy (CRE) and the cumulative paired Shannon entropy (CPE). What does maximum entropy mean in the cases of the cumulative residual and cumulative direct Shannon entropy? The entropy generating function Φ(u) = −u log u attains its maximum 1/e at u = 1/e. If the support is [a, b], the maximum entropy distribution is again bipolar. However, this bipolarity is less extreme than in the symmetric case: for (3), it holds that P(X = a) = 1 − 1/e and P(X = b) = 1/e. Therefore, there is a preference for the alternative a, which makes the prediction of X easier than in the symmetric case. Still, we have somewhat contradictory information rather than useful information. Regarding (4), the probabilities for a and b have to be interchanged to get a maximum entropy distribution. The following example illustrates this for a beta distribution with parameters α and β.
Example 4. Let X be beta distributed with density

$$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}, \qquad 0 < x < 1.$$
In the following, we fix β and compute α such that (3) or (4) is maximized. In Table 1, 'max. CRE' and 'max. CE' denote the corresponding maximum values; the table also contains the maximizing values of α. We see that the maxima are attained for small values of α and β, corresponding to slightly asymmetric U-shaped beta distributions.
Figure 2 illustrates the CRE- and CE-maximizing beta distributions for the parameter settings displayed in Table 1.
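The entries of Table 1 can be checked numerically. The sketch below (our function names; SciPy assumed) maximizes the cumulative residual Shannon entropy (3) over α for a fixed β; replacing the survival function by the distribution function in the integrand gives the corresponding check for (4).

```python
import numpy as np
from scipy import stats, integrate, optimize

def cre_beta(alpha, beta):
    """Cumulative residual Shannon entropy (3) of a beta(alpha, beta) law."""
    cdf = stats.beta(alpha, beta).cdf
    def integrand(x):
        s = 1.0 - cdf(x)                      # survival function
        return -s * np.log(s) if 0.0 < s < 1.0 else 0.0
    return integrate.quad(integrand, 0.0, 1.0)[0]

beta = 1.0
res = optimize.minimize_scalar(lambda a: -cre_beta(a, beta),
                               bounds=(0.001, 5.0), method="bounded")
print(f"alpha* = {res.x:.3f}, max CRE = {-res.fun:.4f}")
# Expected from Table 1 for beta = 1: alpha* ≈ 0.48, max CRE ≈ 0.278.
```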
So far, the support has been an arbitrary bounded interval. However, in reliability theory, for example, the focus is on random variables with non-negative support, so it is important to discuss this situation as well. When considering only random variables with non-negative support for which the ME quantile function Q satisfies Q(0) = 0, maximizing (3) or (4) gives a distribution that is no longer U-shaped, and the maximum entropy situation no longer corresponds to contradictory information. We illustrate this in Example 5 using a special beta distribution. The parameter β will be set to 1, so that the density simplifies and Q(0) = 0 holds.
Example 5. Let X be beta distributed with density

$$f(x) = \alpha x^{\alpha-1}, \qquad 0 < x < 1.$$
Table 2 displays the values of CRE and CE for certain values of α. For α close to 0, we get an extremely right skewed and, for large α, an extremely left skewed distribution. For α = 0.48 (α = 1), CRE (CE) attains its maximum.
Figure 3 displays, in the top row, the CRE-maximizing distribution at α = 0.48 and shows that it is a compromise between an extremely right skewed and an extremely left skewed distribution. In the bottom row, we see the CE-maximizing distribution at α = 1, the uniform distribution.
The question raised in the section title, namely what maximizing cumulative direct, residual, and paired Shannon entropies means, can be answered as follows: maximizing these entropies leads to a more or less skewed U-shaped distribution as long as there are no special constraints (like Q(0) = 0) that prevent this. This U-shaped distribution corresponds to contradictory information. Examples 2 and 3 showed that this kind of information is even less useful for prediction and estimation than minimum or no information. Therefore, such distributions are the only reasonable ones to consider if we do not want to assume information that we do not possess.
In the following section, we unify the diverse approaches of cumulative entropies and introduce the general class of cumulative entropies. Then, we derive two general formulas for ME quantile functions under some restrictions.
3. Maximum Cumulative Entropy Distributions
In the following, we will first introduce the general class of cumulative entropies that incorporates and generalizes well-known entropies. Then, we will derive general formulas for maximum entropy distributions for this new class regarding arbitrary support as well as non-negative support.
3.1. General Class of Cumulative Entropies
In this section, we incorporate cumulative direct, residual, and paired entropies into one approach. Additionally, instead of focusing on the Shannon case, we allow for a general so-called entropy generating function Φ, which has to be non-negative and concave on [0, 1]. In general, but not necessarily, Φ has a maximum in the interior of the interval [0, 1]. The corresponding cumulative entropies are the cumulative paired Φ entropy
$$\mathrm{CPE}_\Phi(X) = \int_{-\infty}^{\infty} \left[\Phi(F(x)) + \Phi(1-F(x))\right] dx, \qquad (5)$$

the cumulative residual Φ entropy

$$\mathrm{CRE}_\Phi(X) = \int_{-\infty}^{\infty} \Phi(1-F(x))\,dx, \qquad (6)$$

and the cumulative direct Φ entropy

$$\mathrm{CE}_\Phi(X) = \int_{-\infty}^{\infty} \Phi(F(x))\,dx. \qquad (7)$$
To cover all three cases in one approach, we consider a general concave entropy generating function Φ that may be of the paired form Φ₀(u) + Φ₀(1 − u), the residual form Φ₀(1 − u), or the direct form Φ₀(u) for some generating function Φ₀, u ∈ [0, 1]. Then,

$$\mathrm{C\Phi E}(X) = \int_{-\infty}^{\infty} \Phi(F(x))\,dx = \int_0^1 \Phi(u)\,q(u)\,du$$

will be called the cumulative Φ entropy. For the maximum entropy task, the objective is to maximize this cumulative entropy with respect to F under distinct constraints, i.e., to search for the distribution that maximizes this entropy. At first, we consider cumulative entropies in a situation with fixed mean and variance. The restriction to these two moments can be explained by the fact that higher moments lead to equations for the ME quantile function that either cannot be solved explicitly or have no solution. Then, we discuss the same task with the additional requirement that Q(0) = 0. In this case, a solution can only exist for special relations between the fixed mean and the fixed k-th power moment.

In Section 3.2, we consider the situation where mean and variance are fixed; in Section 3.3, the situation with the additional requirement Q(0) = 0.
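As a small illustration of the unified definition, the following sketch (our function names; NumPy and SciPy assumed) builds the three Shannon-type generating functions from one Φ₀ and evaluates the cumulative Φ entropy through the quantile representation.

```python
import numpy as np
from scipy import integrate

def phi0(u):
    """Shannon generating function Phi_0(u) = -u log u, with 0 log 0 = 0."""
    u = np.asarray(u, dtype=float)
    out = np.zeros_like(u)
    pos = u > 0
    out[pos] = -u[pos] * np.log(u[pos])
    return out

paired   = lambda u: phi0(u) + phi0(1.0 - u)   # as in (5)
residual = lambda u: phi0(1.0 - u)             # as in (6)
direct   = lambda u: phi0(u)                   # as in (7)

def cumulative_phi_entropy(phi, Q, n=200_001):
    """Evaluate integral_0^1 phi(u) q(u) du with q = Q' by differences."""
    u = np.linspace(1e-9, 1.0 - 1e-9, n)
    q = np.gradient(Q(u), u)                   # quantile density
    return integrate.trapezoid(phi(u) * q, u)

# Check: for the uniform distribution on [0, 1] (Q(u) = u, q = 1), the
# direct and residual versions both give 1/4, matching Table 2.
print(cumulative_phi_entropy(direct, lambda u: u))
```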
3.2. General Results for Arbitrary Support
The focus in this section is on the situation where mean and variance are fixed and the support is arbitrary. First, the maximum cumulative entropy principle and then the generalized maximum cumulative entropy principle are introduced. General formulas for ME quantile functions are provided.
3.2.1. Maximum Cumulative Entropy Approach
In this section, for a given entropy and fixed constraints, general formulas for ME distributions are derived. This maximum cumulative entropy approach follows the maximum entropy principle in the sense of [42,43]. The following theorem provides a general formula for the ME quantile function Q.
Theorem 1. Let CΦE be the cumulative Φ entropy with concave entropy generating function Φ such that the derivative Φ′ exists and is square integrable over (0, 1), and Φ(0) = Φ(1) = 0 hold. Then, the ME distribution under the constraints of fixed mean μ and variance σ² is given by the quantile function

$$Q(u) = \mu - \sigma\,\frac{\Phi'(u)}{\sqrt{\int_0^1 \left(\Phi'(v)\right)^2 dv}}, \qquad 0 < u < 1. \qquad (8)$$
Proof. The objective function

$$\mathrm{C\Phi E}(X) = \int_0^1 \Phi(u)\,q(u)\,du$$

has to be maximized under the restrictions of fixed $\int_0^1 Q(u)\,du = \mu$ and $\int_0^1 Q(u)^2\,du = \mu^2 + \sigma^2$ with respect to the quantile function Q and the quantile density q = Q′. This leads to the Lagrange function

$$L(Q,q) = \int_0^1 \left[\Phi(u)\,q(u) - \lambda_1 Q(u) - \lambda_2 Q(u)^2\right] du,$$

with λ₁ and λ₂ denoting the Lagrange parameters. The Euler–Lagrange equation gives

$$\Phi'(u) + \lambda_1 + 2\lambda_2 Q(u) = 0.$$

Solving this equation leads to the quantile function Q(u) = −(λ₁ + Φ′(u))/(2λ₂); λ₁ and λ₂ are determined by the moments μ and μ² + σ². Since Φ(0) = Φ(1) = 0 implies $\int_0^1 \Phi'(v)\,dv = 0$, the mean constraint gives μ = −λ₁/(2λ₂). Rearranging leads to λ₁ = −2λ₂μ and

$$Q(u) = \mu - \frac{\Phi'(u)}{2\lambda_2}. \qquad (9)$$

From the variance constraint, $\sigma^2 = \int_0^1 (Q(u)-\mu)^2\,du = \left(\frac{1}{2\lambda_2}\right)^2 \int_0^1 (\Phi'(v))^2\,dv$. Solving with respect to 2λ₂ (taking the positive root, since Q has to be non-decreasing and Φ′ is non-increasing) leads to $2\lambda_2 = \frac{1}{\sigma}\sqrt{\int_0^1 (\Phi'(v))^2\,dv}$. Inserting into (9) gives the quantile function (8). □
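Formula (8) can be verified numerically. In the following sketch (our names; NumPy and SciPy assumed), Φ is the paired Shannon generating function, so the resulting Q should coincide with the logistic quantile function μ + σ(√3/π) ln(u/(1 − u)), the known ME solution of case no. 1 in Table 3.

```python
import numpy as np
from scipy import integrate

# Derivative of the paired Shannon generating function
# Phi(u) = -u log u - (1-u) log(1-u):  Phi'(u) = log((1-u)/u).
phi_prime = lambda u: np.log((1.0 - u) / u)

def me_quantile(phi_prime, mu, sigma, n=400_001):
    """ME quantile function (8) for fixed mean mu and variance sigma^2."""
    u = np.linspace(1e-8, 1.0 - 1e-8, n)
    norm = np.sqrt(integrate.trapezoid(phi_prime(u) ** 2, u))
    return lambda v: mu - sigma * phi_prime(v) / norm

Q = me_quantile(phi_prime, mu=0.0, sigma=1.0)
for v in (0.1, 0.25, 0.75, 0.9):
    logistic = np.sqrt(3.0) / np.pi * np.log(v / (1.0 - v))
    print(f"{Q(v):+.4f}  vs  {logistic:+.4f}")
```

Here the normalizing integral equals π²/3, so the logistic scale factor √3/π appears automatically.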
3.2.2. Generalized Maximum Cumulative Entropy Approach

In this section, for a given quantile function Q, the corresponding generating function of the cumulative entropy will be derived. This generalized maximum cumulative entropy approach follows the generalized maximum entropy principle formulated by [44]. We again use formula (8) for this approach. For a simpler notation, we introduce the partial mean function. Let X be the random variable corresponding to Q, with density f. The partial mean function is given by

$$M(u) = \int_{-\infty}^{Q(u)} x\,f(x)\,dx = \int_0^u Q(v)\,dv, \qquad 0 \le u \le 1. \qquad (10)$$

Obviously, M(0) = 0 and M(1) = μ hold. The following corollary states that the partial mean function determines the entropy generating function Φ such that Q is the ME quantile function under the constraints of given mean μ and variance σ².
Corollary 1. Let Q be a quantile function with mean μ. The entropy generating function Φ, such that Q is ME under the constraints of given mean and variance, is given by Φ(u) = μu − M(u), u ∈ [0, 1].
Proof. By (8), Q is ME for Φ if Φ′ is a negative multiple of Q − μ. Setting the normalization $\sqrt{\int_0^1 (\Phi'(v))^2\,dv} = \sigma$, i.e., Φ′(u) = μ − Q(u), gives

$$\Phi(u) = \int_0^u \left(\mu - Q(v)\right) dv = \mu u - M(u). \qquad (11)$$

□
Hence, M(u)/u is the conditional mean of X given X ≤ Q(u) for 0 < u ≤ 1. It holds that M(0) = 0 and M(1) = μ, such that Φ(0) = 0 and Φ(1) = 0. The partial mean function therefore plays a special role. As M(u) accumulates the values x of X, weighted with the density, up to the u-quantile of X, the accumulated values stay constant up to the median for an extremely U-shaped distribution; thereafter, the value changes once and stays constant again. Thus, the heavier the tails of a distribution, the steeper the entropy generating function at 0 and 1; this leads to large absolute values of the derivative Φ′ at 0 and 1. If the support is the whole real line, then Φ′(0) = ∞ and Φ′(1) = −∞. In line with the generalized maximum entropy principle, we will use (11) to derive Φ such that a given distribution has the ME property under the constraints of fixed mean and variance. In Section 5, based on (11), we will propose an estimator for Φ.
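A numerical sketch of the GMaxEnt step (11) (our function names; SciPy assumed): for the standard normal quantile function, Φ(u) = −M(u) should equal the standard normal density evaluated at the u-quantile, since the derivative of −pdf(ppf(u)) is ppf(u).

```python
import numpy as np
from scipy import stats, integrate

def phi_from_quantile(Q, n=100_000):
    """GMaxEnt step (11): Phi(u) = mu*u - M(u) with M(u) = int_0^u Q(v) dv."""
    u = np.linspace(1e-6, 1.0 - 1e-6, n)       # avoid Q(0) = -inf, Q(1) = inf
    M = integrate.cumulative_trapezoid(Q(u), u, initial=0.0)
    mu = M[-1]                                 # approximates the mean
    return u, mu * u - M

u, phi = phi_from_quantile(stats.norm.ppf)
# For the standard normal, Phi(u) = pdf(ppf(u)) up to discretization error:
print(np.max(np.abs(phi - stats.norm.pdf(stats.norm.ppf(u)))))
```

The resulting Φ is non-negative, concave, and vanishes at both endpoints, as required of an entropy generating function.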
3.3. General Results for Non-Negative Support
To date, in the literature, the ME task has mainly been considered for lifetime distributions with the special property that the support is [0, ∞). Therefore, in this section, the focus is on the situation where, next to moment constraints, the support is restricted to [0, ∞). Analogous to Section 3.2, the maximum cumulative entropy principle and then the generalized maximum cumulative entropy principle will be discussed in this situation, and general formulas for ME quantile functions will be provided.
3.3.1. Maximum Cumulative Entropy Approach
In this section, for a given entropy and given constraints, a general formula for ME distributions will be derived, following the maximum cumulative entropy approach, while the support of the ME distribution is [0, ∞); this means that Q(0) = 0 holds for the ME quantile function Q. From this fact, we get an additional constraint for the ME task. As further constraints, we consider a fixed mean μ and a fixed k-th power moment μₖ = E[Xᵏ], k > 1. The following theorem shows how to derive the ME quantile function under these three constraints. For an ME solution to exist, a special relationship between the fixed moments μ and μₖ is required.
Theorem 2. Let Φ be a concave function on [0, 1] with derivative Φ′ such that Φ′(0) is finite and (Φ′(0) − Φ′(u))^{1/(k−1)} is monotonically increasing. Then, the ME quantile function under the constraints Q(0) = 0 and given mean μ and k-th power moment μₖ is

$$Q(u) = \mu_k^{1/k}\,\frac{\left(\Phi'(0) - \Phi'(u)\right)^{1/(k-1)}}{\left(\int_0^1 \left(\Phi'(0) - \Phi'(v)\right)^{k/(k-1)} dv\right)^{1/k}}, \qquad 0 \le u < 1, \qquad (12)$$

if

$$\frac{\mu}{\mu_k^{1/k}} = \frac{\int_0^1 \left(\Phi'(0) - \Phi'(u)\right)^{1/(k-1)} du}{\left(\int_0^1 \left(\Phi'(0) - \Phi'(v)\right)^{k/(k-1)} dv\right)^{1/k}}. \qquad (13)$$
Otherwise, there is no solution of the ME task.

Proof. Due to the Euler–Lagrange equation, it is

$$\Phi'(u) + \lambda_1 + k\lambda_2\,Q(u)^{k-1} = 0.$$

The constraint Q(0) = 0 leads to λ₁ = −Φ′(0), and Q can be derived as

$$Q(u) = \left(\frac{\Phi'(0) - \Phi'(u)}{k\lambda_2}\right)^{1/(k-1)}.$$

λ₂ can be derived from $\int_0^1 Q(u)^k\,du = \mu_k$ as

$$(k\lambda_2)^{1/(k-1)} = \frac{1}{\mu_k^{1/k}}\left(\int_0^1 \left(\Phi'(0) - \Phi'(v)\right)^{k/(k-1)} dv\right)^{1/k}.$$

Inserting into Q gives (12) immediately. There remains the third constraint

$$\int_0^1 Q(u)\,du = \mu.$$
Inserting (12) and dividing both sides by $\mu_k^{1/k}$ gives (13). □

In the most popular case, mean and variance are fixed. This means k = 2, μ₂ = μ² + σ²,

$$Q(u) = \sqrt{\mu^2+\sigma^2}\;\frac{\Phi'(0) - \Phi'(u)}{\left(\int_0^1 \left(\Phi'(0) - \Phi'(v)\right)^{2} dv\right)^{1/2}},$$

and

$$\frac{\mu}{\sqrt{\mu^2+\sigma^2}} = \frac{\int_0^1 \left(\Phi'(0) - \Phi'(u)\right) du}{\left(\int_0^1 \left(\Phi'(0) - \Phi'(v)\right)^{2} dv\right)^{1/2}}.$$
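As a numerical check of Theorem 2 (a sketch with our function names; SciPy assumed), take the residual Shannon generating function Φ(u) = −(1 − u) ln(1 − u) with Φ′(u) = ln(1 − u) + 1 and Φ′(0) = 1. Formula (12) should then return a Weibull quantile function with shape parameter r = k − 1, up to the scale fixed by μₖ.

```python
import numpy as np
from scipy import integrate

phi_prime = lambda u: np.log(1.0 - u) + 1.0    # residual Shannon, Phi'(0) = 1

def me_quantile_nonneg(phi_prime, mu_k, k, n=400_001):
    """ME quantile function (12) on [0, inf) for a fixed k-th moment mu_k."""
    u = np.linspace(0.0, 1.0 - 1e-9, n)
    g = phi_prime(0.0) - phi_prime(u)          # Phi'(0) - Phi'(u) >= 0
    denom = integrate.trapezoid(g ** (k / (k - 1.0)), u) ** (1.0 / k)
    return lambda v: (mu_k ** (1.0 / k)
                      * (phi_prime(0.0) - phi_prime(v)) ** (1.0 / (k - 1.0))
                      / denom)

k = 3.0
Q = me_quantile_nonneg(phi_prime, mu_k=1.0, k=k)
# Q(v) should be proportional to the Weibull quantile (-ln(1-v))^(1/(k-1)):
for v in (0.25, 0.5, 0.9):
    print(Q(v) / (-np.log(1.0 - v)) ** (1.0 / (k - 1.0)))   # constant ratio
```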
3.3.2. Generalized Maximum Cumulative Entropy Approach
It remains to discuss the generalized maximum cumulative entropy approach for random variables with non-negative support and Q(0) = 0. We start with the knowledge of the quantile function Q and derive the corresponding generating function Φ of the cumulative entropy such that Q is the ME quantile function under the constraints Q(0) = 0, fixed mean μ, and fixed k-th power moment μₖ. Therefore, we introduce a special partial mean function: M₍ₖ₋₁₎ denotes the partial (k−1)-th power mean function with

$$M_{k-1}(u) = \int_0^u Q(v)^{k-1}\,dv$$

for 0 ≤ u ≤ 1. This partial (k−1)-th power mean function is an important part of the entropy generating function, as the following corollary shows.

Corollary 2. Let Q be a quantile function with Q(0) = 0. The entropy generating function Φ, such that Q is ME under the constraints Q(0) = 0, fixed mean, and fixed k-th power moment, is given by

$$\Phi(u) = \mu_{k-1}\,u - M_{k-1}(u), \qquad 0 \le u \le 1,$$

with μ₍ₖ₋₁₎ = E[X^{k−1}].
Proof. Let X be the random variable corresponding to Q, with density f. From the Euler–Lagrange equation

$$\Phi'(u) + \lambda_1 + k\lambda_2\,Q(u)^{k-1} = 0,$$

we get, with the normalization kλ₂ = 1,

$$\Phi'(u) = -\lambda_1 - Q(u)^{k-1}.$$

It is easy to verify that Φ is concave. Under the assumption Φ(1) = 0, it is λ₁ = −μ₍ₖ₋₁₎ and

$$\Phi(u) = \mu_{k-1}\,u - M_{k-1}(u). \qquad (14)$$

□

In Section 5, we use (14) to estimate Φ from a data set such that the data are generated by the corresponding ME distribution under the constraints Q(0) = 0, fixed mean, and fixed k-th power moment.
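The following sketch (our names; SciPy assumed) runs Corollary 2 for a Weibull quantile function: with k = r + 1, formula (14) should recover the residual Shannon generating function scaled by λʳ, i.e., −λʳ(1 − u) ln(1 − u).

```python
import numpy as np
from scipy import integrate

def phi_from_quantile_nonneg(Q, k, n=200_000):
    """GMaxEnt step (14): Phi(u) = mu_{k-1}*u - M_{k-1}(u) with
    M_{k-1}(u) = int_0^u Q(v)^(k-1) dv, for Q with Q(0) = 0."""
    u = np.linspace(0.0, 1.0 - 1e-9, n)
    M = integrate.cumulative_trapezoid(Q(u) ** (k - 1.0), u, initial=0.0)
    mu_km1 = M[-1]                             # approximates E[X^(k-1)]
    return u, mu_km1 * u - M

r, lam = 2.0, 1.5                              # Weibull shape and scale
Q = lambda u: lam * (-np.log(1.0 - u)) ** (1.0 / r)
u, phi = phi_from_quantile_nonneg(Q, k=r + 1.0)
target = -lam ** r * (1.0 - u) * np.log(1.0 - u)
print(np.max(np.abs(phi - target)))            # small discretization error
```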
4. Applications
In this section, we give an overview of some ME distributions for cumulative entropies, applying the results of Section 3. For some choices of Φ, the ME task has already been solved. In the following, we consider further choices of Φ, with a focus on those that lead to well-known distributions. With the ME principle, it is no problem to generate completely new distributions, but this is not the objective of this paper.
Table 3 displays an overview of several entropy generating functions and the corresponding ME distributions. The table is divided by the situation where mean and variance are fixed, by the distinction between the MaxEnt and the GMaxEnt task, and by the situation with the additional requirement Q(0) = 0. Moreover, while cases no. 1 to no. 4 require symmetry of the ME distribution, cases no. 5 to no. 12 allow for skewness of the ME distribution. The density and the quantile function of the standard normal distribution appear in the entries for the skewed normal case. We try to assign well-known terms to the cumulative entropies generated by the respective Φ; for the solutions of the GMaxEnt task (no. 9 and 10), such terms are not available. The second column refers to the appendix section where each case is discussed in detail.
Some of the results presented in Table 3 are already known from the literature. These are the solutions of no. 1 [24], no. 2 [34], no. 3 [34], no. 4 [23,45], and no. 11 [5]. The remaining cases state new results, which are discussed in the Appendix for all readers interested in flexible statistical distributions. The general finding can best be illustrated by solutions no. 1 and no. 2, where the ME distributions are the logistic and the Tukey distribution. Solving the ME task for the classical differential and the Havrda-Charvát (or Tsallis) entropy given fixed mean and variance results in the normal and the t- or r-distribution [46,47]. The difference is easy to explain: the cumulative entropy pulls the ME distribution as far towards a U-shaped distribution as the restrictions allow. This leads to distributions with heavier tails (logistic instead of normal, Tukey instead of t or r).
There are many entropy generating functions well known from physics that could also be considered in the context of cumulative entropies. It is easy to show that the results of Theorems 1 and 2 can be applied, e.g., to the generating functions of the Rényi [27], the Kaniadakis [48], or the Hanel-Thurner entropy [49], to mention only a few. Another comment concerns the concept of skewness. Some families of distributions have natural parameters of skewness. If the members of these families have closed expressions for the quantile function, Corollary 1 can be applied directly to derive the function Φ for the corresponding cumulative entropy (GMaxEnt task). This is the reason why we focus on the generalized Tukey distribution. It is worth noting that, again, a kind of non-symmetric cumulative Havrda-Charvát entropy appears as the solution (see no. 9). Other families of skewed distributions are defined by modifying a given symmetric distribution; the Fechner approach as well as the still more popular Azzalini approach proceed in this way. The Fechner approach introduces skewness by splitting the scale parameter for the positive and the negative halves of the underlying symmetric distribution. This leads to a corresponding splitting of the quantile function. Corollary 1 can again easily be applied to solve the GMaxEnt task as long as the quantile function is available in a manageable form. The solution for the normal distribution is given by no. 10. For the more popular Azzalini approach [50,51], this is not the case. Therefore, we omit the discussion of the GMaxEnt task for this concept of generating skewed distributions.
Table 3 only contains special choices of the entropy generating function Φ. The main question is how to determine Φ. The answer could be given axiomatically or empirically. The starting point for the axiomatic approach is a set of fundamental requirements with a plausible and generally accepted interpretation in the considered scientific discipline. Such axiomatizations are available for the differential and the Tsallis entropy; a recent publication on this topic is, e.g., [52]. In the context of cumulative entropies, we can go back to approaches in fuzzy set theory, where measures of indefiniteness have been axiomatized (see [19,20,53]). The axioms are directly applicable to cumulative entropies. (The discussion of alternative entropies, skewness, and axiomatic approaches is based on valuable comments of two anonymous referees.) We do not discuss the axiomatic approach further. Instead, in the next section, we focus on how to estimate the entropy generating function Φ empirically.
5. Estimating the Entropy Generating Function
Can we learn something from data about the entropy generating function Φ for which the data-generating distribution is an ME distribution under the constraints of given mean and variance? By Corollary 1, the entropy generating function is given by the partial mean function via

$$\Phi(u) = \mu u - M(u), \qquad 0 \le u \le 1.$$

Therefore, we can estimate the partial mean function to get an estimator for Φ. Let X₁, …, Xₙ be independent and identically distributed random variables, and let X₍₁₎ ≤ … ≤ X₍ₙ₎ denote the corresponding order statistics. For a fixed value u such that (i − 1)/n < u ≤ i/n, we consider an estimator of the form

$$\hat{\Phi}(u) = \bar{X}\,u - \frac{1}{n}\sum_{j=1}^{i} X_{(j)}$$

for the entropy generating function Φ. We demonstrate the usefulness of this estimator in the following examples.
The data set consists of the S&P 500 standardized daily logarithmic returns from 10-05-2012 to 10-04-2017. (The data are publicly available.)
We can see that, by estimating the entropy generating function Φ through the partial mean function, the density of the S&P 500 standardized daily logarithmic returns can be fitted quite well.
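Since the return series itself is not reproduced here, the following sketch (our function names; NumPy assumed) illustrates the estimator on simulated standardized logistic data. For a logistic distribution with mean 0 and variance 1, Corollary 1 gives Φ(u) = (√3/π)[−u ln u − (1 − u) ln(1 − u)], which the estimate should approach as n grows.

```python
import numpy as np

rng = np.random.default_rng(42)
# Logistic sample with mean 0 and variance 1 (scale sqrt(3)/pi).
x = rng.logistic(loc=0.0, scale=np.sqrt(3.0) / np.pi, size=5000)

def phi_hat(x, u):
    """Estimate Phi(u) = mu*u - M(u) from the order statistics."""
    xs = np.sort(x)
    i = int(np.floor(len(xs) * u))
    return xs.mean() * u - xs[:i].sum() / len(xs)

us = np.linspace(0.01, 0.99, 99)
est = np.array([phi_hat(x, v) for v in us])
theo = np.sqrt(3.0) / np.pi * (-us * np.log(us) - (1 - us) * np.log(1 - us))
print(f"max abs deviation: {np.max(np.abs(est - theo)):.4f}")
```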
In the following example, we consider a situation with non-negative support. We know from (14) that, for a non-negative random variable with Q(0) = 0 and fixed mean and fixed k-th power moment, the entropy generating function is given by

$$\Phi(u) = \mu_{k-1}\,u - M_{k-1}(u).$$

For this entropy generating function, Q is an ME quantile function. To get an estimator for Φ, it is only necessary to estimate the (k−1)-th power mean μ₍ₖ₋₁₎ and the partial (k−1)-th power mean function M₍ₖ₋₁₎. For a fixed value u (such that (i − 1)/n < u ≤ i/n), a natural estimator for the partial (k−1)-th power mean function is

$$\hat{M}_{k-1}(u) = \frac{1}{n}\sum_{j=1}^{i} X_{(j)}^{k-1}.$$

An estimator for the entropy generating function is then given by

$$\hat{\Phi}(u) = \hat{\mu}_{k-1}\,u - \hat{M}_{k-1}(u), \qquad \hat{\mu}_{k-1} = \frac{1}{n}\sum_{j=1}^{n} X_j^{k-1}.$$

We will show that this estimator works well for a real data set and the Weibull distribution. Therefore, we need the partial (k−1)-th power mean function for the Weibull distribution with shape parameter r and scale parameter λ. For this distribution, it holds that
$$M_{k-1}(u) = \lambda^{k-1}\,\Gamma\!\left(\frac{k-1}{r}+1\right) G\!\left(-\ln(1-u);\ \frac{k-1}{r}+1,\ 1\right)$$

for 0 ≤ u < 1, where G(·; a, b) denotes the distribution function of a gamma distribution with shape parameter a and scale parameter b. The corresponding entropy generating function, such that this Weibull distribution is ME under Q(0) = 0 and the constraints of fixed mean and fixed k-th power moment, is Φ(u) = μ₍ₖ₋₁₎u − M₍ₖ₋₁₎(u); k determines the shape parameter r by the relation r = k − 1. Let X be a random variable representing the duration in days between two explosions in the mines of a specific region. From [54], we get a dataset with the durations between 41 mine explosions.
We fix the value of k. This means that, for every potential ME distribution, condition (13) has to hold for the fixed mean and k-th power moment, and it implies the value r = k − 1 for the shape parameter of the Weibull distribution. In Figure 5, the estimated entropy generating function is compared with Φ for this Weibull distribution. The fit seems rather good in view of the relatively small sample size.
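Since the mine explosion data are not reproduced here, the following sketch (our names; NumPy assumed) illustrates the non-negative-support estimator on simulated Weibull data with shape r and the matching choice k = r + 1; the estimate should approach −λʳ(1 − u) ln(1 − u).

```python
import numpy as np

rng = np.random.default_rng(7)
r, lam = 2.0, 1.0
x = lam * rng.weibull(r, size=500)             # Weibull(shape r, scale lam)
k = r + 1.0                                    # relation r = k - 1

def phi_hat_nonneg(x, u, k):
    """Estimate Phi(u) = mu_{k-1}*u - M_{k-1}(u) from order statistics."""
    xs = np.sort(x) ** (k - 1.0)
    i = int(np.floor(len(xs) * u))
    return xs.mean() * u - xs[:i].sum() / len(xs)

us = np.linspace(0.02, 0.98, 49)
est = np.array([phi_hat_nonneg(x, v, k) for v in us])
theo = -lam ** r * (1 - us) * np.log(1 - us)   # residual Shannon form
print(f"max abs deviation: {np.max(np.abs(est - theo)):.4f}")
```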
Further work will be conducted to estimate the number of degrees of freedom or parameters of other flexible distributions by minimizing the distance between the easy-to-calculate empirical entropy generating function and the entropy generating function of the distribution from which we suppose the data could be generated. The advantage of this procedure could be that the empirical entropy generating function is rather smooth. Therefore, minimizing the distance between entropy generating functions could be more accurate than considering the distance between the empirical quantile function, a density estimator, or the empirical distribution function and the corresponding theoretical counterpart. However, this will be investigated in future research.
6. Conclusions
To be able to estimate and predict without using information that we do not possess, it is important to derive maximum entropy distributions. Maximizing Shannon's differential entropy under different moment constraints is a well-known task. Without any constraints, the differential entropy is maximized by a uniform distribution, representing the situation of no information. However, an extremely bimodal (= bipolar) distribution represents a situation of so-called contradictory information, since an event and its complement can happen with equal probability. In this situation, it is extremely hard to make a forecast, even harder than for a uniformly distributed random variable. Hence, this paper claims that contradictory information is even less useful than minimum or no information, as it increases uncertainty and provides a high chance of a wrong decision. Such a bipolar distribution is covered by maximizing a cumulative entropy instead of the differential entropy without any constraints. A cumulative entropy depends either on the distribution function (direct), on the survival function (residual), or on both (paired). Under the constraints of fixed mean and variance, maximizing the cumulative entropy transforms a distribution in the direction of a bipolar distribution as far as the constraints allow. For so-called cumulative paired entropies and the constraints that mean and variance are known, solving the maximization problem leads to symmetric ME distributions like the logistic and the Tukey distribution [21,34]. Further ME distributions were found for the cumulative paired Leik and Gini entropy [23,34,45].

There are two different principles for deriving maximum entropy distributions. The maximum entropy principle in the sense of [42,43] is the task of deriving an ME distribution for a given entropy and fixed constraints. The generalized maximum entropy approach formulated by [44] starts from a given ME distribution, for which the corresponding generating function of the cumulative entropy is derived. In this paper, we applied both approaches to the cumulative Φ entropy, which generalizes the cumulative paired entropy in several ways, and thus introduced the maximum cumulative entropy approach and the generalized maximum cumulative entropy approach. Moreover, we regarded situations with different constraints: first, a situation with arbitrary support and given mean and variance, and second, a situation with non-negative support and the additional constraint Q(0) = 0 for the ME quantile function. This was done because, in the literature, the ME task was mainly considered for lifetime distributions with the special property that the support is [0, ∞) and Q(0) = 0 holds. Under these additional constraints, we derived ME distributions for fixed mean and k-th power moment.

For the situation with arbitrary support and given mean and variance, we introduced the cumulative paired Mielke(r) entropy and derived the ME distributions; the results already known for the cumulative paired Leik and Gini entropy are included as special cases of r. Then, starting with a natural generalization of the derivative of the entropy generating function known from the logistic distribution, we immediately derived the generalized logistic distribution (GLO) as the ME distribution. Considering a linear combination of entropy generating functions led to new ME distributions with skewness properties. Here, we derived the skewed logistic distribution and the skewed Tukey distribution in line with [55].
Next, using the generalized maximum entropy approach, we derived an entropy generating function such that a pre-specified skewed distribution is an ME distribution; the generalized Tukey distribution served as an example. We then considered Fechner's proposal to obtain skewed distributions by defining different values of a scale parameter for the two halves of a distribution and again derived the corresponding entropy generating function, with the skewed normal distribution as an illustrative example. Then, we focused on the situation where the support of the ME distribution is restricted to [0, ∞), using the maximum cumulative entropy approach. Here, we derived the Weibull distribution as the ME distribution for the cumulative residual Shannon entropy, and the extended Burr XII distribution for the cumulative residual Havrda-Charvát entropy. Finally, we proposed an estimator for the cumulative entropy generating function that represents all the properties of the underlying ME data-generating distribution. This gives an alternative to the non-parametric estimation of density functions or distribution functions. The usefulness of this estimator was demonstrated for two real data sets.
Author Contributions
I.K. conceived the new maximum entropy concepts, investigated its properties, applied it to several distributions and wrote an initial version of the manuscript. M.D. contributed by mathematical and linguistic revision. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Acknowledgments
The authors would like to thank Paul van Staden for the hint to Hosking’s work about the generalized logistic distribution and three anonymous reviewers for their constructive criticism, which helped to improve the presentation of this paper significantly.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
In Appendix A.1, Appendix A.2, Appendix A.3, Appendix A.4 and Appendix A.5, the situation is considered where mean and variance are fixed and the support is arbitrary. In Appendix A.6 and Appendix A.7, the support of the ME distribution has to be non-negative, and the property Q(0) = 0 is required as an additional constraint. Furthermore, we use the maximum cumulative entropy approach in Appendix A.1, Appendix A.2 and Appendix A.3 as well as in Appendix A.6 and Appendix A.7; in Appendix A.4 and Appendix A.5, we use the generalized maximum cumulative entropy approach.
Figures and Tables
Table 1. Maximum entropy beta distributions with several parameter values of β.

| β | α maximizing CRE | max. CRE | α maximizing CE | max. CE |
|---|---|---|---|---|
| 0.01 | 0.006 | 0.3678 | 0.017 | 0.3677 |
| 0.20 | 0.111 | 0.3543 | 0.290 | 0.3408 |
| 0.50 | 0.259 | 0.3222 | 0.595 | 0.2970 |
| 1.00 | 0.482 | 0.2778 | 1.000 | 0.2500 |
| 2.00 | 0.905 | 0.2226 | 1.000 | 0.1869 |
Table 2. Cumulative residual and cumulative direct Shannon entropy for the beta distribution with several parameter values of α and β = 1.

| α | CRE | CE |
|---|---|---|
| 0.01 | 0.1836 | 0.0826 |
| 0.48 | 0.2779 | 0.2191 |
| 1.00 | 0.2500 | 0.2500 |
| 3.00 | 0.1464 | 0.1876 |
Table 3. Entropy generating functions with corresponding maximum entropy distributions.

| No. | App. | Entropy | ME distribution |
|---|---|---|---|
| *Fixed mean and variance, without Q(0) = 0* | | | |
| 1 | – | Shannon | logistic |
| 2 | – | Havrda-Charvát | Tukey |
| 3 | Appendix A.1 | Leik | bimodal |
| 4 | Appendix A.1 | Gini | uniform |
| 5 | Appendix A.1 | Mielke | symm. beta |
| 6 | Appendix A.2 | Havrda-Charvát like | general. logistic |
| 7 | Appendix A.3 | non-symm. Shannon | skewed logistic |
| 8 | Appendix A.3 | non-symm. Havrda-Charvát | skewed Tukey |
| 9 | Appendix A.4 | GMaxEnt | general. Tukey |
| 10 | Appendix A.5 | GMaxEnt | skewed normal |
| *Fixed mean and k-th moment, with Q(0) = 0* | | | |
| 11 | Appendix A.6 | Shannon | Weibull |
| 12 | Appendix A.7 | Havrda-Charvát | ext. Burr XII |
© 2020 by the authors.
Abstract
A distribution that maximizes an entropy can be found by applying two different principles. On the one hand, Jaynes (1957a,b) formulated the maximum entropy principle (MaxEnt) as the search for a distribution maximizing a given entropy under some given constraints. On the other hand, Kapur (1994) and Kesavan and Kapur (1989) introduced the generalized maximum entropy principle (GMaxEnt) as the derivation of an entropy for which a given distribution has the maximum entropy property under some given constraints. In this paper, both principles were considered for cumulative entropies. Such entropies depend either on the distribution function (direct), on the survival function (residual) or on both (paired). We incorporate cumulative direct, residual, and paired entropies in one approach called cumulative Φ entropy.
Details
1 Department of Statistics and Econometrics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Lange Gasse 20, 90403 Nürnberg, Germany
2 GfK SE, Nordwestring 101, 90419 Nürnberg, Germany;