1. Introduction
Even when we receive information under the same conditions, our point of view may greatly differ from others’. Therefore, if we want to analyze expert knowledge, such differences should be considered. Figure 1 shows a representation of this problem. Our point of view on certain information depends on our cognitive skills and external factors that might change our beliefs.
Expert knowledge elicitation (EKE) has the goal of producing, via elicitation, a probability distribution that represents the expert’s knowledge around a parameter of interest. For that purpose, we can adopt the Delphi method as an elicitation method. The latter is defined by Brown (1968) [1] as a technique based on the results of multiple rounds of questionnaires sent to a panel of experts, whose purpose is to reach a consensus on their opinion. Such a method is effective, as it allows a group of individuals to address a complex problem, and it could be implemented to obtain a single representation of experts’ beliefs through a probability distribution. However, this method proves difficult when the number of experts in the study increases considerably.
Finding the mean of each expert’s level of certainty using their personal distributions is another way to obtain a prior distribution of expert knowledge. Nevertheless, it could be erroneous, as shown in Figure 2, where, for instance, we observe two hypothetical experts’ prior distributions (red and black curves) representing their level of knowledge about a proportion, and the mean of the two experts’ levels (green curve), which does not represent the actual level of certainty of either expert. As a result, the entire complex elicitation work done for each expert is wasted. Hence, we believe that the probability distributions of the experts can be classified and their opinions represented using clusters of beliefs. Thus, Bayesian inference can be carried out in parallel by considering each cluster of priors, and a decision can be arrived at via experts’ criteria (Barrera-Causil et al., 2019) [2].
On the other hand, classifying probability distributions is an essential task in different areas. Clustering methods to classify word distribution histograms in information retrieval systems have been implemented successfully [3]. For instance, Henderson et al. (2015) [4] present three illustrations with airline route data, IP traffic data, and synthetic data sets to classify distributions. Therefore, functional data analysis (FDA) could be used for clustering distributions, since it is an extension of multivariate methods where observations are represented by curves in a function space [5]. Some tools of multivariate analysis have been extended to the functional context pointwise, considering the implementation of multivariate procedures over the real interval where these functions are defined. Thus, in many cases, the curves are discretized to implement statistical procedures.
Cluster analysis is one of multiple techniques that have been extended to FDA, and different methods are implemented to obtain partitions of curves within its framework. Some of those methods have been compared to determine their performance and make recommendations in several situations [6]. For instance, Abraham et al. [7] proposed a two-stage clustering procedure in which each observation is approximated by a B-spline in the first stage, and the functions are grouped using the k-means algorithm in the second stage. Gareth and Sugar (2003) [8] presented a model-based approach for clustering functional data. Their method was effective when the observations were sparse, irregularly spaced, or occurred at different time points for each subject. Serban and Wasserman (2005) [9] proposed a technique for nonparametrically estimating and clustering a large number of curves. In their method, the nearly flat curves are removed from the analysis, while the remaining curves are smoothed and finally grouped into clusters.
Other alternatives can also be found in the literature. For instance, Shubhankar and Bani (2006) [10] proposed a nonparametric Bayes wavelet model for clustering curves based on a mixture of Dirichlet processes. Song et al. (2007) [11] presented an FDA-based method to cluster time-dependent gene expression profiles. Chiou et al. (2007) [12] developed a Functional Clustering (FC) method (i.e., k-centres FC) for longitudinal data. Their approach accounts for both the means and the modes of variation differentials between clusters by predicting cluster membership with a reclassification step. Tarpey (2007) [13] applied the k-means algorithm for clustering curves under linear transformations of their regression coefficients. More recently, Goia et al. (2010) [14] used a functional clustering procedure to classify curves representing the maximum daily demand for heating measurements in a district heating system. Hébrail et al. (2010) [15] proposed an exploratory analysis algorithm for functional data. Their method involves finding k clusters in a set of functions and representing each cluster with a piecewise constant function, seeking simplicity in the construction of the clusters. Boullé (2012) [16] presented a novel method to analyze and summarize a collection of curves based on a piecewise constant density estimation in which the curves are partitioned into clusters. Furthermore, Secchi et al. (2012) [17] focused on the problem of clustering functional data indexed by the sites of a spatial finite lattice. Jacques and Preda (2013) [18] presented a model-based clustering algorithm for multivariate functional data based on multivariate functional principal components analysis. The references in Jacques and Preda (2013) [19] are of particular importance because they summarize the main contributions in the field of functional data clustering. Other clustering algorithms have been reported in the literature.
For instance, Ferreira and Hitchcock (2009) [6] compared four hierarchical clustering algorithms on functional data: single linkage, complete linkage, average linkage, and Ward’s method (these methods are implemented in R).
Although there is research relating to the clustering of functions, no study has considered functional clustering of experts’ beliefs. For example, Stefan et al. (2021) [20] studied the effect of interpersonal variation in elicited prior distributions on Bayesian inference. In their study, one of the six experts exhibited discrepant distributions. Thus, it would be ideal to have a method able to numerically address discrepancies among clusters of elicited prior distributions. Another important situation is when a researcher needs to make a decision based on information obtained from elicited priors. In all these cases, and according to the problem shown in Figure 2, differences between priors should be addressed and the estimation, either posterior or prior, must be done in parallel for each group of elicited priors. Thus, in this paper we propose a new method to deal with multiple elicited prior distributions. The method allows clustering distributions using FDA and the Hellinger distance (Simpson, 1987) [21]. The Hellinger distance makes it possible to quantify the similarity between two probability distributions and, we believe, it is a more appropriate metric for this task than the metrics currently used for functional data. An illustration of the place of the proposed method within the expert knowledge elicitation workflow is shown in Figure 3.
This proposal is motivated by our interest in offering a new tool for the analysis of prior curves from multiple experts when elicitation is used. In addition to offering an alternative to the problem posed in Figure 2, or to the complexity involved in applying the Delphi method with a considerable number of experts, this proposal can be implemented to detect atypical curves, or even to create clusters in fuzzy multicriteria decision-making problems (Kahraman, Onar, and Oztaysi, 2015) [22].
To test the efficiency of our method, we propose a hierarchical clustering technique for functional data and compare it, via statistical simulation, with functional k-means, Ward’s, and average linkage methods (these methods are implemented in R).
The paper is structured as follows: Section 2 introduces some theoretical approaches and details the proposed method for clustering density functions and/or functional data. Section 3 describes the simulation study and its results. Section 4 shows an example using real data sets to illustrate the use of the proposed method. Section 5 presents the main contributions of this paper and suggests topics for further research. Finally, Section 6 contains the supplementary material and three additional illustrations using different data sets (see Appendix A).
2. Definitions
2.1. Elicitation of Probability Distributions
An elicitation process involves extracting information about parameters of interest from the subjective experience of a person (expert) and expressing it as a probability distribution (prior distribution) (Barrera-Causil et al., 2019) [2]. Thus, for expert i, a real function (prior distribution) is elicited, and the heights of this function are calculated on a grid of m equally spaced points throughout the support of the distribution. This process is repeated for the n experts in order to discretize their functions.
2.2. Functional Data Analysis
In functional data, the i-th observation is a real function defined on a real interval whose endpoints are, respectively, the minimum and maximum of the support of the i-th expert’s distribution. Thus, each observed function is a point in a certain function space (Ramsay et al., 2009) [29]. For analysis purposes, we assume that the functional data have an inner product, or Hilbert space, structure; that is, they belong to a vector space of functions defined on a real interval that is complete and equipped with an inner product.
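As a concrete illustration of the inner product structure above, the following minimal Python sketch (with a hypothetical pair of discretized functions) approximates the L2 inner product of two curves by the trapezoidal rule on the discretization grid:

```python
import numpy as np

# Hypothetical discretized functions on a common grid over I = [0, 1].
t = np.linspace(0.0, 1.0, 201)   # grid t_1, ..., t_m
f = np.sin(np.pi * t)            # f(t)
g = np.ones_like(t)              # g(t) = 1

# L2 inner product <f, g> = integral of f(t) g(t) over I,
# approximated by the trapezoidal rule on the grid.
inner = np.trapz(f * g, t)

# Exact value: integral of sin(pi t) over [0, 1] is 2 / pi.
print(round(inner, 4))  # → 0.6366
```

This is the pointwise extension mentioned above: multivariate operations are carried out on the discretized curves, with the grid spacing accounting for the integral.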
Functional Clustering
There are different methods for clustering functional data, such as k-means, agnes with the agglomerative average method, and Ward’s method, among others (Ferreira and Hitchcock, 2009) [6]. However, k-means is used most frequently.
Table 1 describes three different methods implemented in this paper.
2.3. Proposed Method
Our proposed method for clustering density functions, based on the Hellinger distance, works as follows:
1. For each f_i, i = 1, …, n, discretize the curve to obtain its heights at a grid of m equidistant points t_1, …, t_m throughout the support of the function. (In this paper, we use 200 and 300 equally spaced points; the number of points depends on the available computational capacity. To apply this methodology, we recommend selecting equidistant points throughout the support of the distributions of interest in order to capture the shape of the curves. We found that increasing the number of points does not have a noticeable effect on the outcome of the algorithm.)
2. Compute the Hellinger distance for all possible pairs of these functions. For curves f_i and f_j, the distance is as follows:
H(f_i, f_j) = [ (1/2) ∫_I ( √f_i(t) − √f_j(t) )² dt ]^{1/2}, (1)
which is approximated over the discretization grid by
H(f_i, f_j) ≈ [ (1/2) Σ_{k=1}^{m} ( √f_i(t_k) − √f_j(t_k) )² Δt ]^{1/2}, (2)
where Δt is the spacing between consecutive grid points and f_i(t_k) and f_j(t_k) are the discretized heights.
3. Build a matrix D of distances between all curves using the proposed metric.
4. Use the hierarchical clustering function hclust in R with D as the distance matrix.
5. Obtain the corresponding dendrogram.
6. Specify the number of clusters and identify the members in each cluster.
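The six steps above can be sketched as follows (a Python illustration with hypothetical elicited priors; the paper itself uses hclust in R, and here SciPy’s linkage plays that role):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from scipy.stats import norm

# Step 1: discretize each density on a common grid of m points.
t = np.linspace(-10, 10, 300)
dt = t[1] - t[0]

# Six hypothetical "experts": two groups of Normal priors with
# slightly perturbed parameters.
params = [(0, 1), (0.3, 1.1), (-0.2, 0.9), (5, 1), (5.2, 1.2), (4.8, 0.95)]
F = np.array([norm.pdf(t, mu, sd) for mu, sd in params])

def hellinger(f, g, dt):
    # Step 2: H(f, g) = sqrt(0.5 * integral of (sqrt f - sqrt g)^2),
    # approximated by a Riemann sum on the grid.
    return np.sqrt(0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dt)

# Step 3: build the matrix of pairwise distances.
n = len(F)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = hellinger(F[i], F[j], dt)

# Steps 4-6: hierarchical clustering on the distance matrix and
# extraction of a fixed number of clusters.
Z = linkage(squareform(D), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # experts 1-3 and 4-6 fall in separate clusters
```

The dendrogram of step 5 can be drawn from `Z` with `scipy.cluster.hierarchy.dendrogram` if a visual inspection is needed.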
To analyze the performance of our proposed method, we compared it with other algorithms available in the literature under different schemes. Such algorithms included the three methods described in Table 1.
3. Simulation Study
For simulation purposes, density functions were used as functional data. The theoretical counterpart of these densities was estimated using kernel functions in R.
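As a sketch of this estimation step (with hypothetical simulated data; `scipy.stats.gaussian_kde` stands in here for the kernel routines used in R):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Recover a smooth density curve from simulated data with a Gaussian
# kernel estimator, then discretize it on a grid, as done for the
# functional data in the simulation study.
rng = np.random.default_rng(0)
sample = rng.normal(loc=3.0, scale=1.0, size=500)

kde = gaussian_kde(sample)      # kernel density estimate
t = np.linspace(-1, 7, 200)     # grid over the support
f_hat = kde(t)                  # discretized curve f(t_1), ..., f(t_m)

# Over a wide enough grid, the estimate integrates to approximately 1.
print(round(np.trapz(f_hat, t), 2))
```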
Overlapping distributions. Let X and Y be random variables with density functions f and g, respectively, both sharing the same support, and define z_γ(Z) as the γ-th percentile of a random variable Z. Then, we can say that f and g overlap if one of the following conditions is satisfied:
(a) z_γ(X) ≤ z_γ(Y) ∧ z_{1−γ}(X) ≥ z_γ(Y);
(b) z_γ(Y) ≤ z_γ(X) ∧ z_{1−γ}(Y) ≥ z_γ(X).
Overlapping clusters of distributions. A cluster is δ-overlapped with another one if at least a proportion δ of its distributions overlap with distributions of the other cluster. If two clusters of distributions are not overlapped, they are said to be separated.
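One way to operationalize the percentile-based overlap check above, assuming the interval-intersection reading of conditions (a) and (b) and using samples from the two distributions, is:

```python
import numpy as np

def central_interval(sample, gamma):
    # (gamma, 1 - gamma) percentile interval of a sample.
    return (np.percentile(sample, 100 * gamma),
            np.percentile(sample, 100 * (1 - gamma)))

def overlapped(x, y, gamma=0.05):
    # f and g are taken to overlap when their central percentile
    # intervals intersect (one reading of conditions (a) and (b)).
    lx, ux = central_interval(x, gamma)
    ly, uy = central_interval(y, gamma)
    return bool(lx <= uy and ly <= ux)

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 5000)
y_far = rng.normal(10, 1, 5000)   # well separated from x
y_near = rng.normal(1, 1, 5000)   # overlaps x

print(overlapped(x, y_far), overlapped(x, y_near))  # → False True
```

The δ-overlap of two clusters then follows by counting, for each curve in one cluster, whether it overlaps any curve in the other.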
Initially, the generated clusters contained a finite number of curves following a Normal distribution with pre-specified means and variances for each real group, but with a random perturbation of the parameters within clusters. Three clusters of curves were considered. All of the cases above were simulated considering two and three clusters of curves.
We also considered asymmetrical features to compare the behavior of the four methods under different scenarios and thus modified the previously described simulation process (clusters of Normal distributions) using Gamma and Beta distributions within clusters. A further major consideration in all cases was the inclusion of an atypical curve in each simulation scenario to evaluate the performance of the methods in the presence of atypical curves. To study the effectiveness of the methods under evaluation, we considered separated and overlapped clusters of distributions with a known overlapping rate. The average overlapping rate in each scenario is presented in Table 2.
In all the simulation scenarios, the existence of clusters of curves was ensured by testing the equality of their means. Figure 4 shows one of the different simulation scenarios used to compare the four methods. These scenarios were considered for two and three clusters, with an equal number of curves per cluster. In each simulation scenario, a total of 1000 replicates were run. To validate the clusters found at each replicate, we used validation routines implemented in R.
Given a set S of n elements and two partitions of S to be compared, X, a partition of S into r subsets, and Y, a partition of S into s subsets, define the following:
a: the number of pairs of elements in S that are in the same set in X and in the same set in Y.
b: the number of pairs of elements in S where both elements belong to different clusters in both partitions.
c: the number of pairs of elements in S where both elements belong to the same cluster in partition X but not in partition Y.
d: the number of pairs of elements in S that are in different sets in X and in the same set in Y.
So, a + b + c + d = n(n − 1)/2, the total number of pairs of elements in S.
The Rand index (R), the Jaccard coefficient (J), the Fowlkes–Mallows index (F.M), and the correct classification rate (CCR) are thus calculated as follows:
R = (a + b)/(a + b + c + d), J = a/(a + c + d), F.M = a/√((a + c)(a + d)),
and the CCR is the proportion of elements assigned to their correct cluster.
All these validation measures have a value between 0 and 1, where 0 indicates that no pair of points is shared by the two data clusters, and 1 means that the data clusters are exactly the same.
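The pair counts and indices above can be computed directly; for example (with hypothetical partitions X and Y of six curves):

```python
import numpy as np
from itertools import combinations

def pair_counts(x, y):
    # a: pairs together in both partitions; b: apart in both;
    # c: together in X only; d: together in Y only.
    a = b = c = d = 0
    for i, j in combinations(range(len(x)), 2):
        same_x, same_y = x[i] == x[j], y[i] == y[j]
        if same_x and same_y:
            a += 1
        elif not same_x and not same_y:
            b += 1
        elif same_x:
            c += 1
        else:
            d += 1
    return a, b, c, d

# Hypothetical partitions of n = 6 curves.
X = [1, 1, 1, 2, 2, 2]
Y = [1, 1, 2, 2, 2, 2]

a, b, c, d = pair_counts(X, Y)
rand    = (a + b) / (a + b + c + d)
jaccard = a / (a + c + d)
fm      = a / np.sqrt((a + c) * (a + d))
print(a, b, c, d, round(rand, 3))  # → 4 6 2 3 0.667
```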
Results
The main results of this simulation study are shown in Figure 5 and Figure 6 (R codes and data associated with this article can be found at
The assertion above is justified from two perspectives. First, when clusters are separated or overlapped or an atypical curve is present in simulated data from Normal or Gamma populations, there is evidence that our method performs better than the other ones. Second, when two clusters from a Gamma population are generated, the
Although the
Figure 6 illustrates the performance measures of the four methods compared in this study to obtain optimal partitions. Overall, the other methods studied here exhibit poor performance compared to our proposal. This result seems to be a promising topic for further exploration, especially for clusters generated from a Beta distribution, because this is a very flexible distribution that can take different shapes. Moreover, the closed support of the Beta distribution concentrates the curves, making the clusters more overlapped and the true differences between them more challenging to detect. Thus, our proposed method exhibits the best performance among the options considered here.
4. Illustration
This section shows the performance of our proposed method when faced with a real data set.
Computer’s Lifetime Elicitation
The purpose of this application is to segment the experts’ prior distributions regarding a desktop computer’s average operation time in months, from its purchase until its first failure. Those prior distributions were obtained in an elicitation process described in Barrera-Causil and Correa (2008) [32]. They carried out this elicitation process through an interview-survey administered to six experts; however, their study took into account only the person with the most exceptional expertise. In this study, we work with all the prior distributions, considering the possible existence of clusters.
Figure 7 shows the elicited prior distributions (plotted via the free-hand method) of the six experts, which resulted from the interview-survey conducted by Barrera-Causil and Correa (2008) [32]. We can observe that most of the distributions have a similar shape, but one of them (in green) is very different from the others. Table 3 shows the experts’ years of experience in the ICT area (expertise) and years in their current job. Expert 3 exhibits the highest level of expertise.
Applying our proposed clustering method, Figure 8 presents the cluster dendrogram of the six experts, where we can observe that the prior belief of Expert 3 is far from that of the other participants. Furthermore, note that, in Table 3, this expert exhibits the highest level of expertise. For that reason, we decided to analyze the prior distributions using two clusters: one comprising Experts 1, 2, 4, 5 and 6; and another one containing Expert 3 alone.
In this situation, the following questions arise: Is the average of all distributions the best representation of prior beliefs? Is it possible to reach a consensus among these experts? Regarding the last question, in some cases a consensus can be quickly reached, but in many others this process can be tedious or such agreement cannot be reached.
As for the first question, when priors are very different, the average of these distributions can result in an almost uniform distribution that spoils the complex process of elicitation. Thus, we recommend creating homogeneous groups of these prior distributions and analyzing them simultaneously. To analyze each group, we calculate the functional mean and approximate the average curve by a Gamma distribution in each cluster as follows:
1. The functional mean is obtained in each cluster.
2. 1000 observations are generated from the distribution that is proportional to the functional mean in each cluster.
3. This distribution is approximated by a Gamma distribution by applying the fitdistr function in R to the generated samples.
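These three steps can be sketched as follows (with a hypothetical cluster of Normal priors; `scipy.stats.gamma.fit` plays the role of R’s fitdistr here):

```python
import numpy as np
from scipy.stats import gamma, norm

# Hypothetical cluster of discretized prior densities over months.
t = np.linspace(0.01, 120, 300)
cluster = np.array([norm.pdf(t, 40, 10),
                    norm.pdf(t, 45, 12),
                    norm.pdf(t, 42, 9)])

# Step 1: functional mean of the cluster.
f_bar = cluster.mean(axis=0)

# Step 2: draw 1000 observations from the density proportional to f_bar.
p = f_bar / f_bar.sum()                 # normalize over the grid
rng = np.random.default_rng(7)
samples = rng.choice(t, size=1000, p=p)

# Step 3: fit a Gamma distribution to the generated samples.
shape, loc, scale = gamma.fit(samples, floc=0)
print(round(shape * scale))             # fitted mean, ~42 months here
```

The fitted shape and scale then summarize the cluster’s prior as a single Gamma distribution.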
Accordingly, for the cluster formed by Experts 1, 2, 4, 5, and 6, the fitted prior distribution is , with standard errors of the estimated parameters of 0.39011009 and 0.02798532, respectively. For Expert 3, the fitted distribution is , with standard errors of the estimated parameters of 2.04588737 and 0.06345412, respectively. (The Gamma distribution was chosen because it is typically used to model lifetime data (Shanker et al., 2016) [33]. A method based on goodness-of-fit metrics (e.g., the AIC) via a GAMLSS approach (see Rigby and Stasinopoulos, 2005 [34]) could have suggested distributions other than the Gamma as better fits. Thus, the selection of the distribution has a direct effect on estimates of location, scale, and shape and, in turn, on any subsequent inferential analysis.) Figure 9 shows the prior distributions, by cluster, of a computer’s average operation time in months (from its acquisition until its first failure), considering the functional mean in each group.
It is clear that there are two clusters of experts, with their distributions exhibiting differences in location and scale. As mentioned above, combining both distributions is indeed possible (e.g., by averaging); however, doing so does not reflect the level of expertise in each cluster (see Figure 2). Thus, each cluster should be analysed separately. Next, we illustrate a parallel analysis for each cluster. Table 4 shows the lifetimes of the desktop computers assessed by the experts in the study by Barrera-Causil and Correa (2008) [32]. By combining the elicited Gamma prior with an Exponential model for the failure times, it is possible to estimate 95% probability intervals for the average time to the first occurrence of a physical failure in a desktop computer (in months). Thus,
π(λ | x) ∝ L(x | λ) π(λ),
where x are the observed data and λ is the parameter of interest. For cluster 1, as the average operation time in months is θ = 1/λ and the elicited prior on θ is a Gamma distribution, the rate λ follows an inverted Gamma distribution. Then, the prior of λ can be expressed as
π(λ) = (β^α / Γ(α)) λ^{−(α+1)} exp(−β/λ), λ > 0.
Given that the likelihood of the Exponential distribution is
L(x | λ) = λ^n exp(−λ Σ_{i=1}^{n} x_i),
the posterior distribution is therefore
π(λ | x) ∝ λ^{n−α−1} exp(−λ Σ_{i=1}^{n} x_i − β/λ),
where n is the number of observed failure times. We know that, for future observations y, the predictive distribution is given by
p(y | x) = ∫_0^∞ λ exp(−λy) π(λ | x) dλ.
In order to obtain the predictive Bayesian PDF, 1000 samples from the posterior distribution were obtained. Thus, the Bayesian predictive density for cluster 1 can be expressed as
p(y | x) ≈ (1/1000) Σ_{j=1}^{1000} λ_j exp(−λ_j y),
where λ_1, …, λ_{1000} are the posterior samples. A 95% probability interval of the average time to the first failure of a desktop computer (in months) in cluster 1 can be calculated by plugging the 1000 samples from the posterior distribution into the previous equation. The interval in cluster 1 is therefore . By performing the same procedure for cluster 2, the resulting 95% probability interval is . These probability intervals can then be used by a panel of experts to make case-related decisions.
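Posterior samples of the rate and a 95% probability interval for the mean lifetime can be obtained along these lines. The Python sketch below uses the observed failure times from Table 4 but illustrative prior parameters (the fitted values from the elicitation are not reproduced here), ignores the censored observations for simplicity, and samples the non-conjugate posterior on a grid:

```python
import numpy as np

# Illustrative Gamma prior on the mean lifetime theta = 1/lambda
# (hypothetical values, not the fitted parameters from the paper).
alpha, beta = 8.0, 0.2
# Observed first-failure times in months (Table 4); the 55 censored
# machines are omitted in this simplified sketch.
x = np.array([14.07, 17.80, 19.43, 21.33, 24.60, 28.97, 29.63, 33.73,
              37.60, 37.67, 40.87, 52.40, 53.97, 60.57, 64.27, 65.43, 65.43])
n, s = len(x), x.sum()

# Unnormalized posterior of the rate on a grid:
# lambda^(n - alpha - 1) * exp(-lambda * s - beta / lambda)
lam = np.linspace(1e-4, 0.5, 20000)
log_post = (n - alpha - 1) * np.log(lam) - lam * s - beta / lam
w = np.exp(log_post - log_post.max())
w /= w.sum()

rng = np.random.default_rng(3)
draws = rng.choice(lam, size=1000, p=w)   # 1000 posterior samples of lambda

theta = 1.0 / draws                       # posterior of the mean lifetime
lo, hi = np.percentile(theta, [2.5, 97.5])
print(round(lo), round(hi))               # a 95% probability interval
```

Plugging the same 1000 draws into the predictive density above likewise approximates p(y | x) for a future failure time y.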
5. Discussion
In this paper, we proposed a simple method to segment expert knowledge that can be effectively applied to functional data or to problems with large volumes of data (data mining). To implement this method, we built, after a discretization process, a matrix of distances between curves using the Hellinger distance.
A simulation study considering different scenarios was presented. In that study, our proposed method performed better in almost all scenarios than the k-means and agglomerative nesting clustering algorithms for functional data (as implemented in R).
Along the same lines of this proposal, further research topics in this field include sensitivity analyses when different initial values are considered. Another interesting area, based on the results of the analyses above, is developing a robust clustering method to generate data partitions, namely a clustering method that performs well in different conditions. Additionally, other distributions (e.g., Poisson), samples, and cluster sizes should be considered in future simulation studies to further assess the performance of our method.
Finally, we believe our proposed method has implications for two specific areas: the combination of expert knowledge and machine learning (ML), and Bayesian inference. Supervised (e.g., classification) and unsupervised (e.g., clustering) ML methods have gained momentum in current research. For example, it has been shown that ML-based classification ensembles perform better than experts in segregating viable from non-viable embryos [35] and that (evolutionary) clustering algorithms enable characterising COVID-19-related textual tweets in order to assist institutions in making decisions [36]. In general, researchers pit ML-based analyses against human expert knowledge because they see no value in the latter or because they do not know how to integrate it into the analysis pipeline. Evidence, however, suggests that blending expert knowledge with ML analyses leads to better predictions [37,38,39,40]. We believe our proposal can contribute to this rather under-investigated area.
A challenge in Bayesian inference is to determine the effect of different prior distributions on a parameter to be estimated. In this regard, Stefan et al. (2021) [20] found that the variability in prior distributions from six experts did not affect the qualitative conclusions associated with the estimated Bayes factors. Their results thus suggest that although there can be quantitative differences between the elicited prior distributions, the overall qualification associated with those quantities is not altered. We believe, though, that if the elicited distributions show considerable fluctuation in terms of location, scale, and shape, those distributions need to be subjected to clustering in order to segregate levels of expertise (see Figure 2, Figure 7 and Figure 9). The method proposed herein can serve such a purpose.
6. Conclusions
Although the proposed clustering method is designed for data obtained from experts’ belief curves, we demonstrated that it can also be applied to data sets that do not correspond to distributions or curves (we show this in three supplementary examples). In conclusion, the proposed method offers encouraging applications with other types of data sets encountered in data mining problems.
Author Contributions
Conceptualisation and methodology, C.B.-C., J.C.C. and F.T.-A.; validation and formal analysis, C.B.-C., F.M.-R.; investigation, C.B.-C.; resources, C.B.-C., F.M.-R.; writing—original draft preparation, C.B.-C., F.M.-R. and A.Z.; writing—review and editing, C.B.-C., F.M.-R. and A.Z.; visualisation, C.B.-C.; supervision, J.C.C.; project administration, C.B.-C.; funding acquisition, C.B.-C. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The data presented in this study are openly available in
Acknowledgments
C.B.-C. thanks Instituto Tecnológico Metropolitano -ITM- for their support. The authors dedicate this work to the memory of prof. Francisco Torres-Avilés. Figure 1 was designed by Daniela Álvarez Lleras (
Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
CCR | Correct classification rate |
FC | Functional clustering |
FDA | Functional data analysis |
F.M | Fowlkes–Mallows |
ICT | Information and communications technology |
J | Jaccard coefficient |
R | Rand index |
Appendix A
This section presents the performance of the proposed method in different scientific fields. Note that this proposal was focused on elicited prior distributions, but it could be applied using other types of data sets.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figures and Tables
Figure 1. Illustration of two experts having different points of view in regards to a specific phenomenon.
Figure 2. Level of certainty of two hypothetical experts (red and black curves) and their mean (green curve). Note that, when the location parameters of the experts’ distributions are different, those of the resulting mean distribution (green curve) may be considerably different from the experts’ actual opinion.
Figure 3. Illustration of the expert knowledge elicitation workflow. E.K.E.(me) = expert knowledge elicitation from multiple experts; P.D.g = generation of prior distributions; P.D.c = clustering of prior distributions; C1,C2…Cn = cluster 1 to cluster n; P.iC1,P.iC2…P.iCn = prior or posterior inference for each cluster; D.M.(e) = decision making with a panel of experts. The long dash arrow from D.M.(e) to E.K.E.(me) indicates that workflow can be iterated if required. The goal of the workflow is to create clusters of experts and analyze the clusters separately by means of prior or posterior estimations. A final decision is reached via a panel of experts. Step IV is unique to our proposal in that it shows where the clustering of elicited prior distributions takes place in the EKE workflow.
Figure 4. Illustration of some assessed simulation schemes where three clusters of n=30 curves with a (a) separated, (b) overlapped, and (c) atypical structure are generated. Clusters of Normal(μ,σ2) and Gamma(α,β) distributions are presented in the first and second columns, respectively. In (c), atypical curves are shown in blue.
Figure 5. Performance indexes when two and three clusters of overlapped and separated (A) Normal(μ,σ2), (B) Gamma(α,β), and (C) Beta(α,β) distributions are generated.
Figure 6. Performance indexes when two and three clusters of (A) Normal(μ,σ2), (B) Gamma(α,β), and (C) Beta(α,β) distributions with one atypical curve are generated.
Figure 9. Prior distributions by cluster of experts. Expert 3 is represented by the dotted PDF and the remaining experts are represented by the solid PDF.
Methods for clustering functional data.

Method | Description |
---|---|
Functional k-means (kmeans.fd) | An extension of the traditional k-means clustering algorithm to functional data analysis. This method uses a special metric for functional data. |
Agglomerative hierarchical clustering, agnes (average method) | It computes agglomerative hierarchical clustering of the data set using the average method, where the distance between two clusters is the average of the dissimilarities between the points in one cluster and the points in the other cluster. A complete description of agglomerative nesting (agnes) can be found in Chapter 5 of Kaufman and Rousseeuw (1990) [30]. |
Agglomerative hierarchical clustering, agnes (Ward’s method) | It computes agglomerative hierarchical clustering of the data set using Ward’s method, where the agglomerative criterion is based on the optimal value of an objective function, which is usually the sum of squared errors. |
Average overlapping rate in all simulation scenarios.

Scenarios | 2 Clusters | 3 Clusters |
---|---|---|
Overlapped Normal curves | 0.953 | 0.961 |
Overlapped Gamma curves | 0.974 | 0.969 |
Overlapped Beta curves | 0.955 | 0.959 |
Separated Normal curves | 0.052 | 0.066 |
Separated Gamma curves | 0.078 | 0.071 |
Separated Beta curves | 0.064 | 0.067 |
Experts’ years of experience in the ICT area and years in their current job. Note, for example, that while E3 and E4 have the same number of years of expertise, E3 has more experience working with the computers assessed in this study.
E1 | E2 | E3 | E4 | E5 | E6 | |
---|---|---|---|---|---|---|
Expertise | 2.5 | 5 | 17 | 17 | 4 | 7 |
Current Job | 0.5 | 1.3 | 4 | 2 | 1 | 2.5 |
First failure times (in months) of 72 desktops. Out of the 72 desktop computers, 17 were reported to have failed at the time of the study. The remaining 55 desktops had not failed by the time the study was carried out (right-censored observations).

14.07 | 17.80 | 19.43 | 21.33 | 24.60 | 28.97 | 29.63 | 33.73 | 37.60 | 37.67 | 40.87 | 52.40 |
53.97 | 60.57 | 64.27 | 65.43 | 65.43 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Expert knowledge elicitation (EKE) aims at obtaining individual representations of experts’ beliefs and rendering them in the form of probability distributions or functions. In many cases the elicited distributions differ, and the challenge in Bayesian inference is then to find ways to reconcile discrepant elicited prior distributions. This paper proposes the parallel analysis of clusters of prior distributions through a hierarchical method for clustering distributions that can be readily extended to functional data. The proposed method consists of (i) transforming the infinite-dimensional problem into a finite-dimensional one, (ii) using the Hellinger distance to compute the distances between curves, and thus (iii) obtaining a hierarchical clustering structure. In a simulation study the proposed method was compared to k-means and agglomerative nesting algorithms, and the results showed that the proposed method outperformed those algorithms. Finally, the proposed method is illustrated through an EKE experiment and other functional data sets.
1 Grupo de Investigación Davinci, Facultad de Ciencias Exactas y Aplicadas, Instituto Tecnológico Metropolitano -ITM-, Medellín 050034, Colombia;
2 Escuela de Estadística, Facultad de Ciencias, Universidad Nacional de Colombia sede Medellín, Medellín 050034, Colombia;
3 Centre for Change and Complexity in Learning, The University of South Australia, Adelaide 5000, Australia;
4 Departamento de Matemática y Ciencia de la Computación, Universidad de Santiago de Chile, Santiago 9170020, Chile