Abstract: The European Union devotes considerable attention to the development of knowledge economies, and the main result of this concern is the Europe 2020 strategy. From this point of view, the present paper aims to provide a clearer picture of who's who in the European Union as far as the development of knowledge-based economies is concerned. To attain this objective, we first present the state of the art regarding knowledge economy development, through the values that each country registers for the 8 indicators of the Europe 2020 strategy objectives. Secondly, the paper presents the results of the quantitative research carried out with the help of Principal Component Analysis and of Cluster Analysis on the principal components previously found. The two principal components previously discovered and used for this Cluster Analysis are the Shame Factor and the Environmental Concern Factor, and they hold 94.43% of the nonredundant information from the 8 initial strategy objective indicators. We analysed the 27 European Union countries, plus Switzerland, Norway and Iceland, using the levels of the Europe 2020 indicators in 2010. This paper shows the full results of the Cluster Analysis, which gives us an overview of the European Union from the perspective of the development of knowledge economies. The results show three groups of countries and allow a summary of several features for each group. Furthermore, this research provides additional quantitative support for a previously discovered ranking of the European countries from the point of view of knowledge economies. The main advantage of this study is that it can raise the interest of researchers in knowledge management and comparative management, for it shows the kinds of countries that have so far best managed to achieve the status of a knowledge economy. It can also be relevant for anyone interested in a professional picture of what the main countries in Europe currently look like from the new economic perspective.
Keywords: Europe 2020 strategy, knowledge economy, European Union, cluster analysis
1. Introduction and background research
The present paper continues the recent research of Fucec (2012), which established a hierarchy of several European countries using the objectives of the Europe 2020 strategy as criteria. We have already argued how and why this strategy is important for the scientific and practical communities in Europe and throughout the world, so what this new research pursues is a finer-grained image of where each country stands in that hierarchy. In the previous research paper, the accent was placed on finding synthetic aggregators for the indicators of the Europe 2020 strategy objectives and on seeing what those indicators can tell us about several countries in Europe and about Romania in particular. This paper takes the research further in order to find out how, besides the hierarchy, we can analyse the countries from the point of view of Europe 2020.
As previously mentioned (Fucec, 2012), the Europe 2020 Strategy has 5 main objectives, which are expressed by 8 indicators: Employment Rate (%), Gross Expenditure on Research and Development (%), Greenhouse Gas Emissions (the base year 1990 is considered to have the value 100), Renewable Energy (%), Primary oil consumption (tonnes of oil equivalent), Early Leavers from Education (%), Tertiary Education Attainment (%), People at Risk of Poverty or Social Exclusion (%). These indicators have been analysed for the 27 EU member states, plus Switzerland, Norway and Iceland, for 2010. Two basic results of this previous analysis are used as a basis for the present paper: the two new indicators found and the resultant hierarchy of the European countries.
According to the author, the two new indicators are "the Irrelevance Factor, metaphorically called Shame Factor, with a low desirable value, and the Environmental Concern factor, with a high desirable value" (Fucec, 2012). The first principal component proved to be strongly negatively correlated with the primary oil consumption of the countries, which made it necessary to find a name expressing the opposite of what this indicator stands for. Assuming that large oil consumption is generally equivalent to development in a country, it was decided that this principal component should be named the Irrelevance Factor, because it is desirable for it to have a low value. The second principal component appeared to be strongly positively correlated with the Greenhouse Gas Emissions of a country, so the factor was named "Environmental Concern".
Based on an aggregated indicator which combined the two factors mentioned above, a hierarchy of the countries was made, "ranking them in terms of their evolution towards the stage of a knowledge-based economy. The ranking is based on a quantitative and statistical base and it refers particularly to knowledge-based economies" (Fucec, 2012). Moreover, the hierarchy is very relevant for positioning each country in the European landscape from the point of view of Europe 2020. For example, the first three countries in the ranking were Germany, France and the United Kingdom, followed by Italy, Romania and Poland. The leading countries were not much of a surprise, but it was somewhat unexpected to find Romania or Italy in such a good position. However, these are the countries that are "on the right track toward achieving the status of knowledge-based economies, according to the European strategy" (Fucec, 2012). The last countries in the ranking proved to be Iceland, Malta and Cyprus. In other words, these countries still have a lot of effort to make in order to catch up with the progress of the other countries as far as the Europe 2020 Strategy is concerned.
Furthermore, in order to point out similarities between countries, it would be even more helpful to go beyond ranking them and to determine small groups of countries with high resemblance inside each group and clear differences between groups. This is exactly where a Cluster Analysis comes in handy, according to Ruxanda (2002). Besides, using such a method may also allow us to anticipate the evolution of another country which has not yet been submitted to the analysis. Various cluster analyses have been used so far for territorial comparisons inside a country (Babucea, 2007) or even for corruption estimation studies (Iamandi and Voicu-Dorobantu, 2007).
2. Research methodology
The research methodology followed in order to obtain the groups of countries we seek, besides the ranking which we already have, has two major phases. Phase one is the Principal Component Analysis, a procedure described by Fucec (2012) and summarised in the introduction above. Phase two is the centre of the present research and consists of a Cluster Analysis based on the principal components identified and described in the previous phase of the research.
As explained by Ruxanda (2001), Cluster Analysis, as a research instrument, refers to the process of clustering a number of variables or objects, each with several features, into groups with two important characteristics: the groups are internally homogeneous and strongly heterogeneous with respect to one another. In other words, although these groups are not apparent at first, they become obvious after the cluster analysis. It is important to note that we do not know a priori the number of groups or clusters into which the initial objects will be divided, nor the criteria by which they will be grouped.
Anderberg (1973) says that "cluster analysis is a collective term covering a wide variety of techniques for delineating natural groups or clusters in data sets". This turns cluster analysis into a multicriteria optimization problem and a basic part of data mining procedures. Moreover, "applications of cluster analysis are found in virtually all professions" (Romesburg, 2004), but the technique is also widely used in scientific domains such as statistics, economics and data analysis. Due to its nature, cluster analysis involves trial and error, for it can often be necessary to preprocess the data, in order to acquire the desired properties.
The analysis does not involve a single specific algorithm. It starts from a case (also called a form or an object), which possesses one or more properties of interest and which is represented mathematically as a vector. The n features or properties of interest of a case are called attributes, and all the cases together form a set. A cluster is a subset of this initial set, and the classification generally follows two criteria: intraclass variability should be as small as possible (for the clusters to be homogeneous inside), and variability between clusters should be as large as possible (for the clusters to be different from each other). Subsequently, the question is that of evaluating the distance between the cases, the distance between clusters and the two types of variability (intraclass and between classes). This is what justifies the variety of approaches and the lack of a single classical algorithm: various valid ways are available to make the evaluations mentioned above, and the choice remains up to the researcher: "clustering is in the eye of the beholder" (Estivill-Castro, 2002), because no single method is the right one.
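The two criteria above can be illustrated with a minimal sketch. It assumes that variability is measured as a sum of squared Euclidean distances to a centroid, which is only one of several possible choices and is not necessarily the measure used in this study; the function names and the toy cases are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Mean of each coordinate across a group of cases."""
    return tuple(fmean(axis) for axis in zip(*points))


def squared_distance(p, q):
    """Squared Euclidean distance between two cases."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_between(clusters):
    """Return (within-cluster, between-cluster) sums of squared distances."""
    grand = centroid([p for cluster in clusters for p in cluster])
    within = 0.0
    between = 0.0
    for cluster in clusters:
        center = centroid(cluster)
        within += sum(squared_distance(p, center) for p in cluster)
        between += len(cluster) * squared_distance(center, grand)
    return within, between


# Hypothetical two-attribute cases, not data from the study.
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
print(within_between(tight))  # (4.0, 100.0)
print(within_between(loose))  # (100.0, 4.0)
```

Both partitions split the same four cases, but the first one has small intraclass variability and large variability between clusters, so under this measure it is the better grouping.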
For this research, we chose to perform a hierarchical cluster analysis, also called connectivity-based clustering. In this case, we start with a number of clusters equal to the number of cases that we want to group: each cluster consists of a single case. With each step of the algorithm, the two clusters with the smallest distance between them are united. The algorithm ends when we have one final cluster. The clusters sought can be found in the dendrogram, or horizontal hierarchical tree plot, which is a graphical representation of the clustering process. To calculate the distance between clusters and cases, we used the Manhattan or City Block distance (Krause, 1987), which is less affected by errors because it relies on the absolute value function, and which is calculated using the following formula:
$$d(X, Y) = \sum_{i=1}^{n} |x_i - y_i|, \quad (1)$$
where d(X,Y) is the Manhattan distance between X and Y, n is the number of space dimensions or features of our cases, and X and Y are vectors of the following form:
$$X = (x_1, x_2, \ldots, x_n); \quad Y = (y_1, y_2, \ldots, y_n).$$
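As a small illustration of the Manhattan distance, the following sketch computes it for two vectors. The function name and the sample vectors are hypothetical and are not values from the study.

```python
from typing import Sequence


def manhattan_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """City Block distance: the sum of absolute coordinate differences."""
    if len(x) != len(y):
        raise ValueError("both vectors must have the same number of features")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


# Hypothetical cases with n = 2 features each.
a = (1.0, 4.0)
b = (3.0, 1.0)
print(manhattan_distance(a, b))  # |1 - 3| + |4 - 1| = 5.0
```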
In our case, the cases submitted to the analysis are the aforementioned 30 countries, and their attributes or properties of interest for the research are the 8 indicators of the Europe 2020 strategy. For higher accuracy, we turn to the results of the previous research, which used PCA (Principal Component Analysis) for an informational synthesis of the 8 indicators. As mentioned earlier, this analysis obtained two indicators that retain approximately 95% of the information in the initial indicators. Therefore, the two attributes of the countries used to start the cluster analysis are the new synthetic indicators: the Shame Factor and the Environmental Concern Factor. The data has been retrieved from the Eurostat website (European Commission, 2012) and refers to the year 2010, for the 27 EU member states, plus Switzerland, Norway and Iceland. The current and previous papers used Statistica 8 for data processing. We performed a Hierarchical Joining Cluster Analysis, using raw data and calculating the Manhattan distance for clustering the cases (or rows).
3. Results and interpretation
After performing the analysis, we found several pieces of useful information that we will use for the discussion in the following sections of the paper, including: the Distance Matrix, the Amalgamation Schedule and the Dendrogram or Horizontal Hierarchical Tree Plot.
3.1 The distance matrix
The Distance Matrix is the first and the simplest result of the Cluster Analysis and it is closely related to the second result, the Amalgamation Schedule. The algorithm first calculates all the Manhattan distances between the countries (and puts them in the Distance Matrix), and only after arranging the distances in ascending order does it produce the Amalgamation Schedule. What the Distance Matrix alone shows is how far each country is from every other one, from the point of view of this analysis. A part of the Distance Matrix is shown in Figure 1. It was not possible to show the whole matrix, because it has 31 rows and 31 columns, so we only selected the first part of it, in order to be able to explain what this matrix shows.
For example, the Manhattan distance between Austria and Greece is 11, and between Malta and Germany it is 373. It is obvious that Austria and Greece are much more similar to one another than Malta and Germany are. It is more likely that Austria and Greece will form a cluster and very unlikely that Malta and Germany will. By looking at the whole matrix, we can estimate which countries are similar to one another and which countries present the greatest differences. If the whole matrix were shown here, it would be possible to see that the smallest distance between two countries is about 2, which is the distance between Austria and Norway. This is why these two countries form the first cluster, and this is how the Amalgamation Schedule begins.
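A minimal sketch of how such a distance matrix can be built and searched for its closest pair is given below. The case names and factor scores are hypothetical placeholders, not the values from Figure 1; only the upper triangle is stored, since the matrix is symmetric and its diagonal is zero.

```python
from itertools import combinations


def manhattan(x, y):
    """City Block distance between two cases."""
    return sum(abs(a - b) for a, b in zip(x, y))


def distance_matrix(cases):
    """Return {(name_a, name_b): distance} for every unordered pair of cases."""
    return {
        (a, b): manhattan(cases[a], cases[b])
        for a, b in combinations(cases, 2)
    }


# Hypothetical scores on two factors; NOT the data used in the study.
cases = {
    "A": (10.0, 5.0),
    "B": (11.0, 6.0),
    "C": (40.0, 2.0),
    "D": (42.0, 9.0),
}

matrix = distance_matrix(cases)
closest_pair = min(matrix, key=matrix.get)
print(closest_pair, matrix[closest_pair])  # ('A', 'B') 2.0
```

The pair with the smallest entry is the one that a hierarchical joining procedure would merge first.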
3.2 The Amalgamation Schedule
This is the result of the cluster analysis which illustrates exactly what steps were taken in order to obtain the tree diagram, otherwise known as a dendrogram. The tree diagram is the most relevant result and is discussed in detail below. A part of the Amalgamation Schedule is found in Figure 2.
As shown in Figure 2 and anticipated above, the two closest countries are Austria and Norway, because 2.215544 is the smallest Manhattan distance between any two countries submitted to the analysis. What this means is that Austria and Norway are as similar as possible from the viewpoint of the two indicators that show how the countries are doing as far as the Europe 2020 Strategy is concerned. The next distance is 2.701538, between Latvia and Lithuania. These two countries are the second closest pair among the 30 states analysed, so at this point we already have two clusters, composed of the countries that most resemble each other. The third distance shown in Figure 2 is interesting: 3.058608 is the Manhattan distance between the initially formed cluster (Austria and Norway) and Finland. A third cluster is now formed from these three countries. As seen in the figure, in the next two steps Switzerland and Greece join the cluster. And so on: this is how all the countries come together and give us 2 clusters (or even one final cluster) in the end, from the 30 initial clusters. With each step, the number of clusters decreases, because each country unites with or annexes itself to a previously constructed cluster, based on the smallest Manhattan distances between clusters. The following result of the analysis, and the most significant one, will show us how to divide the clusters so that the distance between clusters is high enough for us to say that the clusters are different in an obvious way.
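The step-by-step joining described above can be sketched as follows. The linkage rule is not stated in the text, so this sketch assumes single linkage (the distance between two clusters is the smallest distance between any of their members); the case names and scores are hypothetical, and the naive search is meant for illustration rather than efficiency.

```python
from itertools import combinations


def manhattan(x, y):
    """City Block distance between two cases."""
    return sum(abs(a - b) for a, b in zip(x, y))


def single_linkage(c1, c2, cases):
    """Assumed linkage rule: smallest distance between members of c1 and c2."""
    return min(manhattan(cases[a], cases[b]) for a in c1 for b in c2)


def amalgamation_schedule(cases):
    """Merge the two closest clusters until one remains; record every step."""
    clusters = [frozenset([name]) for name in cases]
    schedule = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], cases),
        )
        distance = single_linkage(c1, c2, cases)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        schedule.append((distance, sorted(c1 | c2)))
    return schedule


# Hypothetical scores on two factors; NOT the data used in the study.
cases = {
    "A": (10.0, 5.0),
    "B": (11.0, 6.0),
    "C": (40.0, 2.0),
    "D": (42.0, 9.0),
}

schedule = amalgamation_schedule(cases)
for distance, members in schedule:
    print(distance, members)
# 2.0 ['A', 'B']
# 9.0 ['C', 'D']
# 33.0 ['A', 'B', 'C', 'D']
```

Each printed row corresponds to one line of an amalgamation schedule: the distance at which the merge happens and the members of the newly formed cluster.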
3.3 The Horizontal Hierarchical Tree Plot
So far, we know what the Manhattan distances between all the countries are (from the Distance Matrix) and we also know which steps were followed so that all the countries were included in a cluster. The Horizontal Hierarchical Tree Plot, also called a dendrogram or, simply, a Tree Diagram, is a picture of all that we have explained above and it is shown in Figure 3.
By looking at the Tree Diagram, we can choose how many clusters we wish to define, based on the distance between clusters. In Figure 3, we believe that three clusters can be identified, as follows:
* Cluster 1: United Kingdom, Italy, France, Germany, Spain;
* Cluster 2: Poland, Iceland, Malta, Cyprus, Slovenia, Luxembourg, Portugal, Switzerland, Finland, Norway, Austria, Greece, Ireland, Denmark;
* Cluster 3: Netherlands, Romania, Czech Republic, Lithuania, Latvia, Estonia, Hungary, Slovakia, Bulgaria, Sweden and Belgium.
As mentioned in the description of the research methodology, these results are subject to the judgement of the researchers. We could have chosen to define only two clusters, because there is a striking difference between the countries included in cluster 1 and the remaining countries. But for the purpose of our analysis, we found it useful to identify the other two clusters among the remaining 25 countries.
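Choosing the number of clusters amounts to stopping the joining process early. A minimal sketch of this idea is given below, assuming the merges are available as (distance, case, case) records; the function name, the record layout and the sample merges are hypothetical and do not reproduce Figure 2 or Figure 3.

```python
def cut_into_clusters(names, merges, k):
    """Replay merges in ascending distance order until k clusters remain."""
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of the cluster.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    remaining = len(names)
    for _distance, a, b in sorted(merges):
        if remaining == k:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
            remaining -= 1

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical merge records: (distance, one case from each joined cluster).
names = ["A", "B", "C", "D", "E"]
merges = [(2.0, "A", "B"), (9.0, "C", "D"), (12.0, "D", "E"), (33.0, "A", "C")]

print(cut_into_clusters(names, merges, 3))  # [['A', 'B'], ['C', 'D'], ['E']]
print(cut_into_clusters(names, merges, 2))  # [['A', 'B'], ['C', 'D', 'E']]
```

Asking for two or three clusters from the same merge records gives nested partitions, which mirrors the choice discussed above between defining two clusters and defining three.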
As far as the features of the clusters are concerned, it is important to take a look at the values that the countries have registered for the two indicators used in the analysis. In the case of cluster 1, things are fairly clear: we have economically strong countries, which seem to keep the targets of the European strategy's objectives under control and have favourable values for the Shame factor and the Environmental Concern factor. Also, according to Fucec (2012), four out of these five countries (Spain excluded) occupy the top four positions in the ranking of the countries based on the principal components extracted from the 8 indicators of the Europe 2020 Strategy.
Cluster 2 holds 14 countries, as shown above. What these countries have in common is that, except for Poland, they are the last 13 countries in the previously mentioned ranking. In other words, the values they registered for the two indicators, the Shame factor and the Environmental Concern factor, were the least desirable in comparison to the other countries analysed. The situation of Poland is interesting, as is the situation of Spain in cluster 1. Poland is number 6 in the ranking, but now it appears in the worst cluster with regard to the European strategy's objectives. This could happen because several of the initial 8 indicators have very favourable values and others have very unfavourable ones, but since a principal component analysis was previously run, the situation of Poland and Spain requires further research. As defining features for the countries in this cluster, we can say that they have average employment rates, between 70% and 81.1% (except for Ireland - 65%, Greece - 64%, Malta - 60.1% and Poland - 64.60%), and also high values for greenhouse gas emissions, above 102 compared to the base year 1990, which has the value 100. These countries are the ones that need to make sustained efforts in order to become knowledge economies.
Finally, cluster 3 holds the remaining 11 countries. In the ranking, these countries were placed in the middle positions, from 7 to 17. Again, we have an interesting case here: Romania, number 5 in the ranking. However, this is not as surprising as the cases of Poland and Spain. Romania is definitely not a country one would place in a cluster with Germany and the United Kingdom, but its knowledge economy prospects are rather positive, since it is placed in the middle-developed cluster from the point of view of the European strategy.
The countries in cluster 3 appear to have interesting features: in the knowledge economies ranking they are placed in the middle, yet, except for Sweden (78.70%), they have low employment rates, of about 65%. What this tells us is that the employment rate is not a defining feature of whether an economy is based on knowledge or not. The influence of other factors or indicators can be much more important. The greenhouse gas emissions are mostly below 71, compared to the base year 1990 of value 100, with the exception of Belgium (92) and Sweden (91). In comparison to cluster 2, the countries in cluster 3 have a higher average of people at risk of poverty and a lower average primary oil consumption.
4. Conclusions
The conclusions of this study support and detail the findings of previous quantitative research in this field. We found that the order of the countries in the ranking of Fucec (2012) is not random, for the countries can be divided into three groups or clusters, each of them having several features. Two out of 30 cases have shown ambiguous results, so this is an aspect that calls for further investigation. Further research is therefore needed and should provide interesting information about Poland and Spain, which show no connection between their position in the ranking and the cluster they have been assigned to. Further research is also recommended in the case of Romania, in order to provide an adequate support base for the results of the present analysis and of the previous principal component analysis.
In conclusion, stating with quantitative precision where 30 countries of Europe stand regarding the attainment of the Europe 2020 strategy's objectives is a new approach, still in its beginning phase, but with one first step successfully completed. Based on the ranking of the countries and on this cluster analysis, a picture of Europe's knowledge economies looks as shown in Figure 4. The stars in the picture show an ideal knowledge economy, as illustrated by this analysis, and the direction in which all the countries should head.
References
Anderberg, M. R. (1973) Cluster Analysis for Applications, Academic Press, Michigan.
Babucea, A. G. (2007) "Utilizarea analizei cluster in comparatii teritoriale", Economic Annals of Constantin Brancusi University, [online], No 1/2007, pp 311-316, available from: http://www.utgiiu.ro/revista/ec/pdf/2007- 01/57 Babucea%20Ana-Gabriela.pdf. [accessed 12 February 2013].
Estivill-Castro, V. (2002) "Why so many clustering algorithms: a position paper", ACM SIGKDD Explorations Newsletter, Vol 4, Issue 1, June, pp 65-75.
European Commission (2012) Europe 2020 indicators - Headline Indicators, Bruxelles.
Fucec, A. A. (2012) "Is Romania a favourable environment for the development of knowledge-based organizations?", Review of International Comparative Management, Vol 13, Issue 5, December, pp 768 - 111.
Iamandi, I. and Voicu-Dorobantu, R. (2007) "Corupția - un risc pentru România în Uniunea Europeană", Economic Journal, [online] Year X, No 24, July, pp 15-27, available from: http://www.reiournal.eu/Portals/0/Arhiva/JE%2024/JE%2024%20Voicu-Dorobantu%20lamandi.pdf. [accessed 11 February 2013].
Krause, E. F. (1987) Taxicab Geometry, Addison-Wesley Publishing Company, Dover.
Romesburg, H. C. (2004) Cluster Analysis For Researchers, Lulu Press, North Carolina.
Ruxanda, G. (2001) Analiza Datelor, Editura ASE, Bucharest.
Ruxanda, G. (2002) "Recunoașterea formelor în domeniul economico-financiar", Studii și Cercetări de Calcul Economic și Cibernetică Economică, No. 2/2002, Year XXXVI.
Adela Anca Fucec and Corina Marinescu (Pirlogea)
The Bucharest University of Economic Studies, Bucharest, Romania
corina [email protected]
Adela Anca Fucec is a 2nd year PhD student at the Management Doctoral School of the Bucharest University of Economic Studies, Romania. The author's main focus of research is the knowledge economy and its effects at the micro- and macroeconomic level, especially from the point of view of quantitative and qualitative managerial efficiency.
Copyright Academic Conferences International Limited Oct 2013