Comparison of fuzzy clustering methods in economic freedom ranking in Asia-Pacific

Economic freedom can be defined as the freedom of individuals to carry out their economic activities without being exposed to pressures and constraints. Studies that classify countries according to their economic freedom aim to determine the place of each country in the world or on its continent. In this way, the status of countries with sustainable growth and high welfare is determined. This study aims to rank Asia-Pacific countries according to economic freedom data. In contrast to many classification and ranking studies, the present study attempts to determine the best ranking method by comparing multiple methods. Using the economic freedom data published every year by the Heritage Foundation, the status of Asia-Pacific countries between 2015 and 2019 was determined. The Fuzzy C-Means, Gath-Geva and Gustafson-Kessel methods, the three most commonly used methods, were used in the fuzzy clustering analysis. The results obtained from all fuzzy clustering methods were compared, year by year, with the results of the Heritage Foundation and interpreted. According to all analysis results, the Fuzzy C-Means method is the most successful for economic freedom data and classification studies. According to the Fuzzy C-Means method, the three best countries were Hong Kong, New Zealand and Australia, respectively.


INTRODUCTION
According to the Heritage Foundation, economic freedom is defined as the fundamental right of every person to control his or her labor and property. In an economically free society, individuals are free to work, produce, consume and invest as they wish, and labor, capital, and goods are allowed to move freely (URL-1).
The Economic Freedom Index documents the positive relationship between economic freedom and various social and economic goals. In this context, public health, environmental cleanliness, wealth per capita, human development, poverty eradication, and democracy are closely related to economic freedom. Economic freedom brings more prosperity to countries. Studies on measuring economic freedom have been carried out by the Heritage Foundation and The Wall Street Journal. The index of economic freedom was formed according to the definitions of Adam Smith. These indices are also considered indicators of sustainable growth and prosperity levels for countries. A separate calculation method is available for each independent index. The higher a country's score, the freer its economy is considered to be relative to other countries. With economic freedom, today's most important goals, such as healthier societies, cleaner environments, more wealth per capita, human development, democracy and poverty eradication, can be realized. The Index of Economic Freedom, published by the Heritage Foundation, scores countries on a 100-point scale. According to these scores, countries fall under the categories of "not free", "mostly not free", "partially free", "mostly free" and "free".
Economic freedom is necessary not only in relation to human dignity but also in order to accommodate the changing preferences of producers and consumers in response to market forces and to ensure economic growth. Increased economic freedom helps individuals act more comfortably in both production and consumption processes, and thus makes their economic decisions easier. As regards the link between economic freedom and economic growth, there are intense efforts to identify the key elements of economic freedom. In addition, studies are conducted to determine the relationship between the economic freedom of countries and indicators of their economic performance, such as income level and inequality.
Today, countries are classified as developed, developing and undeveloped according to their development levels. Economic development plays an important role in joining the developed countries. Economic development should be achieved through economic growth, by making radical changes in country policies and taking important steps. While economic growth is expressed by quantitative changes in production factors such as labor and capital, development includes qualitative changes in the country as well as quantitative ones. These qualitative changes include economic, social, cultural and political changes. Economic growth and development must be ensured for individuals and societies to reach more prosperity. In order to realize economic growth, countries should focus on issues that will raise their level of economic freedom. Economic freedom is therefore very important for economic growth and development.
In countries with low levels of economic freedom, the restrictions imposed by the state to solve possible problems and to keep the market under control are more pronounced. The controls and restrictions established by the state on the economy make economic activities in the market difficult. Furthermore, the power to control and restrain the market may be abused in line with certain interests. This situation undermines the confidence of individuals living in the country and reduces their desire for economic activity. The same is true for foreign investors. An investor who is searching for a new investment area in the international arena primarily considers the cost of the investment and profit maximization. Foreign investors will give up the idea of investing in a country where they cannot maximize profits due to economic constraints. Therefore, countries with low levels of economic freedom are not attractive to foreign investors.
In this study, Asia-Pacific countries were ranked using the economic freedom data of the Heritage Foundation. The Fuzzy C-Means, Gath-Geva and Gustafson-Kessel fuzzy clustering methods were compared for the years 2015-2019, and it was determined which method gave results closest to those of the Heritage Foundation.

Clustering analysis
Clustering analysis is a method that classifies the units examined in a study by grouping them according to their similarities, reveals the common features of the units and allows general definitions to be made about these classes. The aim is to classify ungrouped data into groups according to the similarities among the objects and to assist the researcher in obtaining useful, summary information (Balasko, Abonyi & Feil, 2005).
Clustering is a method of unsupervised learning, and a common technique for statistical data analysis used in many fields, including machine learning, data mining, pattern recognition, image analysis, bioinformatics, and marketing.
Fuzzy clustering analysis is an analysis technique based on fuzzy logic theory. It is an appropriate method when clusters are not distinctly separated from each other or when the cluster membership of some units is ambiguous. Fuzzy sets are defined by functions that assign each unit a value between 0 and 1, which is the membership degree of the unit in the set. Very similar units are placed in the same cluster with high membership degrees (Höppner, Klawonn, Rudolf & Runkler, 1999).
The structure of the clusters and the algorithm used determine which distance criterion will be used. Some of the convenient characteristics of fuzzy clustering are that it provides membership values that are easy to interpret, that it is flexible in the choice of distance and that, when some of the membership values are known, they can be combined with numerical optimization (Naes & Mevik, 1999). The biggest advantage of fuzzy clustering over crisp clustering methods is that it provides more detailed information on the data. On the other hand, there is too much output when there are many individuals and clusters, so it becomes difficult to summarize and classify the data. Moreover, fuzzy clustering algorithms are preferred if there is uncertainty in the data (Abonyi & Feil, 2007).

Fuzzy C-Means algorithm (FCM)
The Fuzzy C-Means algorithm forms the basis of all fuzzy clustering techniques that depend on an objective function. The algorithm was first developed by Dunn (1973). Bezdek (1974) then generalized the fuzzy objective function by introducing a weighting exponent. The final version of the algorithm recognizes spherical clusters of points in multidimensional space (Bezdek, 1981). The distance between objects and cluster centers is measured by the Euclidean distance given in Equation 1 (Höppner, Klawonn, Rudolf & Runkler, 1999):
$$d^2(x_j, v_k) = \lVert x_j - v_k \rVert^2 = (x_j - v_k)^T (x_j - v_k) \tag{1}$$
Here $x_j$ represents the position of the j-th observation in the coordinate system, and $v_k$ represents the center of the k-th cluster; the cluster centers are called prototypes. To put this technique into practice, the actual number of clusters and the membership degrees of the individuals must be known beforehand. In practice, however, it is difficult to know these parameters before the application, so they are found by trial and error or through techniques developed for this purpose. The objective function used for this clustering method is as follows:
$$J(u, v) = \sum_{j=1}^{n} \sum_{k=1}^{c} u_{jk}^{m} \, d^2(x_j, v_k) \tag{2}$$
This function is a weighted least squares function. The parameter n represents the number of observations, and c represents the number of clusters. $u_{jk}$ is the membership of $x_j$ in the k-th cluster, m is the fuzziness index, and the value $J(u, v)$ is a measure of the total weighted sum of squared errors. The objective function is subject to a constraint. According to the fuzzy logic principle, each data point belongs to each cluster with a membership value in the range $[0, 1]$, and the membership values of each data point over all clusters must sum to 1 (Ruspini, 1973):
$$u_{jk} \in [0, 1], \qquad \sum_{k=1}^{c} u_{jk} = 1, \quad j = 1, \dots, n$$
Given the number of clusters c, the fuzziness index m, the termination criterion $\varepsilon$ and the membership degrees matrix U, the FCM algorithm first generates the cluster prototypes at random. The prototypes are updated as in Equation 3:
$$v_k = \frac{\sum_{j=1}^{n} u_{jk}^{m} \, x_j}{\sum_{j=1}^{n} u_{jk}^{m}} \tag{3}$$
By means of these values, the membership degrees matrix is calculated as given in Equation 4 (Sintas, Cadenas & Martin, 1999):
$$u_{jk} = \left[ \sum_{l=1}^{c} \left( \frac{d(x_j, v_k)}{d(x_j, v_l)} \right)^{2/(m-1)} \right]^{-1} \tag{4}$$
The two updates are repeated until the change in U falls below $\varepsilon$.
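The iteration described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in the study: the function name `fcm`, the optional `init_centers` argument, the default values of `m`, `eps` and `max_iter`, and the small floor on distances (to avoid division by zero) are all choices made here for the example.

```python
import math
import random


def fcm(data, c, m=2.0, eps=1e-5, max_iter=100, init_centers=None, seed=0):
    """Fuzzy C-Means for a list of equal-length numeric tuples.

    Returns (centers, u), where u[j][k] is the membership of
    observation j in cluster k.
    """
    n, dim = len(data), len(data[0])
    if init_centers is None:
        # Random prototypes: c distinct observations drawn from the data.
        init_centers = random.Random(seed).sample(list(data), c)
    centers = [list(v) for v in init_centers]
    power = 2.0 / (m - 1.0)

    u = [[0.0] * c for _ in range(n)]
    for _ in range(max_iter):
        # Membership update from Euclidean distances to the prototypes.
        new_u = []
        for x in data:
            dist = [max(math.dist(x, v), 1e-12) for v in centers]
            new_u.append([
                1.0 / sum((dist[k] / dist[l]) ** power for l in range(c))
                for k in range(c)
            ])

        # Stop once no membership changes by eps or more.
        change = max(abs(new_u[j][k] - u[j][k])
                     for j in range(n) for k in range(c))
        u = new_u
        if change < eps:
            break

        # Prototype update: mean of the data weighted by u_jk ** m.
        centers = []
        for k in range(c):
            weights = [u[j][k] ** m for j in range(n)]
            total = sum(weights)
            centers.append([
                sum(w * x[d] for w, x in zip(weights, data)) / total
                for d in range(dim)
            ])

    return centers, u
```

Calling `fcm(data, c=2)` on a list of coordinate tuples returns the prototypes and a membership matrix whose rows each sum to 1; passing `init_centers` makes the run deterministic.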

Gustafson-Kessel algorithm (GK)
The Gustafson-Kessel algorithm is a fuzzy clustering algorithm developed to identify ellipsoidal clusters instead of spherical clusters, for which the Fuzzy C-Means method does not give good results. To this end, Gustafson & Kessel (1979) used the Mahalanobis distance instead of the Euclidean distance in the fuzzy clustering method. In this algorithm, in addition to the cluster centers, each cluster k has a symmetric and positive definite matrix $A_k$. This matrix induces the norm $\lVert x \rVert_{A_k}^2 = x^T A_k x$ for each cluster. Leaving these matrices unconstrained would allow the distances to become arbitrarily small. To prevent the objective function from being minimized by matrices whose entries are approximately zero, clusters of fixed volume are required, such that $\det(A_k) = 1$. Thus only the cluster shapes are variable; the cluster sizes are not (Gustafson & Kessel, 1979). The distance is then
$$d^2(x_j, v_k) = (x_j - v_k)^T A_k (x_j - v_k), \qquad A_k = \det(F_k)^{1/p} \, F_k^{-1}$$
$$F_k = \frac{\sum_{j=1}^{n} u_{jk}^{m} (x_j - v_k)(x_j - v_k)^T}{\sum_{j=1}^{n} u_{jk}^{m}}$$
Here $A_k$ is a symmetric and positive definite matrix, p is the dimension of the data, and $F_k$ is the fuzzy covariance matrix of the k-th cluster. The essence of the covariance matrix $F_k$ is that it provides information about the shape and orientation of the cluster: the ratios of the lengths of the ellipsoid axes are given by the square roots of its eigenvalues, and the directions of the axes by its eigenvectors. The Gustafson-Kessel algorithm can be used to detect clusters along linear subspaces of the data space. These clusters are represented by flat hyperellipsoids, which can be seen as hyperplanes. The eigenvector corresponding to the smallest eigenvalue determines the normal of the hyperplane and can be used to compute locally optimal linear models from the covariance matrix.
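As a rough illustration of the fixed-volume constraint, the sketch below computes the norm-inducing matrix and the resulting squared distances for a single cluster of two-dimensional data. It assumes the commonly cited form A = det(F)^(1/p) F^(-1) with p = 2, where F is the fuzzy covariance matrix; the function name `gk_distance_2d` and the restriction to two dimensions (so that the determinant and inverse can be written out by hand) are simplifications made here, not part of the study.

```python
def gk_distance_2d(data, memberships_k, center, m=2.0):
    """Squared Gustafson-Kessel distances of 2-D points to one cluster.

    memberships_k[j] is the membership of data[j] in that cluster.
    Returns ((axx, axy, ayy), distances), where the first item holds the
    entries of the symmetric matrix A.
    """
    weights = [u ** m for u in memberships_k]
    total = sum(weights)

    # Fuzzy covariance matrix F = [[fxx, fxy], [fxy, fyy]].
    fxx = fxy = fyy = 0.0
    for w, (x, y) in zip(weights, data):
        dx, dy = x - center[0], y - center[1]
        fxx += w * dx * dx
        fxy += w * dx * dy
        fyy += w * dy * dy
    fxx, fxy, fyy = fxx / total, fxy / total, fyy / total

    det_f = fxx * fyy - fxy * fxy
    if det_f <= 0.0:
        raise ValueError("fuzzy covariance matrix is singular")

    # A = det(F)^(1/2) * F^-1, which gives det(A) = 1 in two dimensions.
    scale = det_f ** 0.5
    axx = scale * fyy / det_f
    axy = -scale * fxy / det_f
    ayy = scale * fxx / det_f

    distances = []
    for x, y in data:
        dx, dy = x - center[0], y - center[1]
        distances.append(axx * dx * dx + 2.0 * axy * dx * dy + ayy * dy * dy)
    return (axx, axy, ayy), distances
```

For a cluster stretched along one axis, points at the ends of the long axis and points at the ends of the short axis come out at comparable distances from the center, which is the behavior that lets the method follow ellipsoidal clusters.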

Gath-Geva algorithm (GG)
The Gath-Geva algorithm is a more advanced version of the Gustafson-Kessel algorithm, which also takes into account the density and size of clusters. This approach is not based on optimizing an objective function. The Gath-Geva algorithm is a heuristic method based on the fuzzification of the maximum likelihood estimator. The main idea is to assume that the data points are realizations of p-dimensional normally distributed random variables (Gath & Geva, 1989).
The fuzzy maximum likelihood estimation distance function is given as follows:
$$d^2(x_j, v_k) = \frac{\sqrt{\det(F_k)}}{P_k} \exp\!\left( \tfrac{1}{2} (x_j - v_k)^T F_k^{-1} (x_j - v_k) \right), \qquad P_k = \frac{1}{n} \sum_{j=1}^{n} u_{jk}$$
where $F_k$ is the fuzzy covariance matrix and $P_k$ the prior probability of the k-th cluster. The distance function is chosen to be inversely proportional to the posterior probabilities: a small distance means a high probability of membership, and a large distance means a low probability (Oliveira & Pedrycz, 2007).
Unlike the Fuzzy C-Means algorithm and the Gustafson-Kessel algorithm, the Gath-Geva algorithm is based not on an objective function but on a fuzzification of statistical estimators. Its best feature is that, when the initial cluster centers are well chosen, it can yield accurate partitions for clusters of unequal shapes and densities. In addition, this algorithm successfully detects all clusters that the Fuzzy C-Means and Gustafson-Kessel algorithms can find. However, with its increased complexity, the Gath-Geva algorithm becomes more sensitive to local minima. Either the partitions produced by the Gath-Geva algorithm may be very different for different initializations of the prototypes, or floating-point overflows can easily occur due to the exponential function. Therefore, it is advisable to use a modified exponential function that provides linearly increasing values when the argument would cause an overflow.
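One possible reading of the modified exponential function is sketched below: the ordinary exponential is used up to a threshold and is continued along its tangent line beyond it. The threshold value `limit=500.0` and the names `safe_exp` and `gg_distance_sq` are assumptions made for this example; the distance helper assumes the commonly cited Gath-Geva form, sqrt(det F) / P multiplied by exp of half the squared Mahalanobis term.

```python
import math


def safe_exp(x, limit=500.0):
    """Return exp(x) up to `limit`, then continue linearly along the tangent.

    The threshold is an illustrative choice that stays well inside the
    range of double-precision floats.
    """
    if x <= limit:
        return math.exp(x)
    return math.exp(limit) * (1.0 + (x - limit))


def gg_distance_sq(mahalanobis_sq, det_f, prior):
    """Gath-Geva style distance built from a squared Mahalanobis term.

    det_f is the determinant of the fuzzy covariance matrix and prior is
    the prior probability of the cluster.
    """
    return math.sqrt(det_f) / prior * safe_exp(0.5 * mahalanobis_sq)
```

With this replacement, observations that lie very far from a prototype still receive a finite, monotonically increasing distance instead of triggering an overflow.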

Cluster validity indices
One of the main problems in clustering analysis is determining the optimal number of clusters. This problem arises in all clustering methods, such as crisp clustering, fuzzy clustering or soft set clustering. Cluster validity indices are used to address it.
If there is no prior knowledge about the number of classes, it is hard to make the right decision on the number of classes. Cluster validity indices measure the quality of the partition that was found and enable the optimal partition to be determined. Validity indices can therefore be used to search for the optimal number of clusters when it is not known in advance. In the literature, there are many validity indices for detecting the optimal number of clusters (about 10 indices have been studied for classical clustering, whereas more than 70 exist for fuzzy clustering, and researchers are still working on new ones). In this article, the artificial neural network based validity index introduced by Erilli, Yolcu, Egrioglu, Aladag & Oner (2011) is used.
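To make the idea of a validity index concrete, the sketch below computes Bezdek's partition coefficient, one of the simplest classical fuzzy validity indices. It is shown only as an illustration and is not the neural network based index used in this study; the function name is chosen here for the example.

```python
def partition_coefficient(memberships):
    """Bezdek's partition coefficient of a fuzzy membership matrix.

    memberships[j][k] is the membership of observation j in cluster k.
    The value lies between 1/c (completely fuzzy) and 1 (crisp partition).
    """
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n
```

In a typical use, the clustering is run for several candidate numbers of clusters and the partition with the most favorable index value is retained.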

Index calculation
Index calculations follow the procedure given in Erilli (2018). Index values were calculated using the data published on the Heritage Foundation website. These variables are based on 12 quantitative and qualitative factors, grouped into four broad categories of economic freedom:
1. Rule of Law (property rights, government integrity, judicial effectiveness)
2. Government Size (government spending, tax burden, fiscal health)
3. Regulatory Efficiency (business freedom, labor freedom, monetary freedom)
4. Open Markets (trade freedom, investment freedom, financial freedom)
The proposed index is calculated with three different fuzzy clustering analyses. The suggested steps are as follows:
i. The variables forming each index value (they must be in the same cluster) are clustered by all three fuzzy clustering analysis methods.
ii. After applying all three fuzzy clustering analysis methods, the cluster membership degrees of each observation are calculated for each method separately (cluster membership degrees lie between 0 and 1 for each observation).
iii. The cluster membership degrees obtained by each fuzzy clustering method are sorted from small to large within the clusters. This gives the ranking of the countries' freedom in the relevant year.
These operations are carried out separately for each year, and the continent ranking is determined for the relevant periods. In this study, the method that gives the best percentage is selected by comparing the rankings obtained in this way with the Heritage Foundation rankings through their correlation values.
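The sorting step can be sketched as follows. This is only one reading of the procedure: each country is assigned to the cluster in which its membership degree is highest, and countries are then ordered by that degree within each cluster. The order in which the clusters themselves are arranged (`cluster_order`) and the sort direction (`descending`) are left as parameters, since they are assumptions of this example, and the country labels used with it would be placeholders rather than results.

```python
def rank_by_membership(names, memberships, cluster_order, descending=True):
    """Order countries cluster by cluster, then by membership degree.

    memberships[j][k] is the membership of names[j] in cluster k, and
    cluster_order must list every cluster index, best cluster first.
    """
    groups = {k: [] for k in cluster_order}
    for name, row in zip(names, memberships):
        # Assign the country to its highest-membership cluster.
        best = max(range(len(row)), key=row.__getitem__)
        groups[best].append((row[best], name))

    ranking = []
    for k in cluster_order:
        ordered = sorted(groups[k], key=lambda item: item[0],
                         reverse=descending)
        ranking.extend(name for _, name in ordered)
    return ranking
```

Running the function once per year and per clustering method yields the yearly rankings that are later compared with the Heritage Foundation ranking.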

RESULTS AND DISCUSSION
In the application part, the economic freedom data published by the Heritage Foundation were analyzed separately for each year from 2015 to 2019 with the methods described above. First, the countries were clustered by fuzzy clustering analysis for each year. The countries, clustered separately by each method for each year, are listed within each cluster from big to small according to their cluster membership degrees. With each of the three methods, the countries were ranked according to their economic freedom for each year, and the results were compared with the Heritage results. MATLAB R2007b was used in all applications. The significance level was taken as 0.05 for all analyses. The numbers of clusters found by each method for each year are given in Table 1. As seen in Table 1, all methods find the same number of clusters only in 2015. This can be explained by the fact that the slightest change in the data changes the level of fuzziness.
For the comparison, correlation coefficients between rankings were examined for each year. The best method was determined according to the rank correlation values for the first 10 countries and for all countries. Table 2 shows, for each year's data, the correlation values with the Heritage ranking for all countries and for the top 10 countries only.
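A rank correlation between two rankings can be computed as in the sketch below. It assumes Spearman's coefficient for rankings without ties; the text does not name the coefficient that was used, so this choice, like the helper names, is an assumption of the example.

```python
def ranks_from_order(order, reference):
    """Rank (1 = first) of each item of `reference` within the list `order`."""
    position = {name: i + 1 for i, name in enumerate(order)}
    return [position[name] for name in reference]


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation of two equally long rankings without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

Restricting both rankings to the same ten countries before calling `spearman_rho` gives a top-10 comparison in the same way.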
When the results in Table 2 are examined, it is seen that the FCM method has the highest correlation values with the Heritage scores. In addition, the most statistically significant correlation values belong to FCM.
The top 10 country rankings according to the three fuzzy clustering methods and the Heritage Foundation are given in the appendix for each year. According to the Heritage results, Hong Kong was at the top of the table every year. According to the FCM results, it was in first place for 3 years, and according to the other two methods, it was never in first place. In contrast to the Heritage ranking, the results of the three methods included 25 different countries in the top 10. In the Heritage scores, this number is only 12. Table 3 gives the correlation values of the calculated rankings for the five years. As the table shows, FCM is the best method, with the highest correlations. The membership coefficients differed from cluster to cluster and allowed 10 different countries to be ranked in the top 5. Thus, more of the variability arising from the data was included in the analysis. It can be argued that fuzzy clustering analysis, in which the rankings of the countries change almost every year, reflects annual performance more distinctly.
The most interesting result is that Australia ranked first for all 3 methods according to the results calculated for 2015 and 2016. It is thought that the fuzzy methods found Australia to be the best country because the growth figures of the Australian economy in 2015 and 2016 were better than those of its competitors Hong Kong, New Zealand, Taiwan and Singapore, and because values such as inflation and unemployment were lower. It is also seen that some countries are in the top ranks in some years and in the bottom ranks in others. For example, Samoa for 2015, Fiji for 2017 and Tonga for 2019 are in the top places according to the fuzzy methods but in the lower places according to the Heritage results.
According to the Heritage Foundation 2019 results, 4 of the top 5 countries are Asian countries. The most important reason for this is that financial freedom, trade freedom, investment freedom, and business freedom are better in Asian countries than in the rest of the world. The fact that all these values are higher in Asian countries increases the fuzziness of the data. Therefore, it is considered that fuzzy methods will give more successful results in cases of high uncertainty.

Conclusion
In this study, the 3 most commonly used methods in fuzzy clustering analysis were used to rank the Asia-Pacific countries by economic freedom for the years 2015-2019. According to the results of the analysis, the FCM method was determined to be the most successful method.
In the study, clustering analysis was first performed with all methods. Within each cluster, the countries were ranked year by year according to their cluster membership degrees. Then, the correlation values between the Heritage rankings and the fuzzy clustering rankings were examined. The highest correlation value for the year 2015, 0.80, was obtained with the FCM method. When only the top 10 countries are considered, this value increases to 0.88. In general, the ranking results obtained by FCM agree with the Heritage results by 73% for the whole continent and by 83% for the first 10 rankings.
In particular, the fact that the number of observations and the number of variables are close to each other, and that many of the variables have very close values across countries, increases the uncertainty in the results obtained. In such cases, fuzzy methods are proposed instead of classical methods. In this study, fuzzy clustering methods were compared in order to determine which gives the best results. According to the results obtained in the study, economic freedom rankings made with fuzzy clustering methods are successful.
The principles of economic freedom are a sure guide, but only a guide. What truly will matter are the creative solutions to pressing world problems that are certain to flow from people all over the world.

Recommendations
In classification and grouping studies, the structure of the data being analyzed has a direct effect on the results. Comparing the results obtained with different methods may make it easier to determine the appropriate analysis method. In future studies, economic freedom rankings can be produced using soft set, rough set, grey set or near set clustering methods instead of fuzzy clustering. In this way, more method comparisons will be made, which can help to find the best method.