Abstract:
External validity measures in cluster analysis evaluate how well clustering results match prior knowledge about the data. However, such prior knowledge is usually unavailable in practical unsupervised learning problems such as cluster analysis. In this paper, we extend external validity measures for both hard and soft partitions by means of a resampling method that requires no prior information. To reduce the computational cost of resampling, we incorporate two approaches into the proposed method: (i) an extension of external validity measures to soft partitions that runs in O(M^2 N) time; (ii) an efficient sub-sampling method with O(N) time complexity. The proposed method is then applied to, and evaluated on, the problem of determining the number of clusters in cluster analysis. Experimental results demonstrate that the proposed method is very effective in determining the number of clusters.
Date of Conference: 22-24 November 2011
Date Added to IEEE Xplore: 02 January 2012