Abstract
Although clustering is an unsupervised learning approach, most clustering algorithms require several parameters (such as the number of clusters, a minimum density, or a distance threshold) to be set in advance in order to work properly. In this paper, we eliminate the need to set the minimum cluster size parameter of the Randomized Gravitational Clustering algorithm proposed by Gomez et al. The minimum cluster size is estimated with a heuristic that takes into consideration the functional relation between a given number of points and the number of clusters containing at least that many points. Then, a data point’s region of action (the region of the space assigned to the point) is defined, and a cluster refinement process is proposed to merge overlapping clusters. Our experimental results show that the proposed algorithm is able to deal with noise while finding an appropriate number of clusters, without requiring the minimum cluster size to be set manually.
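The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the heuristic or the exact definition of a region of action, so the code assumes (hypothetically) that a region of action is a ball of fixed `radius` around each point and that two clusters overlap when any two of their balls intersect. The function names `size_profile` and `merge_overlapping` are illustrative only.

```python
from collections import Counter


def clusters_with_at_least(labels, m):
    """Count the clusters that contain at least m points."""
    sizes = Counter(labels)
    return sum(1 for size in sizes.values() if size >= m)


def size_profile(labels):
    """Map each threshold m to the number of clusters with >= m points.

    This is the kind of functional relation a minimum-size heuristic
    could inspect; how the paper picks a threshold from it is not
    stated in the abstract, so no choice is made here.
    """
    largest = max(Counter(labels).values())
    return {m: clusters_with_at_least(labels, m) for m in range(1, largest + 1)}


def merge_overlapping(points, labels, radius):
    """Merge clusters whose points' balls of the given radius intersect.

    Assumption: a region of action is a ball of fixed radius, so two
    regions overlap when their centers are closer than 2 * radius.
    Clusters are merged with a small union-find structure.
    """
    parent = {c: c for c in set(labels)}

    def find(c):
        # Follow parent links to the representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    limit_sq = (2 * radius) ** 2
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                continue
            dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if dist_sq < limit_sq:
                ra, rb = find(labels[i]), find(labels[j])
                if ra != rb:
                    # Keep the smaller label as the representative.
                    parent[max(ra, rb)] = min(ra, rb)
    return [find(c) for c in labels]
```

For example, with labels `[0, 0, 0, 1, 1, 2]` the profile reports three clusters with at least one point, two with at least two, and one with at least three. The pairwise loop is quadratic in the number of points and is meant only to make the idea concrete.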
References
Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press (1981)
Cormen, T., Leiserson, C., Rivest, R.: Introduction to Algorithms. McGraw-Hill (1990)
Ester, M., Kriegel, H., Sander, J., Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: 2nd Intl. Conf. on Knowledge Discovery and Data Mining (KDD 1996), pp. 226–231. AAAI (1996)
Gomez, J., Dasgupta, D., Nasraoui, O.: A New Gravitational Clustering Algorithm. In: 3rd SIAM Intl. Conf. on Data Mining (SDM 2003), vol. 3, pp. 83–94. Society for Industrial and Applied Mathematics (2003)
Gomez, J., Nasraoui, O., Leon, E.: RAIN – Data Clustering Using Randomized Interactions between Data Points. In: 3rd Intl. Conf. on Machine Learning and Applications (ICMLA 2004), pp. 250–255 (2004)
Han, J., Kamber, M.: Data Mining – Concepts and Techniques. Morgan Kaufmann (2000)
Jain, A.K.: Data Clustering – 50 Years Beyond K-Means. Pattern Recognition Letters 31(8), 651–666 (2010)
Karypis, G., Han, E., Kumar, V.: CHAMELEON – A Hierarchical Clustering Algorithm Using Dynamic Modeling. IEEE Computer 32(8), 68–75 (1999)
Kundu, S.: Gravitational Clustering – A New Approach Based on the Spatial Distribution of the Points. Pattern Recognition 32(7), 1149–1160 (1999)
Leon, E., Nasraoui, O., Gomez, J.: A Scalable Evolutionary Clustering Algorithm with Self-Adaptive Genetic Operators. In: 2010 IEEE Congress on Evolutionary Computation (CEC 2010), pp. 4010–4017. IEEE (2010)
MacQueen, J.: Some Methods for Classification and Analysis of Multivariate Observations. In: 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297. University of California (1967)
Nasraoui, O., Krishnapuram, R.: A Novel Approach to Unsupervised Robust Clustering Using Genetic Niching. In: 9th IEEE Intl. Conf. on Fuzzy Systems (FUZZ IEEE 2000), vol. 1, pp. 170–175 (2000)
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. John Wiley & Sons (1987)
Wright, W.E.: Gravitational Clustering. Pattern Recognition 9(3), 151–166 (1977)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gomez, J., León, E., Nasraoui, O. (2012). Minimum Cluster Size Estimation and Cluster Refinement for the Randomized Gravitational Clustering Algorithm. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_6
DOI: https://doi.org/10.1007/978-3-642-34654-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34653-8
Online ISBN: 978-3-642-34654-5
eBook Packages: Computer Science (R0)