Abstract
This paper presents a genetic programming (GP) approach to data clustering. The aim is to accurately assign a set of input data to their genuine clusters. The idea is to discover a mathematical function that captures clustering regularities and then to use that rule to decide correctly which entities belong to each cluster. To this end, GP is incorporated into the clustering procedure. Each individual is represented by a parse tree over the program set. The fitness function evaluates the quality of a clustering with regard to similarity criteria. Crossover exchanges sub-trees between parental candidates in a positionally independent fashion. Mutation replaces part of an individual with a new sub-tree, with low probability. The variation operators (i.e., crossover and mutation) provide an effective search capability that improves solution quality and speeds up convergence. Experimental results demonstrate that the proposed approach outperforms a well-known reference method.
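The operators named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the function set, the terminals, the decision rule, or the exact similarity criterion, so `FUNCTIONS`, `TERMINALS`, the sign-based two-cluster rule in `assign_cluster`, and the within-cluster squared-distance `fitness` are all assumptions chosen for illustration.

```python
import random

# Hypothetical primitive sets; the abstract does not specify the paper's own.
FUNCTIONS = {"add": lambda a, b: a + b,
             "sub": lambda a, b: a - b,
             "mul": lambda a, b: a * b}
TERMINALS = ["x0", "x1", 1.0]


def random_tree(rng, depth):
    """Grow a random parse tree: (op, left, right) or a terminal."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, point):
    """Evaluate a tree on one data point such as {"x0": 1.0, "x1": 2.0}."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, point), evaluate(right, point))
    if isinstance(tree, str):
        return point[tree]
    return tree


def paths(tree, prefix=()):
    """List the path to every node; a path is a tuple of child indices."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i in (1, 2):
            found.extend(paths(tree[i], prefix + (i,)))
    return found


def get_subtree(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(children)


def crossover(rng, a, b):
    """Swap subtrees; the cut points in a and b are chosen independently."""
    pa, pb = rng.choice(paths(a)), rng.choice(paths(b))
    sa, sb = get_subtree(a, pa), get_subtree(b, pb)
    return replace_subtree(a, pa, sb), replace_subtree(b, pb, sa)


def mutate(rng, tree, rate=0.05, depth=2):
    """With low probability, replace one random subtree by a new one."""
    if rng.random() >= rate:
        return tree
    return replace_subtree(tree, rng.choice(paths(tree)), random_tree(rng, depth))


def assign_cluster(tree, point):
    """Assumed rule: the sign of the evolved function picks one of two clusters."""
    return 0 if evaluate(tree, point) < 0 else 1


def fitness(tree, points):
    """Assumed criterion: negative within-cluster sum of squared distances."""
    total = 0.0
    for label in (0, 1):
        members = [p for p in points if assign_cluster(tree, p) == label]
        if not members:
            continue
        for key in ("x0", "x1"):
            mean = sum(p[key] for p in members) / len(members)
            total += sum((p[key] - mean) ** 2 for p in members)
    return -total
```

Trees are immutable nested tuples, so crossover and mutation return new individuals and leave the parents unchanged, which keeps a generational loop simple to write. A full GP run would wrap these functions in selection over a population, which the abstract does not describe and which is therefore omitted here.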
This research was supported by the MKE, Korea under the ITRC NIPA-2011-(C1090-1121-0008).
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahn, C.W., Oh, S., Oh, M. (2011). A Genetic Programming Approach to Data Clustering. In: Kim, Th., et al. Multimedia, Computer Graphics and Broadcasting. MulGraB 2011. Communications in Computer and Information Science, vol 263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27186-1_15
DOI: https://doi.org/10.1007/978-3-642-27186-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27185-4
Online ISBN: 978-3-642-27186-1
eBook Packages: Computer Science, Computer Science (R0)