Abstract
Clustering is the division of data into groups of similar objects with respect to a set of relevant attributes (features) of the analyzed objects. Classical partitioning clustering methods, such as the k-means algorithm, start with a known set of objects, and all features are considered simultaneously when calculating the similarity between objects. However, there are numerous applications in which an object set that has already been clustered with respect to an initial set of attributes is altered by the addition of new features, so re-clustering is required. In this paper we propose an incremental, k-means based clustering method, Core Based Incremental Clustering (CBIC), that is capable of re-partitioning the object set when the attribute set grows. The method starts from the partitioning into clusters that was established by applying k-means or CBIC before the attribute set changed. The result is reached more efficiently than by running k-means again from scratch on the feature-extended object set. Experiments demonstrating the method's efficiency are also reported.
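The idea of starting from the existing partition rather than from scratch can be illustrated with a minimal sketch. This is not the CBIC algorithm itself, whose core-based procedure is not described in the abstract. It is only a warm-started k-means, under the assumption that the previous cluster labels are available and that each object has been extended with its new feature values. The function names (`sq_dist`, `centroids_from_labels`, `warm_start_kmeans`) and the toy data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids_from_labels(
    points: Sequence[Point], labels: Sequence[int], k: int
) -> List[Optional[List[float]]]:
    """Mean of each cluster in the current feature space (None if a cluster is empty)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, v in enumerate(p):
            sums[c][j] += v
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]


def warm_start_kmeans(
    points: Sequence[Point], old_labels: Sequence[int], k: int, max_iter: int = 100
) -> Tuple[List[int], int]:
    """Run k-means iterations on the feature-extended points, seeded with the
    partition obtained before the attribute set changed (an assumption of this
    sketch, not the published CBIC procedure). Returns (labels, iterations)."""
    labels = list(old_labels)
    for iteration in range(1, max_iter + 1):
        centroids = centroids_from_labels(points, labels, k)
        live = [c for c in range(k) if centroids[c] is not None]
        new_labels = [
            min(live, key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # partition is stable: stop
            return labels, iteration
        labels = new_labels
    return labels, max_iter


# Toy data: objects were clustered on one feature, then a second feature was added.
old_labels = [0, 0, 0, 1]  # partition found on the first feature only
extended = [(0.0, 0.0), (0.1, 10.0), (0.2, 10.0), (5.0, 0.0)]
labels, iterations = warm_start_kmeans(extended, old_labels, k=2)
print(labels, iterations)  # prints: [1, 0, 0, 1] 2
```

In this toy case the added feature changes the partition after two iterations. When the new feature agrees with the old partition, the loop stops after a single pass, which is the kind of saving an incremental method aims for.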
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Şerban, G., Câmpan, A. (2005). Incremental Clustering Using a Core-Based Approach. In: Yolum, P., Güngör, T., Gürgen, F., Özturan, C. (eds) Computer and Information Sciences - ISCIS 2005. ISCIS 2005. Lecture Notes in Computer Science, vol 3733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11569596_87
DOI: https://doi.org/10.1007/11569596_87
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29414-6
Online ISBN: 978-3-540-32085-2
eBook Packages: Computer Science, Computer Science (R0)