Abstract
In this paper we present an efficient k-means clustering algorithm for two-dimensional data. The proposed algorithm reorganizes the dataset into a nested binary tree. At each node, a data item is compared with only the two nearest means along each dimension and is assigned to the cluster whose mean is closer. The approach has three steps. First, we build the nested binary tree. Next, we scan the data in raster order by an in-order traversal of the tree. Finally, we compare the data item at each node with only its two nearest means and assign it to the intended cluster. This lowers the computational cost significantly, both by reducing the number of comparisons with means and by keeping use of the Euclidean distance formula to a minimum. Our results show that the method performs clustering much faster than the classical methods.
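The assignment step described above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the nested binary tree is not reproduced here, and the helper names `build_index`, `bracketing` and `assign` are hypothetical. The sketch assumes that "the two nearest means with respect to each dimension" refers to the means whose coordinates bracket the data item along that axis, so the Euclidean distance is evaluated for at most four candidates instead of all k means.

```python
from bisect import bisect_left

def build_index(means):
    """Sort mean indices along each axis once (assumed preprocessing step)."""
    index = []
    for dim in (0, 1):
        order = sorted(range(len(means)), key=lambda k: means[k][dim])
        vals = [means[k][dim] for k in order]
        index.append((order, vals))
    return index

def bracketing(vals, x):
    """Positions of the (at most two) sorted values on either side of x."""
    i = bisect_left(vals, x)
    return [j for j in (i - 1, i) if 0 <= j < len(vals)]

def assign(point, means, index):
    """Pick a cluster by checking only the means that bracket the point
    on each axis. This is a heuristic: it is not guaranteed to return the
    true nearest mean for every configuration of means."""
    candidates = set()
    for dim, (order, vals) in enumerate(index):
        for j in bracketing(vals, point[dim]):
            candidates.add(order[j])
    # Squared Euclidean distance is computed only for the few candidates.
    return min(
        candidates,
        key=lambda k: (point[0] - means[k][0]) ** 2 + (point[1] - means[k][1]) ** 2,
    )

means = [(1, 1), (5, 6), (9, 2)]
index = build_index(means)
labels = [assign(p, means, index) for p in [(2, 2), (8, 3), (0, 0)]]
```

Locating the bracketing means with a binary search costs O(log k) per axis, which is where the saving over comparing each item against all k means would come from under these assumptions.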
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ali, M., Li, X., Dong, Z.Y. (2005). Efficient Spatial Clustering Algorithm Using Binary Tree. In: Gallagher, M., Hogan, J.P., Maire, F. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2005. IDEAL 2005. Lecture Notes in Computer Science, vol 3578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508069_39
DOI: https://doi.org/10.1007/11508069_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26972-4
Online ISBN: 978-3-540-31693-0
eBook Packages: Computer Science (R0)