Abstract
Most clustering methods for data mining applications do not work efficiently on large, high-dimensional data. This is caused by the so-called ‘curse of dimensionality’ and by the limits of available memory. In this paper, we propose an efficient clustering method for handling large amounts of high-dimensional data. Our clustering method provides efficient algorithms for both cell creation and cell insertion. To achieve good retrieval performance on clusters, we also propose a filtering-based index structure that uses an approximation technique. We compare the performance of our clustering method with that of the CLIQUE method. The experimental results show that our clustering method achieves better cluster construction time and retrieval time.
References
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco (2000)
Ng, R.T., Han, J.: Efficient and Effective Clustering Methods for Spatial Data Mining. In: Proc. of Int. Conf. on Very Large Data Bases, pp. 144–155 (1994)
Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Chichester (1990)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases. In: Proc. of ACM Int. Conf. on Management of Data, pp. 103–114 (1996)
Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A Density Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Proc. of Int. Conf. on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
Wang, W., Yang, J., Muntz, R.: STING: A Statistical Information Grid Approach to Spatial Data Mining. In: Proc. of Int. Conf. on Very Large Data Bases, pp. 186–195 (1997)
Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. In: Proc. of ACM Int. Conf. on Management of Data, pp. 94–105 (1998)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chang, JW., Kim, YK. (2004). An Efficient Clustering Method for High-Dimensional Data Mining. In: Bazzan, A.L.C., Labidi, S. (eds) Advances in Artificial Intelligence – SBIA 2004. SBIA 2004. Lecture Notes in Computer Science, vol 3171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28645-5_28
DOI: https://doi.org/10.1007/978-3-540-28645-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23237-7
Online ISBN: 978-3-540-28645-5
eBook Packages: Springer Book Archive