An Efficient Clustering Method for High-Dimensional Data Mining

Chang, Jae-Woo; Kim, Yong-Ki

doi:10.1007/978-3-540-28645-5_28


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3171)

Included in the following conference series:

Brazilian Symposium on Artificial Intelligence


Abstract

Most clustering methods for data mining applications do not work efficiently when dealing with large, high-dimensional data. This is caused by the so-called 'curse of dimensionality' and by the limited memory available. In this paper, we propose an efficient clustering method for handling large amounts of high-dimensional data. Our clustering method provides both an efficient cell creation algorithm and a cell insertion algorithm. To achieve good retrieval performance on clusters, we also propose a filtering-based index structure that uses an approximation technique. We compare the performance of our clustering method with that of the CLIQUE method. The experimental results show that our clustering method achieves better cluster construction time and retrieval time.
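The abstract names cell creation and cell insertion but gives no algorithmic detail, so the following is only a minimal sketch of a generic grid-based approach, not the authors' algorithm. It assumes data normalized to the unit hypercube, a fixed number of equal-width intervals per dimension, and a simple density threshold; the names `cell_of`, `insert_points`, `dense_cells`, `intervals`, and `min_points` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Cell = Tuple[int, ...]  # one interval index per dimension


def cell_of(point: Sequence[float], intervals: int) -> Cell:
    """Map a point in [0, 1]^d to the index of its grid cell.

    Assumption: every dimension is split into `intervals` equal-width parts.
    """
    # Clamp so that a coordinate of exactly 1.0 falls in the last interval.
    return tuple(min(int(x * intervals), intervals - 1) for x in point)


def insert_points(
    points: Sequence[Sequence[float]], intervals: int
) -> Dict[Cell, List[int]]:
    """Insert each point into its cell, creating cells only when needed.

    Only non-empty cells are stored, which keeps memory proportional to the
    number of points rather than to intervals ** d.
    """
    cells: Dict[Cell, List[int]] = defaultdict(list)
    for index, point in enumerate(points):
        cells[cell_of(point, intervals)].append(index)
    return dict(cells)


def dense_cells(
    cells: Dict[Cell, List[int]], min_points: int
) -> Dict[Cell, List[int]]:
    """Keep only the cells that hold at least `min_points` points."""
    return {cell: ids for cell, ids in cells.items() if len(ids) >= min_points}


if __name__ == "__main__":
    data = [(0.10, 0.10), (0.15, 0.12), (0.90, 0.90), (1.00, 0.00)]
    grid = insert_points(data, intervals=4)
    print(dense_cells(grid, min_points=2))
```

Storing only occupied cells in a dictionary is one common way to avoid materializing the full grid, whose size grows exponentially with the number of dimensions. Whether the paper uses this strategy cannot be determined from the abstract alone.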



Author information

Authors and Affiliations

Dept. of Computer Engineering, Research Center for Advanced LBS Technology, Chonbuk National University, Chonju, Chonbuk, 561-756, South Korea
Jae-Woo Chang & Yong-Ki Kim


Editor information

Editors and Affiliations

Instituto de Informática, UFRGS, Porto Alegre, RS, Brasil
Ana L. C. Bazzan
Intelligent Systems Laboratory LSI, Center of Technology, Federal University of Maranhão UFMA, Bacanga Campus, 65080-040, São Luís, MA, Brazil
Sofiane Labidi


About this paper

Cite this paper

Chang, JW., Kim, YK. (2004). An Efficient Clustering Method for High-Dimensional Data Mining. In: Bazzan, A.L.C., Labidi, S. (eds) Advances in Artificial Intelligence – SBIA 2004. SBIA 2004. Lecture Notes in Computer Science, vol 3171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28645-5_28


DOI: https://doi.org/10.1007/978-3-540-28645-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23237-7
Online ISBN: 978-3-540-28645-5
eBook Packages: Springer Book Archive
