An Efficient Cell-Based Clustering Method for Handling Large, High-Dimensional Data

Chang, Jae-Woo

doi:10.1007/3-540-36175-8_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2637)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

In this paper, we propose an efficient cell-based clustering method for handling a large amount of high-dimensional data. Our clustering method provides an efficient cell creation algorithm that uses a space-partitioning technique, and a cell insertion algorithm that constructs clusters from cells whose density exceeds a given threshold. To achieve good retrieval performance on clusters, we also propose a new filtering-based index structure that uses an approximation technique. In addition, we compare the performance of our cell-based clustering method with the CLIQUE method in terms of cluster construction time, precision, and retrieval time. The experimental results show that our clustering method achieves better cluster construction time and retrieval time than CLIQUE.
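
The general idea of cell-based clustering with a density threshold can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details are not given in the abstract. It assumes data normalized to the unit hypercube, equal-width partitioning of every dimension, and density measured as a raw point count. The names `cell_of`, `dense_cells`, `intervals`, and `threshold` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple

Cell = Tuple[int, ...]


def cell_of(point: Sequence[float], intervals: int) -> Cell:
    """Map a point in [0, 1]^d to the index of its grid cell.

    Assumption: every dimension is split into `intervals` equal-width parts.
    """
    # min() clamps a coordinate of exactly 1.0 into the last interval.
    return tuple(min(int(x * intervals), intervals - 1) for x in point)


def dense_cells(
    points: Iterable[Sequence[float]], intervals: int, threshold: int
) -> Dict[Cell, int]:
    """Return the cells whose point count is strictly above `threshold`.

    Assumption: "density" is the number of points that fall in a cell.
    Only occupied cells are ever stored, so memory grows with the data,
    not with the total number of cells in the grid.
    """
    counts = Counter(cell_of(p, intervals) for p in points)
    return {cell: n for cell, n in counts.items() if n > threshold}


points = [
    (0.10, 0.12), (0.15, 0.18), (0.12, 0.11),  # three points in cell (0, 0)
    (0.90, 0.95), (0.92, 0.91),                # two points in cell (3, 3)
    (0.50, 0.10),                              # one isolated point in cell (2, 0)
]
clusters = dense_cells(points, intervals=4, threshold=1)
print(clusters)
```

With four intervals per dimension and a threshold of one point, the sketch keeps cells (0, 0) and (3, 3) and discards the isolated point in cell (2, 0). Storing only occupied cells matters in high dimensions, where the number of possible cells grows exponentially with the dimensionality.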

References

Jiawei Han, Micheline Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann, 2000.
Zhang T., Ramakrishnan R., Livny M., “BIRCH: An Efficient Data Clustering Method for Very Large Databases”, Proc. ACM Int. Conf. on Management of Data, ACM Press, 1996, pp. 103–114.
Ester M., Kriegel H.-P., Sander J., Xu X., “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”, Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1996.
Wang W., Yang J., Muntz R., “STING: A Statistical Information Grid Approach to Spatial Data Mining”, Proc. 23rd Int. Conf. on Very Large Data Bases, Morgan Kaufmann, 1997.
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan, “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications”, Proc. ACM SIGMOD Int. Conf. on Management of Data, 1998, pp. 94–105.
http://www.almaden.ibm.com/cs/quest.

Author information

Authors and Affiliations

Dept. of Computer Engineering, Research Center of Industrial Technology, Engineering Research Institute, Chonbuk National University, Chonju, Chonbuk, 561-756, South Korea
Jae-Woo Chang

Editor information

Editors and Affiliations

Computer Science Department, Korea Advanced Institute of Science and Technology, 373-1 Koo-Sung Dong, Yoo-Sung Ku, Daejeon, 305-701, Korea
Kyu-Young Whang
Department of Statistics, Seoul National University, Sillimdong Kwanakgu, Seoul, 151-742, Korea
Jongwoo Jeon
School of Electrical Engineering and Computer Science, Seoul National University, Kwanak P.O. Box 34, Seoul, 151-742, Korea
Kyuseok Shim
Department of Computer Science and Engineering, University of Minnesota, 200 Union St SE, Minneapolis, MN, 55455, USA
Jaideep Srivastava

About this paper

Cite this paper

Chang, JW. (2003). An Efficient Cell-Based Clustering Method for Handling Large, High-Dimensional Data. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_29

DOI: https://doi.org/10.1007/3-540-36175-8_29
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive
