Automatic Extraction of Clusters from Hierarchical Clustering Representations

Sander, Jörg; Qin, Xuejie; Lu, Zhiyong; Niu, Nan; Kovarsky, Alex

doi:10.1007/3-540-36175-8_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2637)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Hierarchical clustering algorithms are typically more effective in detecting the true clustering structure of a data set than partitioning algorithms. However, hierarchical clustering algorithms do not actually create clusters, but compute only a hierarchical representation of the data set. This makes them unsuitable as an automatic pre-processing step for other algorithms that operate on detected clusters. This is true for both dendrograms and reachability plots, which have been proposed as hierarchical clustering representations, and which have different advantages and disadvantages. In this paper we first investigate the relation between dendrograms and reachability plots and introduce methods to convert each into the other, showing that they essentially contain the same information. Based on reachability plots, we then introduce a technique that automatically determines the significant clusters in a hierarchical cluster representation. This makes it possible, for the first time, to use hierarchical clustering as an automatic pre-processing step that requires no user interaction to select clusters from a hierarchical cluster representation.
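
To make the link between the two representations concrete, the following minimal sketch builds a single-link style reachability plot with a Prim-like traversal and then cuts it at a fixed threshold. It is an illustration only, not the extraction technique of the paper, which determines significant clusters without a user-chosen threshold. The function names `reachability_plot` and `cut_at`, the Euclidean distance, the choice of point 0 as the start, and the threshold `eps` are all assumptions made for this example.

```python
import math


def reachability_plot(points):
    """Return (order, reach) for a single-link style reachability plot.

    Assumption: each point's value is its distance to the closest
    already-visited point (a Prim-like traversal starting at point 0).
    """
    n = len(points)
    if n == 0:
        return [], []
    best = [math.inf] * n          # smallest distance to the visited set so far
    visited = [False] * n
    order, reach = [], []
    current = 0
    for _ in range(n):
        visited[current] = True
        order.append(current)
        reach.append(best[current])  # inf for the very first point
        nxt = -1
        for j in range(n):
            if visited[j]:
                continue
            d = math.dist(points[current], points[j])
            if d < best[j]:
                best[j] = d
            if nxt == -1 or best[j] < best[nxt]:
                nxt = j
        current = nxt
    return order, reach


def cut_at(order, reach, eps):
    """Split the ordering wherever the reachability value exceeds eps."""
    clusters = []
    for idx, r in zip(order, reach):
        if r > eps or not clusters:
            clusters.append([])      # a value above eps starts a new cluster
        clusters[-1].append(idx)
    return clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
order, reach = reachability_plot(points)
print(cut_at(order, reach, eps=2.0))
```

Under these assumptions, each run of consecutive low values in the plot corresponds to one subtree of the single-link dendrogram cut at height `eps`, which is the sense in which the two representations carry the same information.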

Author information

Authors and Affiliations

Department of Computing Science, University of Alberta, Edmonton, AB, Canada, T6G 2E8
Jörg Sander, Xuejie Qin, Zhiyong Lu, Nan Niu & Alex Kovarsky

Editor information

Editors and Affiliations

Computer Science Department, Korea Advanced Institute of Science and Technology, 373-1 Koo-Sung Dong, Yoo-Sung Ku, Daejeon, 305-701, Korea
Kyu-Young Whang
Department of Statistics, Seoul National University, Sillimdong Kwanakgu, Seoul, 151-742, Korea
Jongwoo Jeon
School of Electrical Engineering and Computer Science, Seoul National University, Kwanak P.O. Box 34, Seoul, 151-742, Korea
Kyuseok Shim
Department of Computer Science and Engineering, University of Minnesota, 200 Union St SE, Minneapolis, MN, 55455, USA
Jaideep Srivastava

About this paper

Cite this paper

Sander, J., Qin, X., Lu, Z., Niu, N., Kovarsky, A. (2003). Automatic Extraction of Clusters from Hierarchical Clustering Representations. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_8

DOI: https://doi.org/10.1007/3-540-36175-8_8
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive
