A Clustering Validity Assessment Index

Kim, Youngok; Lee, Soowon

doi:10.1007/3-540-36175-8_60

Conference paper
First Online: 01 January 2003

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2637)

Abstract

Clustering is a method for grouping objects with similar patterns and finding meaningful clusters in a data set. A large number of clustering algorithms exist in the literature, and even for a single algorithm the results vary with its input parameters, such as the number of clusters, field weights, similarity measures, and the number of passes. Thus, it is important to evaluate clustering results effectively a priori, so that the generated clusters are closer to the real partition. In this paper, an improved clustering validity assessment index is proposed, based on a new density function for inter-cluster similarity and a new scatter function for intra-cluster similarity. Experimental results show the effectiveness of the proposed index on the data sets under consideration, regardless of the clustering algorithm chosen.
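
As a rough illustration of what a validity index does, the sketch below scores two candidate partitions of the same data by combining intra-cluster scatter with inter-cluster separation. It is not the index proposed in this paper, whose density and scatter functions are not given in the abstract. It uses the classic Davies-Bouldin index instead, and the function names and toy data are assumptions made for the example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a partition; lower values mean tighter,
    better separated clusters. Illustrative stand-in, not the paper's index."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = {lab: centroid(pts) for lab, pts in clusters.items()}
    # Intra-cluster scatter: mean distance of members to their centroid.
    scatter = {
        lab: sum(dist(p, centers[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: two tight, well separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [0, 0, 0, 1, 1, 1]  # labels that match the visible grouping
bad = [0, 1, 0, 1, 0, 1]   # labels that mix the two groups

good_score = davies_bouldin(points, good)
bad_score = davies_bouldin(points, bad)
print(good_score < bad_score)
```

The same comparison can be repeated over the outputs of different algorithms or parameter settings (for example, different numbers of clusters), keeping the partition with the best score.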

Author information

Authors and Affiliations

School of Computing, Soongsil University, 1-1 Sang-Do Dong, Dong-Jak Gu, Seoul, 156-743, Korea
Youngok Kim & Soowon Lee

Editor information

Editors and Affiliations

Computer Science Department, Korea Advanced Institute of Science and Technology, 373-1 Koo-Sung Dong, Yoo-Sung Ku, Daejeon, 305-701, Korea
Kyu-Young Whang
Department of Statistics, Seoul National University, Sillimdong Kwanakgu, Seoul, 151-742, Korea
Jongwoo Jeon
School of Electrical Engineering and Computer Science, Seoul National University, Kwanak P.O. Box 34, Seoul, 151-742, Korea
Kyuseok Shim
Department of Computer Science and Engineering, University of Minnesota, 200 Union St SE, Minneapolis, MN, 55455, USA
Jaideep Srivastava

About this paper

Cite this paper

Kim, Y., Lee, S. (2003). A Clustering Validity Assessment Index. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_60

DOI: https://doi.org/10.1007/3-540-36175-8_60
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive
