Abstract
We investigate the following problem: given a set of candidate clusterings of a common set of objects, find a centroid clustering that is most compatible with the input set. First, we propose a series of entropy-based distance functions for comparing clusterings; such functions let us select the local centroid directly from the candidate set. Second, we present two combining methods for constructing the global centroid. The selected or combined centroid clustering is likely to be a good choice, i.e., ranked in the top or middle of the candidates in terms of closeness to the true clustering. Finally, we evaluate the effectiveness of both approaches on artificial and real data sets.
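To make the setting concrete, the sketch below uses variation of information, one well-known entropy-based distance between clusterings, and picks the local centroid as the candidate minimizing the total distance to all other candidates. This is an illustration under assumed definitions, not the paper's specific distance functions or combining methods; the function names `variation_of_information` and `local_centroid` are hypothetical.

```python
import math
from collections import Counter

def variation_of_information(labels_a, labels_b):
    """An entropy-based distance between two clusterings of the same objects.

    VI(A, B) = H(A) + H(B) - 2 * I(A; B): zero iff the two clusterings
    induce the same partition (up to label renaming), larger as they diverge.
    """
    n = len(labels_a)
    assert n == len(labels_b) and n > 0
    pa = Counter(labels_a)                   # cluster sizes in clustering A
    pb = Counter(labels_b)                   # cluster sizes in clustering B
    pab = Counter(zip(labels_a, labels_b))   # joint cluster-pair sizes
    h_a = -sum((c / n) * math.log(c / n) for c in pa.values())
    h_b = -sum((c / n) * math.log(c / n) for c in pb.values())
    mi = sum((c / n) * math.log((c / n) / ((pa[a] / n) * (pb[b] / n)))
             for (a, b), c in pab.items())
    return h_a + h_b - 2.0 * mi

def local_centroid(candidates):
    """Pick the candidate clustering with the smallest total distance
    to all candidates in the input set (the 'local' centroid)."""
    return min(candidates,
               key=lambda c: sum(variation_of_information(c, o)
                                 for o in candidates))
```

Note that the distance ignores label names: `[0,0,1,1]` and `[1,1,0,0]` are at distance zero, which is the behavior one wants when comparing clusterings rather than classifications.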
Hu, T., Sung, S.Y. Finding centroid clusterings with entropy-based criteria. Knowl Inf Syst 10, 505–514 (2006). https://doi.org/10.1007/s10115-006-0017-7