A fragment-based iterative consensus clustering algorithm with a robust similarity

Chung, Chih-Heng; Dai, Bi-Ru

doi:10.1007/s10115-013-0667-1

Regular Paper
Published: 11 June 2013

Volume 41, pages 591–609, (2014)

Knowledge and Information Systems

Chih-Heng Chung¹ &
Bi-Ru Dai¹


Abstract

The consensus clustering technique combines multiple clustering results without accessing the original data. Consensus clustering can be used to improve the robustness of clustering results or to obtain clustering results from multiple data sources. In this paper, we propose a novel definition of the similarity between points and clusters. With an iterative process, this definition of similarity can clearly represent how a point should join or leave a cluster, determine the number of clusters automatically, and combine partially overlapping clustering results. We also incorporate the concept of “clustering fragment” into our method for increased speed. The experimental results show that our algorithm performs well on both artificial data and real data.
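The two ideas the abstract relies on, working only from the base clusterings and grouping points into fragments, can be illustrated with a short sketch. This is not the authors' algorithm or their similarity definition, neither of which appears on this page. It assumes the common reading of a fragment as a set of points that receive identical labels in every base clustering, and it uses a plain pairwise co-association score as a stand-in similarity. The function names and the toy label lists are hypothetical.

```python
from collections import defaultdict


def fragments(partitions):
    """Group point indices that get identical labels in every partition.

    Assumption: a "fragment" is a maximal set of points that no base
    clustering ever separates. Only labels are used, never the raw data.
    """
    n_points = len(partitions[0])
    groups = defaultdict(list)
    for i in range(n_points):
        # The tuple of labels across all partitions identifies the fragment.
        signature = tuple(p[i] for p in partitions)
        groups[signature].append(i)
    return list(groups.values())


def co_association(partitions, i, j):
    """Fraction of partitions that place points i and j in the same cluster.

    Stand-in similarity for illustration only, not the paper's definition.
    """
    agree = sum(1 for p in partitions if p[i] == p[j])
    return agree / len(partitions)


# Three hypothetical base clusterings of six points (label lists).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]

frags = fragments(partitions)
print(frags)
print(co_association(partitions, 0, 2))
```

Because points inside one fragment are indistinguishable to every base clustering, a consensus method can treat each fragment as a single unit, which is where a speed-up of the kind the abstract mentions would come from. The sketch assumes every partition labels every point, so it does not cover partially overlapping clusterings.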


Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Chih-Heng Chung & Bi-Ru Dai

Corresponding author

Correspondence to Bi-Ru Dai.

About this article

Cite this article

Chung, CH., Dai, BR. A fragment-based iterative consensus clustering algorithm with a robust similarity. Knowl Inf Syst 41, 591–609 (2014). https://doi.org/10.1007/s10115-013-0667-1

Received: 16 October 2011
Revised: 01 May 2013
Accepted: 28 May 2013
Published: 11 June 2013
Issue Date: December 2014
DOI: https://doi.org/10.1007/s10115-013-0667-1
