Combined constraint-based with metric-based in semi-supervised clustering ensemble

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Semi-supervised clustering and cluster ensembles have both received considerable attention in recent years because of their accurate and reliable performance. Existing semi-supervised clustering algorithms fall mainly into two kinds: constraint-based and metric-based. In this paper, we present a semi-supervised clustering ensemble approach that takes both pairwise constraints and a metric measure into account. First, with the help of supervised information, comprising pairwise constraints and labeled data, the approach generates different base clustering partitions using constraint-based and metric-based semi-supervised clustering respectively; the latter uses a newly developed metric function. Given the spatial particularity of image pixels, the metric considers the spatial distribution of surrounding pixels in addition to the inherent features of each pixel during image feature extraction. The target clustering is then obtained by integrating the base clustering partitions through an ensemble function. Finally, we verify the approach experimentally on general data sets and image data sets, and compare its clustering performance with that of other approaches. Both theoretical analysis and experimental results demonstrate that the proposed method considerably improves clustering accuracy and yields better clustering results than a number of representative clustering methods.
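
The abstract does not specify the ensemble function, so the following is only an illustrative sketch of the general idea of combining base partitions under pairwise constraints. It substitutes a generic co-association (evidence accumulation) consensus for the paper's own method. The function names `co_association`, `apply_constraints`, and `consensus_labels`, the 0.5 threshold, and the toy partitions are all assumptions made for this example.

```python
import numpy as np


def co_association(partitions):
    """Fraction of base partitions that put each pair of points together.

    `partitions` has shape (n_partitions, n_points). Label values only
    need to be consistent within one partition, not across partitions.
    """
    P = np.asarray(partitions)
    same = P[:, :, None] == P[:, None, :]  # (m, n, n) boolean array
    return same.mean(axis=0)


def apply_constraints(C, must_link=(), cannot_link=()):
    """Override pairwise agreement with supervised constraints (assumed rule)."""
    C = C.copy()
    for i, j in must_link:
        C[i, j] = C[j, i] = 1.0
    for i, j in cannot_link:
        C[i, j] = C[j, i] = 0.0
    return C


def consensus_labels(C, threshold=0.5):
    """Label connected components of the graph with edges where C > threshold."""
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(C[u] > threshold):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


if __name__ == "__main__":
    # Three hypothetical base partitions of six points, e.g. one from a
    # constraint-based clusterer and two from metric-based clusterers.
    parts = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same grouping, permuted labels
        [0, 0, 1, 1, 1, 1],  # disagrees about point 2
    ]
    C = apply_constraints(co_association(parts),
                          must_link=[(0, 1)], cannot_link=[(2, 3)])
    print(consensus_labels(C).tolist())
```

Because the consensus step uses connected components, a cannot-link pair can still end up in one cluster through a chain of other points; a stricter scheme would need to enforce such constraints explicitly.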

Acknowledgements

We would like to thank the anonymous reviewers for their insightful comments and suggestions, which significantly improved the quality of this paper. The research reported in this paper is supported by the National Natural Science Foundation of China (Nos. 61165009, 61663004, 61262005, 61363035, 61365009), the Guangxi Natural Science Foundation (2016GXNSFAA380146, 2014GXNSFAA118368), the Direct Fund of Guangxi Key Lab of Multi-source Information Mining and Security (16-A-03-02), the Guangxi “Bagui Scholar” Teams for Innovation and Research Project, and the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing.

Author information

Corresponding author

Correspondence to Zhixin Li.

About this article

Cite this article

Wei, S., Li, Z. & Zhang, C. Combined constraint-based with metric-based in semi-supervised clustering ensemble. Int. J. Mach. Learn. & Cyber. 9, 1085–1100 (2018). https://doi.org/10.1007/s13042-016-0628-6

  • DOI: https://doi.org/10.1007/s13042-016-0628-6
