A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters

Applied Intelligence

Abstract

Clustering ensembles have emerged as a way to obtain more robust, novel, stable, and consistent clustering results. Clustering ensemble frameworks involve two kinds of approaches: (a) ensemble creation approaches, which focus on creating or preparing a suitable ensemble, and (b) ensemble aggregation approaches, which try to find a suitable final clustering (also called a consensus clustering) from a given ensemble. The former address the ensemble creation problem; the latter address the aggregation problem. This paper proposes an ensemble aggregator, or consensus function, called Robust Clustering Ensemble based on Sampling and Cluster Clustering (RCESCC). The RCESCC algorithm first generates an ensemble of fuzzy clusterings by running the fuzzy c-means algorithm on subsampled data. It then derives a cluster-cluster similarity matrix from the fuzzy clusters and partitions the fuzzy clusters by applying a hierarchical clustering algorithm to that matrix. In the next phase, the RCESCC algorithm assigns the data points to the merged clusters. Experimental comparisons with state-of-the-art clustering algorithms indicate the effectiveness of the RCESCC algorithm in terms of performance, speed, and robustness.
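The four phases named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the linkage, or the assignment rule, so this sketch assumes cosine similarity between membership vectors, average linkage, and assignment by highest mean membership. It also extends each base clustering from its subsample to all points through the fitted centers, and it runs a single merging pass rather than any iterative fusion. All function names and parameters (`n_base`, `frac`, `m`) are hypothetical.

```python
import numpy as np

def memberships(X, centers, m=2.0):
    """Fuzzy c-means memberships of each row of X to each center, shape (n, c)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, c, m=2.0, n_iter=50, rng=None):
    """Plain fuzzy c-means; returns the (c, d) array of cluster centers."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        W = memberships(X, centers, m) ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers

def base_fuzzy_clusters(X, c, n_base=10, frac=0.7, seed=0):
    """Phase 1: fuzzy c-means on random subsamples.
    Assumption: memberships are extended to all points via the fitted centers."""
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(n_base):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        centers = fuzzy_c_means(X[idx], c, rng=rng)
        blocks.append(memberships(X, centers))
    return np.hstack(blocks)  # shape (n, n_base * c), one column per fuzzy cluster

def cluster_similarity(M):
    """Phase 2 (assumed measure): cosine similarity between cluster columns."""
    N = M / np.linalg.norm(M, axis=0, keepdims=True)
    return N.T @ N

def merge_clusters(S, k):
    """Phase 3 (assumed linkage): average-linkage agglomeration down to k groups."""
    groups = [[i] for i in range(len(S))]
    while len(groups) > k:
        best, pair = -np.inf, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                s = S[np.ix_(groups[a], groups[b])].mean()
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        groups[a] += groups.pop(b)  # b > a, so index a stays valid
    return groups

def consensus_labels(M, groups):
    """Phase 4 (assumed rule): assign each point to the merged cluster
    whose member clusters give it the highest mean membership."""
    scores = np.column_stack([M[:, g].mean(axis=1) for g in groups])
    return scores.argmax(axis=1)

# Toy data: three Gaussian blobs of 40 points each.
rng = np.random.default_rng(42)
means = np.array([[0, 0], [6, 6], [0, 6]])
X = np.vstack([rng.normal(mu, 0.5, size=(40, 2)) for mu in means])

M = base_fuzzy_clusters(X, c=3)
groups = merge_clusters(cluster_similarity(M), k=3)
labels = consensus_labels(M, groups)
```

Clustering the clusters rather than the points keeps the similarity matrix at size (number of base clusters) squared, independent of the number of data points, which is one plausible reason for the speed claim in the abstract.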

Acknowledgments

This paper is extracted from a PhD thesis written by Musa Mojarad.

Author information

Corresponding author

Correspondence to Samad Nejatian.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Mojarad, M., Nejatian, S., Parvin, H. et al. A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters. Appl Intell 49, 2567–2581 (2019). https://doi.org/10.1007/s10489-018-01397-x

  • DOI: https://doi.org/10.1007/s10489-018-01397-x
