Skip to main content
Log in

A multi-level consensus function clustering ensemble

  • Foundations
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

In order to improve the performance of a clustering on a data set, a number of primary partitions are generated and stored in an ensemble and their aggregated consensus partition is used as their clustering. It is widely accepted that the consensus partition outperforms the primary partitions. In this paper, an ensemble clustering method called multi-level consensus clustering (MLCC) is proposed. To construct the MLCC, a cluster–cluster similarity matrix which is achieved by an innovative similarity metric is first generated. The mentioned cluster–cluster similarity matrix is based on a multi-level similarity metric. In fact, it can be computed in a new defined multi-level space. Then, a point–point similarity matrix which is boosted using the mentioned cluster–cluster similarity matrix is generated. The new consensus function applies an average linkage hierarchical clusterer algorithm on the mentioned point–point similarity matrix to make consensus partition. MLCC is better than traditional clustering ensembles and simple versions of clustering ensembles on traditional cluster–cluster similarity matrix. Its computational cost is not very bad too. Accuracy and robustness of the proposed method are compared with those of the modern clustering algorithms through the experimental tests. Also, time analysis is presented in the experimental results.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7.
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14

Similar content being viewed by others

References

  • Abbasi S, Nejatian S et al (2019) Clustering ensemble selection considering quality and diversity. Artif Intell Rev 52:1311–1340

    Article  Google Scholar 

  • Akbari E, Mohamed Dahlan H, Ibrahim R, Alizadeh H (2015) Hierarchical cluster ensemble selection. Eng Appl AI 39:146–156

    Article  Google Scholar 

  • AleAhmad A, Amiri H, Darrudi E, Rahgozar M, Oroumchian F (2009) Hamshahri: a standard Persian text collection. J Knowl-Based Syst 22(5):382–387

    Article  Google Scholar 

  • Alishvandi H, Gouraki GH, Parvin H (2016) An enhanced dynamic detection of possible invariants based on best permutation of test cases. Comput Syst Sci Eng 31(1):53–61

    Google Scholar 

  • Alizadeh H, Minaei-Bidgoli B, Parvin H (2011a) A new criterion for clusters validation. In: Artificial intelligence applications and innovations (AIAI 2011), IFIP, Springer, Heidelberg, Part I, pp 240–246

  • Alizadeh H, Minaei-Bidgoli B, Parvin H, Moshki M (2011b) An asymmetric criterion for cluster validation, developing concepts in applied intelligence. Stud Comput Intell 363:1–14

    Article  Google Scholar 

  • Alizadeh H, Minaei-Bidgoli B, Parvin H (2013) Optimizing fuzzy cluster ensemble in string representation. Int J Pattern Recognit Artif Intell 27(2):1350005

    Article  MathSciNet  Google Scholar 

  • Alizadeh H, Minaei-Bidgoli B, Parvin H (2014a) To improve the quality of cluster ensembles by selecting a subset of base clusters. J Exp Theor Artif Intell 26(1):127–150

    Article  Google Scholar 

  • Alizadeh H, Minaei-Bidgoli B, Parvin H (2014b) Cluster ensemble selection based on a new cluster stability measure. Intell Data Anal 18(3):389–408

    Article  Google Scholar 

  • Alizadeh H, Yousefnezhad M, Minaei-Bidgoli B (2015) Wisdom of crowds cluster ensemble. Intell Data Anal 19(3):485–503

    Article  Google Scholar 

  • Alqurashi T, Wang W (2014) Object-neighborhood clustering ensemble method. In Intelligent data engineering and automated learning (IDEAL), Springer, pp 142–149

  • Alqurashi T, Wang W (2015) A new consensus function based on dual-similarity measurements for clustering ensemble. In: International conference on data science and advanced analytics (DSAA), IEEE/ACM, pp 149–155

  • Ayad HG, Kamel MS (2008) Cumulative Voting Consensus Method for Partitions with a Variable Number of Clusters. IEEE Trans Pattern Anal Mach Intell 30(1):160–173

    Article  Google Scholar 

  • Bagherinia A, Minaei-Bidgoli B, Hossinzadeh M, Parvin H (2019) Elite fuzzy clustering ensemble based on clustering diversity and quality measures. Appl Intell 49(5):1724–1747

    Article  Google Scholar 

  • Bai L, Cheng X, Liang J, Guo Y (2017) Fast graph clustering with a new description model for community detection. Inf Sci 388–389:37–47

    Article  Google Scholar 

  • Breiman L (1996) Bagging predictors. Mach Learn 24:123–140

    Article  MathSciNet  MATH  Google Scholar 

  • Dimitriadou E, Weingessel A, Hornik K (2002) A combination scheme for fuzzy clustering. Int J Pattern Recognit Artif Intell 16(07):901–912

    Article  MATH  Google Scholar 

  • Domeniconi C, Al-Razgan M (2009) Weighted cluster ensembles: methods and analysis. ACM Trans Knowl Disc Data (TKDD) 2(4):1–42

    Article  Google Scholar 

  • Dueck D (2009) Affinity propagation: clustering data by passing messages. Ph.D. dissertation, University of Toronto

  • Faceli K, Marcilio CP, Souto D (2006) Multi-objective clustering ensemble. In: Proceedings of the sixth international conference on hybrid intelligent systems

  • X. Z. Fern and C. E. Brodley, “Random projection for high dimensional data clustering: A cluster ensemble approach”, In: Proceedings of the 20th International Conference on Machine Learning, (2003), pp. 186–193.

  • Fern XZ, Brodley CE (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the 21st international conference on machine learning, ACM, p 36

  • Franek L, Jiang X (2014) Ensemble clustering by means of clustering embedding in vector spaces. Pattern Recogn 47(2):833–842

    Article  MATH  Google Scholar 

  • Fred A, Jain AK (2002) Data clustering using evidence accumulation. In: Intl. conf. on pattern recognition, ICPR02, Quebec City, pp 276–280

  • Fred A, Jain AK (2005) Combining multiple clustering’s using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27(6):835–850

    Article  Google Scholar 

  • Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and an application to boosting. Comput Learn Theory 55:119–139

    MathSciNet  MATH  Google Scholar 

  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232

    Article  MathSciNet  MATH  Google Scholar 

  • Ghaemi R, ben Sulaiman N, Ibrahim H, Mustapha N (2011) A review: accuracy optimization in clustering ensembles using genetic algorithms. Artif Intell Rev 35(4):287–318

    Article  Google Scholar 

  • Ghosh J, Acharya A (2011) Cluster ensembles. Data Min Knowl Disc 1(4):305–315

    Article  Google Scholar 

  • Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):4

    Article  Google Scholar 

  • Hanczar B, Nadif M (2012) Ensemble methods for biclustering tasks. Pattern Recognit 45(11):3938–3949

    Article  Google Scholar 

  • Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, vol. 1, pp 278–282. https://doi.org/10.1109/ICDAR.1995.598994

  • Hong Y, Kwong S, Chang Y, Ren Q (2008) Unsupervised feature selection using clustering ensembles and population based incremental learning algorithm. Pattern Recogn 41(9):2742–2756

    Article  MATH  Google Scholar 

  • Hosseinpoor MJ, Parvin H, Nejatian S, Rezaie V (2019) Gene regulatory elements extraction in breast cancer by Hi-C data using a meta-heuristic method. Russ J Genet 55(9):1152–1164

    Article  Google Scholar 

  • Huang D, Lai JH, Wang CD (2015) Combining multiple clusterings via crowd agreement estimation and multi-granularity link analysis. Neurocomputing 170:240–250

    Article  Google Scholar 

  • Huang D, Lai J, Wang CD (2016) Ensemble clustering using factor graph. Pattern Recogn 50:131–142

    Article  MATH  Google Scholar 

  • Huang D, Lai J, Wang CD (2016b) Robust ensemble clustering using probability trajectories. The IEEE Trans Knowl Data Eng

  • Huang D, Wang CD, Lai JH (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 99:1–14. https://doi.org/10.1109/TCYB.2017.2702343

    Article  Google Scholar 

  • N. Iam-On, T. Boongoen and S. M. Garrett, “Refining Pairwise Similarity Matrix for Cluster Ensemble Problem with Cluster Relations”, Discovery Science, (2008), 222–233.

  • Iam-On N, Boongoen T, Garrett S (2010) LCE: a link-based cluster ensemble method for improved gene expression data analysis. Bioinformatics 26(12):1513–1519

    Article  Google Scholar 

  • Iam-On N, Boongoen T, Garrett S, Price C (2011) A link based approach to the cluster ensemble problem. IEEE Trans Pattern Anal Mach Intell 33(12):2396–2409

    Article  Google Scholar 

  • Iam-On N, Boongeon T, Garrett S, Price C (2012) A link based cluster ensemble approach for categorical data clustering. IEEE Trans Knowl Data Eng 24(3):413–425

    Article  Google Scholar 

  • Jamalinia H, Khalouei S, Rezaie V, Nejatian S, Bagheri-Fard K, Parvin H (2018) Diverse classifier ensemble creation based on heuristic dataset modification. J Appl Stat 45(7):1209–1226

    Article  MathSciNet  Google Scholar 

  • Jenghara MM, Ebrahimpour-Komleh H, Parvin H (2018a) Dynamic protein–protein interaction networks construction using firefly algorithm. Pattern Anal Appl 21(4):1067–1081

    Article  MathSciNet  Google Scholar 

  • Jenghara MM, Ebrahimpour-Komleh H, Rezaie V, Nejatian S, Parvin H, Syed-Yusof SK (2018b) Imputing missing value through ensemble concept based on statistical measures. Knowl Inf Syst 56(1):123–139

    Article  Google Scholar 

  • Jiang Y, Chung FL, Wang S, Deng Z, Wang J, Qian P (2015) Collaborative fuzzy clustering from multiple weighted views. IEEE Trans Cybern 45(4):688–701

    Article  Google Scholar 

  • Mimaroglu S, Aksehirli E (2012) DICLENS: Divisive clustering ensemble with automatic cluster number. IEEE/ACM Trans Comput Biol Bioinf 9(2):408–420

    Article  Google Scholar 

  • Minaei-Bidgoli B, Topchy A, Punch WF (2004) Ensembles of partitions via data resampling. In: Intl. conf. on information technology, ITCC 04, Las Vegas, pp 188–192

  • Minaei-Bidgoli B, Parvin H, Alinejad-Rokny H, Alizadeh H, Punch WE (2014) Effects of resampling method and adaptation on clustering ensemble efficacy. Artif Intell Rev 41(1):27–48

    Article  Google Scholar 

  • Mirzaei A, Rahmati M (2010) A Novel hierarchical-clustering-combination scheme based on fuzzy-similarity relations. IEEE Trans Fuzzy Syst 18(1):27–39

    Article  Google Scholar 

  • Mojarad M, Parvin H, Nejatian S, Rezaie V (2019a) Consensus function based on clusters clustering and iterative fusion of base clusters. Int J Uncertain Fuzziness Knowl-Based Syst 27(1):97–120

    Article  Google Scholar 

  • Mojarad M, Nejatian S, Parvin H, Mohammadpoor M (2019b) A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters. Appl Intell 49(7):2567–2581

    Article  Google Scholar 

  • Moradi M, Nejatian S, Parvin H, Rezaie V (2018) CMCABC: Clustering and memory-based chaotic artificial bee colony dynamic optimization algorithm. Int J Inf Technol Decis Mak 17(04):1007–1046

    Article  Google Scholar 

  • Naldi MC, De Carvalho ACM, Campello RJ (2013) Cluster ensemble selection based on relative validity indexes. Data Min Knowl Disc 27(2):259–289

    Article  MathSciNet  MATH  Google Scholar 

  • Nazari A, Dehghan A, Nejatian S, Rezaie V, Parvin H (2019) A comprehensive study of clustering ensemble weighting based on cluster quality and diversity. Pattern Anal Appl 22(1):133–145

    Article  MathSciNet  Google Scholar 

  • Nejatian S, Parvin H, Faraji E (2018) Using sub-sampling and ensemble clustering techniques to improve performance of imbalanced classification. Neurocomputing 276:55–66

    Article  Google Scholar 

  • Nejatian S, Rezaie V, Parvin H, Pirbonyeh M, Bagherifard K, Yusof SKS (2019) An innovative linear unsupervised space adjustment by keeping low-level spatial data structure. Knowl Inf Syst 59(2):437–464

    Article  Google Scholar 

  • Newman CBDJ, Hettich SS, Merz C (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/˜mlearn/MLSummary.html.

  • Omidvar MN, Nejatian S, Parvin H, Rezaie V (2018) A new natural-inspired continuous optimization approach, Journal of Intelligent & Fuzzy Systems, 1–17,

  • Partabian J, Rafe V, Parvin H, Nejatian S (2020) An approach based on knowledge exploration for state space management in checking reachability of complex software systems. Soft Comput 24(10):7181–7196

    Article  Google Scholar 

  • H. Parvin, B. Minaei-Bidgoli, A clustering ensemble framework based on elite selection of weighted clusters, Advances in Data Analysis and Classification (2013) 1–28.

  • Parvin H, Minaei-Bidgoli B (2015) A clustering ensemble framework based on selection of fuzzy weighted clusters in a locally adaptive clustering algorithm. Pattern Anal Appl 18(1):87–112

    Article  MathSciNet  MATH  Google Scholar 

  • Parvin H, Beigi A, Mozayani N (2012) A clustering ensemble learning method based on the ant colony clustering algorithm. Int J Appl Comput Math 11(2):286–302

    MathSciNet  Google Scholar 

  • Parvin H, Minaei-Bidgoli B, Alinejad-Rokny H, Punch WF (2013) Data weighing mechanisms for clustering ensembles. Comput Electr Eng 39(5):1433–1450

    Article  Google Scholar 

  • Parvin H, Nejatian S, Mohamadpour M (2018) Explicit memory based ABC with a clustering strategy for updating and retrieval of memory in dynamic environments. Appl Intell 48(11):4317–4337

    Article  Google Scholar 

  • Pirbonyeh A, Rezaie V, Parvin H, Nejatian S, Mehrabi M (2019) A linear unsupervised transfer learning by preservation of cluster-and-neighborhood data organization. Pattern Anal Appl 22(3):1149–1160

    Article  MathSciNet  Google Scholar 

  • Rafiee G, Dlay SS, Woo WL (2013) Region-of-interest extraction in low depth of field images using ensemble clustering and difference of Gaussian approaches. Pattern Recognit 46(10):2685–2699

    Article  Google Scholar 

  • Rashidi F, Nejatian S, Parvin H, Rezaie V (2019) Diversity based cluster weighting in cluster ensemble: an information theory approach. Artif Intell Rev 52(2):1341–1368

    Article  Google Scholar 

  • Ren Y, Zhang G, Domeniconi C, Yu G (2013) Weighted object ensemble clustering. In Proceedings of the IEEE 13th international conference on data mining (ICDM), IEEE, pp 627–636

  • Roth V, Lange T, Braun M, Buhmann J (2002) A resampling approach to cluster validation. Intl. conf. on computational statistics, COMPSTAT

  • Shabaniyan T, Parsaei H, Aminsharifi A, Movahedi MM, Jahromi AT, Pouyesh S, Parvin H (2019) An artificial intelligence-based clinical decision support system for large kidney stone treatment. Australas Phys Eng Sci Med 42(3):771–779

    Article  Google Scholar 

  • Shahriari A, Parvin H, Monajati A (2015) Exploring weights of hierarchical and equivalency relationship in general Persian texts. EANN Workshops 7(1):7

    Google Scholar 

  • Soto V, Garcia-Moratilla S, Martinez-Munoz G, Hernandez- Lobato D, Suarez A (2014) A double pruning scheme for boosting ensembles. IEEE Trans Cybern 44(12):2682–2695

    Article  Google Scholar 

  • Strehl A, Ghosh J (2000) Value-based customer grouping from large retail data sets. In AeroSense, International Society for Optics and Photonics, pp 33–42

  • Strehl A, Ghosh J (2003) Cluster ensembles—a knowledge reuse framework for multiple partitions. J Mach Learn Res 3:583–617

    MathSciNet  MATH  Google Scholar 

  • Szetoa PM, Parvin H, Mahmoudi MR, Tuan BA, Pho KH (2020) Deep neural network as deep feature learner. J Intell Fuzzy Syst. https://doi.org/10.3233/JIFS-191292

    Article  Google Scholar 

  • Topchy AP, Jain AK, Punch WF (2003) Combining multiple weak clusterings. In: IEEE international conference on data mining, pp 331–338

  • Topchy A, Jain AK, Punch W (2005) A mixture model of clustering ensembles. Proc SIAM Int Conf Data Min, Citeseer 27(12):1866–1881

    Google Scholar 

  • N. X. Vinh and M. E. Houle, “A set correlation model for partitional clustering”, In: Advances in Knowledge Discovery and Data Mining, Springer, (2010) pp. 4–15.

  • Yang Y, Jiang J (2016) Hybrid sampling-based clustering ensemble with global and local constitutions. IEEE Trans Neural Netw Learn Syst 27(5):952–965

    Article  MathSciNet  Google Scholar 

  • Yasrebi M, Eskandar-Baghban A, Parvin H, Mohammadpour M (2018) Optimisation inspiring from behaviour of raining in nature: droplet optimisation algorithm. Int J Bio-Inspired Comput 12(3):152–163

    Article  Google Scholar 

  • Yi J, Yang T, Jin R, Jain AK, Mahdavi M (2012) Robust ensemble clustering by matrix completion. In: Proceedings of the IEEE 12th international conference on data mining (ICDM), IEEE, pp 1176–1181

  • Yousefnezhad M, Huang SJ, Zhang D (2018) WoCE: a framework for clustering ensemble by exploiting the wisdom of crowds theory. IEEE Trans Cybernetics 48(2):486–499

    Article  Google Scholar 

  • Yu Z, Wong HS, You J, Yang Q, Liao H (2011) Knowledge based cluster ensemble for cancer discovery from biomolecular data. IEEE Trans Nanobiosci 10(2):76–85

    Article  Google Scholar 

  • Yu Z, You J, Wong HS, Han G (2012) From cluster ensemble to structure ensemble. Inf Sci 198:81–99

    Article  MATH  Google Scholar 

  • Yu Z, Chen H, You J, Han G, Li L (2013) Hybrid fuzzy cluster ensemble framework for tumor clustering from biomolecular data. IEEE/ACM Trans Comput Biol Bioinf 10(3):657–670

    Article  Google Scholar 

  • Yu Z, Li L, Liu J, Han G (2015) Hybrid Adaptive Classifier Ensemble. IEEE Transactions on Cybernetics 45(2):177–190

    Article  Google Scholar 

  • Yu Z, Zhu X, Wong HS, You J, Zhang J, Han G (2016a) Distribution-based cluster structure selection. IEEE Trans Cybern 99:1–14. https://doi.org/10.1109/TCYB.2016.2569529

    Article  Google Scholar 

  • Yu Z, Chen H, Liu J, You J, Leung H, Han G (2016b) Hybrid k-nearest neighbor classifier. IEEE Trans Cybern 46(6):1263–1275

    Article  Google Scholar 

  • Yu Z, Lu Y, Zhang J, You J, Wong HS, Wang Y, Han G (2017) Progressive semisupervised learning of multiple classifiers. IEEE Trans Cybern 99:1–14

    Google Scholar 

  • Zhang S, Wong HS, Shen Y (2012) Generalized adjusted rand indices for cluster ensembles. Pattern Recognit 45(6):2214–2226

    Article  MATH  Google Scholar 

  • Zhao X, Liang J, Dang C (2017) Clustering ensemble selection for categorical data based on internal validity indices. Pattern Recognit 69:150–168

    Article  Google Scholar 

  • Zhong C, Yue X, Zhang Z, Lei J (2015) A clustering ensemble: two-level-refined co-association matrix with path-based transformation. Pattern Recogn 48(8):2699–2709

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Hamid Parvin or Hamid Alinejad-Rokny.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Pho, KH., Akbarzadeh, H., Parvin, H. et al. A multi-level consensus function clustering ensemble. Soft Comput 25, 13147–13165 (2021). https://doi.org/10.1007/s00500-021-06092-7

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-021-06092-7

Keywords

Navigation