Consensus function based on cluster-wise two level clustering

Artificial Intelligence Review

Abstract

Ensemble clustering aggregates a number of base clusterings to produce a more consistent, robust, and better-performing consensus clustering. This paper introduces an ensemble clustering method called consensus function based on two level clustering (CFTLC). CFTLC first converts the primary partitions into a binary cluster representation, breaking the primary ensemble into a number of basic binary clusters (BCs). It then clusters these clusters: it applies average-linkage hierarchical clustering to a cluster–cluster similarity matrix obtained with a new similarity metric, repeatedly combining the BCs with the maximum cluster–cluster similarity until a predefined number of meta clusters is reached. Each meta cluster is treated as one consensus cluster in the output, and in the final step each data point is assigned to exactly one meta cluster according to an object–cluster similarity. The proposed method has been experimentally compared with state-of-the-art clustering algorithms in terms of accuracy and robustness.
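
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not define the paper's cluster–cluster similarity metric or its object–cluster similarity, so this sketch substitutes Jaccard similarity for the former and, for the latter, the fraction of a meta cluster's binary clusters that contain the point. Both are assumptions, as are all function names; this is not the authors' implementation.

```python
import numpy as np

def to_binary_clusters(partitions):
    """Break every base partition into binary clusters (one row per cluster)."""
    rows = []
    for labels in partitions:
        labels = np.asarray(labels)
        for c in np.unique(labels):
            rows.append(labels == c)
    return np.array(rows, dtype=float)  # shape: (total_clusters, n_points)

def jaccard_matrix(B):
    """Stand-in cluster-cluster similarity (ASSUMPTION: not the paper's metric)."""
    inter = B @ B.T
    sizes = B.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter
    return inter / union  # every cluster is non-empty, so union > 0

def average_linkage(S, k):
    """Merge the two most similar groups until k meta clusters remain."""
    groups = [[i] for i in range(len(S))]
    while len(groups) > k:
        best, pair = -1.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Average linkage: mean similarity over all cross-group pairs.
                s = S[np.ix_(groups[a], groups[b])].mean()
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

def cftlc_sketch(partitions, k):
    """Return one consensus label per data point."""
    B = to_binary_clusters(partitions)
    groups = average_linkage(jaccard_matrix(B), k)
    # ASSUMED object-cluster similarity: share of the meta cluster's
    # binary clusters that contain the point.
    scores = np.array([B[g].mean(axis=0) for g in groups])
    # Each point goes to exactly one meta cluster (ties go to the lowest index).
    return scores.argmax(axis=0)

if __name__ == "__main__":
    base = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same grouping, different label names
        [0, 0, 1, 1, 1, 1],  # disagrees on the third point
    ]
    print(cftlc_sketch(base, k=2))
```

Because the first step works on binary clusters and not on label values, base partitions that use different label names for the same grouping reinforce each other; no label alignment step is needed before merging.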

Acknowledgements

This study is supported by grants to HAR and HP. HAR is supported by a UNSW Scientia Program Fellowship and is a member of the UNSW Graduate School of Biomedical Engineering.

Author information

Contributions

HP and HAR designed the study and developed the hypothesis and experiments. MRM and HP wrote the manuscript; MRM, HP, and HA edited the manuscript with help from HAR; MRM, HP, HA, SN, and VR implemented the code and carried out the analyses, including the statistical analyses; MRM, HP, and HA generated all figures and tables. All authors have read and approved the final version of the paper.

Corresponding authors

Correspondence to Hamid Parvin or Hamid Alinejad-Rokny.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Mahmoudi, M.R., Akbarzadeh, H., Parvin, H. et al. Consensus function based on cluster-wise two level clustering. Artif Intell Rev 54, 639–665 (2021). https://doi.org/10.1007/s10462-020-09862-1

  • DOI: https://doi.org/10.1007/s10462-020-09862-1
