Abstract
The ensemble clustering tries to aggregate a number of basic clusterings with the aim of producing a more consistent, robust and well-performing consensus clustering result. The current paper wants to introduce an ensemble clustering method. The proposed method, called consensus function based on two level clustering (CFTLC), introduces a new consensus clustering where it makes a cluster clustering task through applying an average hierarchical clustering on a cluster–cluster similarity matrix obtained by an innovative similarity metric. By applying the average hierarchical clustering algorithm, a set of meta clusters has been attained. Considering each meta cluster as a consensus cluster in the consensus clustering output, it then assigns each data point to a meta cluster through defining an object-cluster similarity. Before doing anything, CFTLC converts the primary partitions into a binary cluster representation where the primary ensemble has been broken into a number of basic binary clusters (BC). CFTLC first combines the basic BCs with the maximum cluster–cluster similarity. This step is iterated as long as a predefined number of meta clusters are ready. At the subsequent step, it assigns each data point to exactly one meta cluster. The proposed method has been experimentally compared with the state of the art clustering algorithms in terms of accuracy and robustness.








Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Akbari E, Mohamed Dahlan H, Ibrahim R, Alizadeh H (2015) Hierarchical cluster ensemble selection. Eng Appl AI 39:146–156
AleAhmad A, Amiri H, Darrudi E, Rahgozar M, Oroumchian F (2009) Hamshahri: a standard Persian text collection. J Knowl Based Syst 22(5):382–387
Alizadeh H, Minaei-Bidgoli B, Parvin H (2011a) A new criterion for clusters validation. In: Artificial intelligence applications and innovations (AIAI 2011), IFIP. Springer, Heidelberg, Part I, pp 240–246
Alizadeh H, Minaei-Bidgoli B, Parvin H, Moshki M (2011b) An asymmetric criterion for cluster validation, developing concepts in applied intelligence. Stud Comput Intell 363:1–14
Alizadeh H, Minaei-Bidgoli B, Parvin H (2013) “Optimizing fuzzy cluster ensemble in string representation. IJPRAI 27(2):1350005
Alizadeh H, Minaei-Bidgoli B, Parvin H (2014a) Cluster ensemble selection based on a new cluster stability measure. Intell Data Anal 18(3):389–408
Alizadeh H, Minaei-Bidgoli B, Parvin H (2014b) To improve the quality of cluster ensembles by selecting a subset of base clusters. J Exp Theor Artif Intell 26(1):127–150
Alizadeh H, Yousefnezhad M, Minaei-Bidgoli B (2015) Wisdom of crowds cluster ensemble. Intell Data Anal 19(3):485–503
Alqurashi T, Wang W (2014) Object-neighborhood clustering ensemble method. In: Intelligent data engineering and automated learning (IDEAL). Springer, pp 142–149
Alqurashi T, Wang W (2015) A new consensus function based on dual-similarity measurements for clustering ensemble. In: International conference on data science and advanced analytics (DSAA). IEEE/ACM, pp 149–155
Alsaaideh B, Tateishi R, Phong DX, Hoan NT, Al-Hanbali A, Xiulian B (2017) New urban map of Eurasia using MODIS and multi-source geospatial data. Geo Spat Inf Sci 20(1):29–38
Ayad HG, Kamel MS (2008) Cumulative voting consensus method for partitions with a variable number of clusters. IEEE Trans Pattern Anal Mach Intell 30(1):160–173
Bai L, Cheng X, Liang J, Guo Y (2017) Fast graph clustering with a new description model for community detection. Inf Sci 388–389:37–47
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Chakraborty D, Singh S, Dutta D (2017) Segmentation and classification of high spatial resolution images based on Hölder exponents and variance. Geo Spat Inf Sci 20(1):39–45
Deng Q, Wu S, Wen J, Xu Y (2018) Multi-level image representation for large-scale image-based instance retrieval. CAAI Trans Intell Technol 3(1):33–39
Derakhshani RR (2011) An ensemble method for classifying startle eyeblink modulation from high-speed video records. IEEE Trans Affect Comput 2(1):50–63
Dimitriadou E, Weingessel A, Hornik K (2002) A combination scheme for fuzzy clustering. Int J Pattern Recognit Artif Intell 16(07):901–912
Domeniconi C, Al-Razgan M (2009) Weighted cluster ensembles: methods and analysis. ACM TKDD 2(4):1–42
Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, Hoboken
Dueck D (2009) Affinity propagation: clustering data by passing messages. Ph.D. dissertation, University of Toronto
Faceli K, Marcilio CP, Souto D (2006) Multi-objective clustering ensemble. In: Proceedings of the sixth international conference on hybrid intelligent systems
Fern XZ, Brodley CE (2003) Random projection for high dimensional data clustering: a cluster ensemble approach. In: Proceedings of the 20th international conference on machine learning, pp 186–193. http://www.aaai.org/Papers/ICML/2003/ICML03-027.pdf
Fern XZ, Brodley CE (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the 21st international conference on machine learning, ACM, p 36
Franek L, Jiang X (2014) Ensemble clustering by means of clustering embedding in vector spaces. Pattern Recogn 47(2):833–842
Fred A, Jain AK (2002) Data clustering using evidence accumulation. In: International conference on pattern recognition, ICPR02, Quebec City, pp 276–280
Fred A, Jain AK (2005) Combining multiple clustering’s using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27(6):835–850
Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and an application to boosting. Computational learning theory, pp 119–139
Friedman JH (2011) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Ghaemi R, bin Sulaiman N, Ibrahim H, Mustapha N (2011) A review: accuracy optimization in clustering ensembles using genetic algorithms. Artif Intell Rev 35(4):287–318
Ghosh J, Acharya A (2011) Cluster ensembles. Data Min Knowl Disc 1(4):305–315
Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM TKDD 1(1):4
Hanczar B, Nadif M (2012) Ensemble methods for biclustering tasks. Pattern Recogn 45(11):3938–3949
Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, vol 1, pp 278–282
Hong Y, Kwong S, Chang Y, Ren Q (2008) Unsupervised feature selection using clustering ensembles and population based incremental learning algorithm. Pattern Recogn 41(9):2742–2756
Houle ME (2008) The relevant-set correlation model for data clustering. Stat Anal Data Min 1(3):157–176
Huang D, Lai JH, Wang CD (2015) Combining multiple clusterings via crowd agreement estimation and multi-granularity link analysis. Neurocomputing 170:240–250
Huang D, Lai J, Wang CD (2016a) Ensemble clustering using factor graph. Pattern Recogn 50:131–142
Huang D, Lai J, Wang CD (2016b) Robust ensemble clustering using probability trajectories. IEEE Trans Knowl Data Eng 28(5):1312–1326
Huang D, Wang CD, Lai JH (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 99(1):1–14. https://doi.org/10.1109/TCYB.2017.2702343
Iam-On N, Boongoen T, Garrett SM (2008) Refining pairwise similarity matrix for cluster ensemble problem with cluster relations. Discovery Science, pp 222–233
Iam-On N, Boongoen T, Garrett S (2010) LCE: a link-based cluster ensemble method for improved gene expression data analysis. Bioinformatics 26(12):1513–1519
Iam-On N, Boongoen T, Garrett S, Price C (2011) A link based approach to the cluster ensemble problem. IEEE Trans Pattern Anal Mach Intell 33(12):2396–2409
Iam-On N, Boongeon T, Garrett S, Price C (2012) A link based cluster ensemble approach for categorical data clustering. IEEE Trans Knowl Data Eng 24(3):413–425
Jiang Y, Chung FL, Wang S, Deng Z, Wang J, Qian P (2015) Collaborative fuzzy clustering from multiple weighted views. IEEE Trans Cybern 45(4):688–701
Li C, Zhang Y, Tu W et al (2017) Soft measurement of wood defects based on LDA feature fusion and compressed sensor images. J For Res 28(6):1285–1292
Liang JY, Shi QY, Zhao XW (2018) Multi-view data ensemble clustering: a cluster-level perspective. Int J Mach Intell Sens Sig Process 2(2):97–120
Ma J, Jiang X, Gong M (2018) Two-phase clustering algorithm with density exploring distance measure. CAAI Trans Intell Technol 3(1):59–64
Mimaroglu S, Aksehirli E (2012) DICLENS: divisive clustering ensemble with automatic cluster number. IEEE ACM TCBB 9(2):408–420
Minaei-Bidgoli B, Topchy A, Punch WF (2004) Ensembles of partitions via data resampling. In: International conference on information technology, ITCC 04, Las Vegas, pp 188–192
Mirzaei A, Rahmati M (2010) A novel hierarchical-clustering-combination scheme based on fuzzy-similarity relations. IEEE Trans Fuzzy Syst 18(1):27–39
Mirzaei A, Rahmati M, Ahmadi M (2008) A new method for hierarchical clustering combination. Intell Data Anal 12(6):549–571
Naldi MC, De Carvalho ACM, Campello RJ (2013) Cluster ensemble selection based on relative validity indexes. Data Min Knowl Disc 27(2):259–289
Nazari A, Dehghan A, Nejatian S, Rezaie V, Parvin H (2017) A comprehensive study of clustering ensemble weighting based on cluster quality and diversity. Pattern Anal Appl. https://doi.org/10.1007/s10044-017-0676-x
Newman CBDJ, Hettich SS, Merz C (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLSummary.html
Parvin H, Minaei-Bidgoli B (2015) A clustering ensemble framework based on selection of fuzzy weighted clusters in a locally adaptive clustering algorithm. Pattern Anal Appl 18(1):87–112
Parvin H, Alinejad-Rokny H, Minaei-Bidgoli B, Parvin S (2013a) A new classifier ensemble methodology based on subspace learning. J Exp Theor Artif Intell 25(2):227–250
Parvin H, Minaei-Bidgoli B, Alinejad-Rokny H, Punch WF (2013b) Data weighing mechanisms for clustering ensembles. Comput Electr Eng 39(5):1433–1450
Pattanasri N (2012) Learning to estimate slide comprehension in classrooms with support vector machines. IEEE Trans Learn Technol 5(1):52–61
Rafiee G, Dlay SS, Woo WL (2013) Region-of-interest extraction in low depth of field images using ensemble clustering and difference of Gaussian approaches. Pattern Recogn 46(10):2685–2699
Ren Y, Zhang G, Domeniconi C, Yu G (2013) Weighted object ensemble clustering. In: Proceedings of the IEEE 13th international conference on data mining (ICDM). IEEE, pp 627–636
Roth V, Lange T, Braun M, Buhmann J (2002) A resampling approach to cluster validation. In: International conference on computational statistics, COMPSTAT
Shahriari A, Parvin H, Monajati A (2015) Exploring weights of hierarchical and equivalency relationship in general persian texts. EANN Workshops 7(1):7
Song XP, Huang C, Townshend JR (2017) Improving global land cover characterization through data fusion. Geo Spat Inf Sci 20(2):141–150
Soto V, Garcia-Moratilla S, Martinez-Munoz G, Hernandez-Lobato D, Suarez A (2014) A double pruning scheme for boosting ensembles. IEEE Trans Cybern 44(12):2682–2695
Strehl A, Ghosh J (2000) Value-based customer grouping from large retail data sets. In: AeroSense, international society for optics and Photonics, pp 33–42
Strehl A, Ghosh J (2003) Cluster ensembles—a knowledge reuse framework for multiple partitions. J Mach Learn Res 3:583–617
Topchy AP, Jain AK, Punch WF (2003) Combining multiple weak clusterings. In: IEEE international conference on data mining, pp 331–338
Topchy A, Jain AK, Punch W (2005) A mixture model of clustering ensembles. In: Proceedings of the SIAM international conference of data mining, Citeseer, vol 27, no 12, pp 1866–1881
Vinh NX, Houle ME (2010) A set correlation model for partitional clustering. In: Advances in knowledge discovery and data mining. Springer, pp 4–15
Wagner J (2011) Exploring fusion methods for multimodal emotion recognition with missing data. IEEE Trans Affect Comput 2(4):206–218
Wang B, Zhang J, Liu Y, Zou Y (2017) Density peaks clustering based integrate framework for multi-document summarization. CAAI Trans Intell Technol 2(1):26–30
Wu CH (2011) Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Trans Affect Comput 2(1):10–21
Yang Y, Jiang J (2016) Hybrid sampling-based clustering ensemble with global and local constitutions. IEEE Trans Neural Netw Learn Syst 27(5):952–965
Yang H, Yu L (2017) Feature extraction of wood-hole defects using wavelet-based ultrasonic testing. J For Res 28(2):395–402
Yi J, Yang T, Jin R, Jain AK, Mahdavi M (2012) Robust ensemble clustering by matrix completion. In: Proceedings of the IEEE 12th international conference on data mining (ICDM). IEEE, pp 1176–1181
Yousefnezhad M, Huang SJ, Zhang D (2018) WoCE: a framework for clustering ensemble by exploiting the wisdom of crowds theory. IEEE Trans Cybern 48(2):486–499
Yu Z, Wong HS, You J, Yang Q, Liao H (2011) Knowledge based cluster ensemble for cancer discovery from biomolecular data. IEEE Trans Nanobiosci 10(2):76–85
Yu Z, You J, Wong HS, Han G (2012) From cluster ensemble to structure ensemble. Inf Sci 198:81–99
Yu Z, Chen H, You J, Han G, Li L (2013) Hybrid fuzzy cluster ensemble framework for tumor clustering from biomolecular Data. IEEE ACM Trans Comput Biol Bioinform 10(3):657–670
Yu Z, Li L, Liu J, Han G (2015) Hybrid adaptive classifier ensemble. IEEE Trans Cybern 45(2):177–190
Yu Z, Chen H, Liu J, You J, Leung H, Han G (2016a) Hybrid k-nearest neighbor classifier. IEEE Trans Cybern 46(6):1263–1275
Yu Z, Zhu X, Wong HS, You J, Zhang J, Han G (2016b) Distribution- based cluster structure selection. IEEE Trans Cybern 99(1):1–14. https://doi.org/10.1109/TCYB.2016.2569529
Yu Z, Lu Y, Zhang J, You J, Wong HS, Wang Y, Han G (2017) Progressive semisupervised learning of multiple classifiers. IEEE Trans Cybern 99(1):1–14
Zhang S, Wong HS, Shen Y (2012) Generalized adjusted rand indices for cluster ensembles. Pattern Recogn 45(6):2214–2226
Zhao X, Liang J, Dang C (2017) Clustering ensemble selection for categorical data based on internal validity indices. Pattern Recogn 69:150–168
Zhao XW, Cao FY, Liang JY (2018) A sequential ensemble clusterings generation algorithm for mixed data. Appl Math Comput 335:264–277
Zhong C, Yue X, Zhang Z, Lei J (2015) A clustering ensemble: two-level-refined co-association matrix with path-based transformation. Pattern Recogn 48(8):2699–2709
Acknowledgements
This study is supported by grants to HAR and HP. HAR is supported by UNSW Scientia Program Fellowship and is a member of the UNSW Graduate School of Biomedical Engineering.
Author information
Authors and Affiliations
Contributions
HP and HAR designed the study. HP and HAR developed hypothesis and experiments. MRM and HP wrote the manuscript; MRM, HP, and HA edited the manuscript with help from HAR; MRM, HP, HA, SN, and VR carried out the analyses, the implementation of the codes, and the statistical analyses; MRM, HP, and HA generated all figures and tables. All authors have read and approved the final version of the paper.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Mahmoudi, M.R., Akbarzadeh, H., Parvin, H. et al. Consensus function based on cluster-wise two level clustering. Artif Intell Rev 54, 639–665 (2021). https://doi.org/10.1007/s10462-020-09862-1
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10462-020-09862-1