
Classifier subset selection based on classifier representation and clustering ensemble

Applied Intelligence

Abstract

Ensemble pruning can improve the performance and reduce the storage requirements of an ensemble system. Most ensemble pruning approaches remove low-quality or redundant classifiers by evaluating the classifiers’ competence and mutual relationships through their predictions. However, how best to represent classifiers and create ensemble diversity remains an open problem in the ensemble pruning field. To address this issue, we discuss whether properties other than predictions can represent classifiers and propose a new classifier selection method, classifier-representation- and clustering-ensemble-based ensemble pruning (CRCEEP). In the proposed method, two new classifier-representation-learning methods, based on the local space and on relative transformation, are proposed to obtain more information about classifiers. CRCEEP incorporates a clustering ensemble method to group classifiers and prune redundant learners. Finally, accurate and diverse classifiers are integrated to improve classification performance. Extensive experiments were carried out on UCI datasets, and the results verify CRCEEP’s effectiveness and the necessity of classifier representation.
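The abstract gives only a high-level description of the pipeline. As a rough illustration of the general idea of clustering-based ensemble pruning (a minimal sketch, not the authors’ CRCEEP algorithm, whose local-space and relative-transformation representations and clustering-ensemble step are more elaborate), the following Python snippet represents each base classifier by its prediction vector on a validation set, clusters those representations, and keeps the most accurate member of each cluster. The dataset, pool size, and cluster count are illustrative assumptions.

```python
# Minimal sketch of clustering-based ensemble pruning (NOT the exact
# CRCEEP algorithm): classifiers are represented by their prediction
# vectors on a validation set, grouped by k-means, and the most
# accurate member of each cluster is kept for the final majority vote.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# Train a pool of base classifiers (bagged decision trees).
pool = BaggingClassifier(DecisionTreeClassifier(),
                         n_estimators=50, random_state=0)
pool.fit(X_train, y_train)

# Represent each classifier by its predictions on the validation set.
reps = np.array([est.predict(X_val) for est in pool.estimators_])

# Group similar classifiers; one cluster per desired ensemble member.
n_keep = 10
labels = KMeans(n_clusters=n_keep, n_init=10,
                random_state=0).fit_predict(reps)

# From each cluster, keep the classifier with the best validation accuracy,
# so the pruned ensemble stays accurate while redundant members are dropped.
selected = []
for c in range(n_keep):
    members = np.where(labels == c)[0]
    accs = [(reps[i] == y_val).mean() for i in members]
    selected.append(pool.estimators_[members[np.argmax(accs)]])

# Majority vote of the pruned, diverse sub-ensemble on the test set.
votes = np.array([clf.predict(X_test) for clf in selected])
pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("pruned-ensemble accuracy:", (pred == y_test).mean())
```

Keeping one representative per cluster enforces diversity (classifiers within a cluster behave alike) while the per-cluster accuracy criterion preserves competence; CRCEEP pursues the same two goals with richer classifier representations and a clustering ensemble rather than a single clustering.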



Data Availability Statement

The datasets generated and/or analyzed during the current study are available in the UCI repository, http://archive.ics.uci.edu/ml/.
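As a quick illustration of working with this repository, a dataset can be read directly from a UCI URL with pandas; the iris data file and its documented attribute names are used here purely as a well-known example, not necessarily one of the datasets used in the paper.

```python
# Load a UCI dataset directly from the repository (iris shown as an example).
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
df = pd.read_csv(url, header=None, names=cols)
print(df.head())
```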


Acknowledgements

This research was financially supported by the National Natural Science Foundation of China (Grant No. 62063002).

Author information

Corresponding author

Correspondence to Danyang Li.

Ethics declarations

Statements and declarations

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, D., Zhang, Z. & Wen, G. Classifier subset selection based on classifier representation and clustering ensemble. Appl Intell 53, 20730–20752 (2023). https://doi.org/10.1007/s10489-023-04572-x

