
An adaptive boosting algorithm based on weighted feature selection and category classification confidence

Published in Applied Intelligence

Abstract

Adaptive boosting (Adaboost) is a typical ensemble learning algorithm that has been widely studied and applied to classification tasks. Traditional Adaboost algorithms ignore the sample weights when selecting the most useful features, and most of them also ignore the fact that the performance of a weak classifier usually differs across categories. To address these issues, this paper proposes an Adaboost algorithm based on weighted feature selection and category classification confidence. The first contribution is a weighted feature selection method that selects the features which best distinguish both the majority of all samples and the previously misclassified samples. The second contribution is a category-based error rate calculation that improves the traditional error rate and combines the classification abilities of Adaboost on different categories. A detailed performance comparison of various Adaboost algorithms is carried out on eight typical datasets. The experimental results show that the proposed algorithm achieves significant improvements in classification accuracy over typical Adaboost algorithms, especially on unbalanced datasets.
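
The paper's exact formulas are not reproduced on this page, so the following is a minimal Python sketch of the two ideas as described in the abstract: a feature-scoring step that uses the current boosting weights (so previously misclassified, high-weight samples influence which features are selected), and a per-category error rate that is averaged across classes before computing the learner's coefficient. The function names, the weighted class-mean scoring criterion, and the plain averaging over classes are illustrative assumptions, not the authors' formulation.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def weighted_feature_scores(X, y, w):
        """Score each feature by a weight-aware separation criterion
        (here: gap between the weighted class means, for binary y in {0, 1}).
        High-weight samples pull selection toward features that split them."""
        scores = np.empty(X.shape[1])
        for j in range(X.shape[1]):
            m0 = np.average(X[y == 0, j], weights=w[y == 0])
            m1 = np.average(X[y == 1, j], weights=w[y == 1])
            scores[j] = abs(m0 - m1)
        return scores

    def category_error(y, y_pred, w):
        """Average the weighted error rate over categories instead of pooling
        all samples, so failing on a minority class is not masked."""
        errs = []
        for c in np.unique(y):
            mask = y == c
            errs.append(np.sum(w[mask] * (y_pred[mask] != c)) / np.sum(w[mask]))
        return np.mean(errs)

    def boost(X, y, rounds=10, k=5):
        n = len(y)
        w = np.full(n, 1.0 / n)          # uniform initial sample weights
        learners = []
        for _ in range(rounds):
            # pick the k features that best separate the *weighted* samples
            feats = np.argsort(weighted_feature_scores(X, y, w))[-k:]
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X[:, feats], y, sample_weight=w)
            pred = stump.predict(X[:, feats])
            # category-based error replaces the pooled Adaboost error
            err = np.clip(category_error(y, pred, w), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)
            # standard exponential reweighting toward misclassified samples
            w *= np.exp(-alpha * np.where(pred == y, 1.0, -1.0))
            w /= w.sum()
            learners.append((alpha, feats, stump))
        return learners

    def predict(learners, X):
        agg = sum(a * np.where(m.predict(X[:, f]) == 1, 1, -1)
                  for a, f, m in learners)
        return (agg > 0).astype(int)

As a quick usage check on a synthetic unbalanced problem: X, y = sklearn.datasets.make_classification(n_samples=200, n_features=20, weights=[0.8, 0.2], random_state=0); model = boost(X, y, rounds=20); np.mean(predict(model, X) == y). Averaging the error over categories raises alpha only for learners that do well on every class, which is the intuition behind the claimed gains on unbalanced datasets.
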



Acknowledgments

This research is supported by the Humanities and Social Science project of the Ministry of Education (No. 19YJCZH178), the National Natural Science Foundation of China (Nos. 61906220 and 61672104), the National Social Science Foundation of China (No. 18CTJ008), the Natural Science Foundation of Tianjin Province (No. 18JCQNJC69600), and the National Key R&D Program of China (No. 2017YFB1400700).

Author information


Corresponding author

Correspondence to Lizhou Feng.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, Y., Feng, L. An adaptive boosting algorithm based on weighted feature selection and category classification confidence. Appl Intell 51, 6837–6858 (2021). https://doi.org/10.1007/s10489-020-02184-3

