Abstract
Several boosting algorithms have recently been proposed to improve classification performance on imbalanced data. In this paper, we present an improved ADABoost algorithm for imbalanced data, called Im.ADABoost, with two main improvements: (i) initializing the error weights differently according to the imbalance rate of the dataset; (ii) calculating the confidence weight of each member classifier so that it is sensitive to the total error made on the positive label. Additionally, we combine Im.ADABoost with Weighted-SVM to enhance classification efficiency on imbalanced datasets. Our experimental results indicate that the proposed algorithm is promising.
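The abstract does not state the formulas behind the two improvements, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a class-balanced initialization (each class receives half of the total weight) as one plausible imbalance-aware scheme, together with the standard AdaBoost confidence and weight update. The function names `init_weights`, `classifier_confidence`, and `update_weights` are hypothetical, and none of this should be read as the Im.ADABoost definition.

```python
import math

def init_weights(labels):
    """Class-balanced initial weights: each class receives half of the total mass.

    Assumption: one common imbalance-aware scheme, not the paper's formula.
    Labels are +1 (positive, minority) and -1 (negative, majority).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return [0.5 / n_pos if y == 1 else 0.5 / n_neg for y in labels]

def classifier_confidence(weights, labels, predictions):
    """Standard AdaBoost confidence: alpha = 0.5 * ln((1 - eps) / eps)."""
    eps = sum(w for w, y, p in zip(weights, labels, predictions) if y != p)
    eps = min(max(eps, 1e-12), 1 - 1e-12)  # keep the logarithm finite
    return 0.5 * math.log((1 - eps) / eps)

def update_weights(weights, labels, predictions, alpha):
    """Multiplicative AdaBoost update followed by renormalisation."""
    raw = [w * math.exp(-alpha * y * p)
           for w, y, p in zip(weights, labels, predictions)]
    total = sum(raw)
    return [w / total for w in raw]

# Toy data: one positive example and four negatives.
labels = [1, -1, -1, -1, -1]
# A member classifier that ignores the positive class entirely.
predictions = [-1, -1, -1, -1, -1]

uniform = [1.0 / len(labels)] * len(labels)
alpha_uniform = classifier_confidence(uniform, labels, predictions)

weights = init_weights(labels)
alpha_balanced = classifier_confidence(weights, labels, predictions)
new_weights = update_weights(weights, labels, predictions, alpha_balanced)
```

With uniform weights, the classifier that always predicts the negative label has weighted error 0.2 and receives a positive confidence. With the class-balanced initialization its weighted error is 0.5, so its confidence is zero and it contributes nothing to the ensemble. This illustrates why adapting the initial weights to the imbalance rate can matter.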
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Quang, V.D., Khang, T.D., Huy, N.M. (2021). Improving ADABoost Algorithm with Weighted SVM for Imbalanced Data Classification. In: Dang, T.K., Küng, J., Chung, T.M., Takizawa, M. (eds) Future Data and Security Engineering. FDSE 2021. Lecture Notes in Computer Science, vol 13076. Springer, Cham. https://doi.org/10.1007/978-3-030-91387-8_9
DOI: https://doi.org/10.1007/978-3-030-91387-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91386-1
Online ISBN: 978-3-030-91387-8
eBook Packages: Computer Science, Computer Science (R0)