A Hybrid Framework for Class-Imbalanced Classification

Chen, Rui; Luo, Lailong; Chen, Yingwen; Xia, Junxu; Guo, Deke

doi:10.1007/978-3-030-85928-2_24

Rui Chen¹¹,
Lailong Luo¹²,
Yingwen Chen¹¹,
Junxu Xia¹² &
…
Deke Guo¹²

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 12937))

Included in the following conference series:

International Conference on Wireless Algorithms, Systems, and Applications

1855 Accesses

Abstract

Data classification is a commonly used data processing method in the fields of networks and distributed systems, and it has attracted extensive attention in recent years. Nevertheless, the existing classification algorithms are mainly aimed at relatively balanced datasets, while the data in reality often exhibits imbalanced characteristics. In this paper, we propose a novel Hybrid Resampling-based Ensemble (HRE) model, which aims to solve the classification problem of highly skewed data. The main idea of the HRE is to leverage the resampling approach for tackling class imbalance, and then twelve classifiers are further adopted to construct an ensemble model. Besides, a novel combination of under-sampling and over-sampling is elaborately proposed to balance the heterogeneity among different data categories. We decide the resampling rate in an empirical manner, which provides a practical guideline for the use of sampling methods. We compare the effect of different resampling methods based on the imbalanced network anomaly detection dataset, where few abnormal data need to be distinguished from a large number of common network traffics. The results of extensive experiments show that the HRE model achieves better accuracy performance than the methods without hybrid resampling.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Tsai, C.F., Lin, W.C., Hu, Y.H., Yao, G.T.: Under-sampling class imbalanced datasets by combining clustering analysis and instance selection. Inf. Sci. 477, 47–54 (2019)
Article Google Scholar
Li, H.: A divide and conquer approach for imbalanced multi-class classification and its application to medical decision making. Pak. J. Pharm. Sci. 29 (2016)
Google Scholar
Mahajan, V., Misra, R., Mahajan, R.: Review of data mining techniques for churn prediction in telecom. J. Inf. Organ. Sci. 39(2), 183–197 (2015)
Google Scholar
Liu, Y., Wang, J., Niu, S., Song, H.: Deep learning enabled reliable identity verification and spoofing detection. In: Yu, D., Dressler, F., Yu, J. (eds.) WASA 2020. LNCS, vol. 12384, pp. 333–345. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59016-1_28
Chapter Google Scholar
Zafeiriou, S., Zhang, C., Zhang, Z.: A survey on face detection in the wild: past, present and future. Comput. Vis. Image Underst. 138, 1–24 (2015)
Article Google Scholar
West, J., Bhattacharya, M.: Intelligent financial fraud detection: a comprehensive review. Comput. Secur. 57, 47–66 (2016)
Article Google Scholar
Kang, S., Cho, S., Kang, P.: Constructing a multi-class classifier using one-against-one approach with different binary classifiers. Neurocomputing 149, 677–682 (2015)
Article Google Scholar
Luo, M., Wang, K., Cai, Z., Liu, A., Li, Y., Cheang, C.F.: Using imbalanced triangle synthetic data for machine learning anomaly detection. Comput. Mater. Continua 58(1), 15–26 (2019)
Article Google Scholar
Kubat, M., Holte, R.C., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Mach. Learn. 30(2), 195–215 (1998)
Article Google Scholar
Fawcett, T., Provost, F.: Adaptive fraud detection. Data Min. Knowl. Disc. 1(3), 291–316 (1997)
Article Google Scholar
Pelayo, L., Dick, S.: Applying novel resampling strategies to software defect prediction. In: NAFIPS 2007–2007 Annual Meeting of the North American Fuzzy Information Processing Society, pp. 69–72. IEEE (2007)
Google Scholar
Yijing, L., Haixiang, G., Xiao, L., Yanan, L., Jinling, L.: Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data. Knowl.-Based Syst. 94, 88–104 (2016)
Article Google Scholar
Tao, X., Peng, Y., Zhao, F., Wang, S.F., Liu, Z.: An improved parallel network traffic anomaly detection method based on bagging and GRU. In: Yu, D., Dressler, F., Yu, J. (eds.) WASA 2020. LNCS, vol. 12384, pp. 420–431. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59016-1_35
Chapter Google Scholar
Branco, P., Torgo, L., Ribeiro, R.P.: A survey of predictive modeling on imbalanced domains. ACM Comput. Surv. (CSUR) 49(2), 1–50 (2016)
Article Google Scholar
Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing, G.: Learning from class-imbalanced data: review of methods and applications. Expert Syst. Appl. 73, 220–239 (2017)
Article Google Scholar
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
Article Google Scholar
Han, H., Wang, W.-Y., Mao, B.-H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 878–887. Springer, Heidelberg (2005). https://doi.org/10.1007/11538059_91
Chapter Google Scholar
Young, W.A., Nykl, S.L., Weckman, G.R., Chelberg, D.M.: Using voronoi diagrams to improve classification performances when modeling imbalanced datasets. Neural Comput. Appl. 26(5), 1041–1054 (2015)
Article Google Scholar
Ando, S.: Classifying imbalanced data in distance-based feature space. Knowl. Inf. Syst. 46(3), 707–730 (2015). https://doi.org/10.1007/s10115-015-0846-3
Article Google Scholar
López, V., Del Río, S., Benítez, J.M., Herrera, F.: Cost-sensitive linguistic fuzzy rule based classification systems under the mapreduce framework for imbalanced big data. Fuzzy Sets Syst. 258, 5–38 (2015)
Article MathSciNet Google Scholar
Freund, Y.: Boosting a weak learning algorithm by majority. Inf. Comput. 121(2), 256–285 (1995)
Article MathSciNet Google Scholar
Chawla, N.V., Lazarevic, A., Hall, L.O., Bowyer, K.W.: SMOTEBoost: improving prediction of the minority class in boosting. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) PKDD 2003. LNCS (LNAI), vol. 2838, pp. 107–119. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-39804-2_12
Chapter Google Scholar
Seiffert, C., Khoshgoftaar, T.M., Van Hulse, J., Napolitano, A.: Rusboost: a hybrid approach to alleviating class imbalance. IEEE Trans. Syst. Man Cybern.-Part A Syst. Hum. 40(1), 185–197 (2009)
Article Google Scholar
Liu, X.Y., Wu, J., Zhou, Z.H.: Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 39(2), 539–550 (2008)
Google Scholar
Nanni, L., Fantozzi, C., Lazzarini, N.: Coupling different methods for overcoming the class imbalance problem. Neurocomputing 158, 48–61 (2015)
Article Google Scholar

Download references

Acknowledgement

The work is supported by the National Key Research and Development Program of China under grant 2018YFB0204301, the National Natural Science Foundation (NSF) under grant 62072306 and 62002378, Tianjin Science and Technology Foundation under Grant No.18ZXJMTG00290, Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory under grant 6142110200407.

Author information

Authors and Affiliations

College of Computer, National University of Defense Technology, Changsha, China
Rui Chen & Yingwen Chen
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
Lailong Luo, Junxu Xia & Deke Guo

Authors

Rui Chen
View author publications
You can also search for this author in PubMed Google Scholar
Lailong Luo
View author publications
You can also search for this author in PubMed Google Scholar
Yingwen Chen
View author publications
You can also search for this author in PubMed Google Scholar
Junxu Xia
View author publications
You can also search for this author in PubMed Google Scholar
Deke Guo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Lailong Luo or Yingwen Chen .

Editor information

Editors and Affiliations

Nanjing University of Aeronautics and Astronautics, Nanjing, China
Zhe Liu
Shanghai Jiao Tong University, Shanghai, China
Fan Wu
Missouri University of Science and Technology, Rolla, MO, USA
Sajal K. Das

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Chen, R., Luo, L., Chen, Y., Xia, J., Guo, D. (2021). A Hybrid Framework for Class-Imbalanced Classification. In: Liu, Z., Wu, F., Das, S.K. (eds) Wireless Algorithms, Systems, and Applications. WASA 2021. Lecture Notes in Computer Science(), vol 12937. Springer, Cham. https://doi.org/10.1007/978-3-030-85928-2_24

Download citation

DOI: https://doi.org/10.1007/978-3-030-85928-2_24
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85927-5
Online ISBN: 978-3-030-85928-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

A Hybrid Framework for Class-Imbalanced Classification