Network anomaly detection based on selective ensemble algorithm

The Journal of Supercomputing

Abstract

To reduce the loss of majority-class information during resampling, this paper proposes a two-level selective ensemble learning algorithm for imbalanced datasets (TLSE-ID) that combines the class distribution of the samples with the characteristics of ensemble learning. First, the algorithm under-samples the majority class to construct multiple training subsets. On each subset, AdaBoost generates multiple base classifiers; a subset of them is selected according to the maximum-correlation, minimum-redundancy criterion and combined by weighted integration into a sub-classifier. Repeating this over all training subsets yields multiple sub-classifiers, from which a subset is again selected by the same criterion. The weights of the selected sub-classifiers are then computed from the F-measure or G-mean, and the final ensemble classifier is obtained by weighted voting. Experimental results on UCI datasets show that the method improves classification performance to a certain extent, especially on imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for the Internet of Things: simulation results on the KDDCUP99 dataset show that TLSE-ID achieves a low missed-detection rate and high precision.
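
As a rough illustration of two steps named in the abstract, the sketch below builds balanced training subsets by under-sampling the majority class and combines classifiers by weighted voting. The weights could be either of the two scores defined here: the G-mean (square root of recall times specificity) or the F-measure (harmonic mean of precision and recall). This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the label convention (1 = minority, 0 = majority), the balanced 1:1 subset size and the tie rule are all hypothetical. AdaBoost training and the classifier selection step are omitted.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Assumed convention: label 1 is the minority class, label 0 the majority class.
Sample = Tuple[Sequence[float], int]


def balanced_subsets(data: List[Sample], n_subsets: int, seed: int = 0) -> List[List[Sample]]:
    """Keep all minority samples and add an equally sized random draw of majority samples."""
    rng = random.Random(seed)
    minority = [s for s in data if s[1] == 1]
    majority = [s for s in data if s[1] == 0]
    k = min(len(minority), len(majority))
    return [minority + rng.sample(majority, k) for _ in range(n_subsets)]


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) with the minority class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def g_mean(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Geometric mean of recall (minority accuracy) and specificity (majority accuracy)."""
    tp, fn, fp, tn = confusion(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall * specificity)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall on the minority class."""
    tp, fn, fp, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def weighted_vote(classifiers: Sequence[Callable[[Sequence[float]], int]],
                  weights: Sequence[float], x: Sequence[float]) -> int:
    """Predict 1 when classifiers voting 1 hold more than half of the total weight."""
    positive = sum(w for clf, w in zip(classifiers, weights) if clf(x) == 1)
    return 1 if 2 * positive > sum(weights) else 0
```

In such a setup, each classifier's weight would typically be its G-mean or F-measure on held-out data, so that a classifier that ignores the minority class (recall near zero) contributes little to the vote.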

Acknowledgements

This work was supported by the Natural Science Foundation Research Project of Shaanxi Province Foundation of China (No. 2019KRM095); Science and Technology Plan Project of Shangluo of China (No. SK2019-84); Science and Technology Research Project of Shangluo University (No. 18SKY014); Science and Technology Innovation Team Building Project of Shangluo University (No. 18SCX002); and Key Discipline Construction Project of Shangluo University, Subject Name: Mathematics.

Author information

Corresponding author

Correspondence to Hongle Du.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Du, H., Zhang, Y. Network anomaly detection based on selective ensemble algorithm. J Supercomput 77, 2875–2896 (2021). https://doi.org/10.1007/s11227-020-03374-z

  • DOI: https://doi.org/10.1007/s11227-020-03374-z
