
Research on network abnormal data flow mining based on improved cluster analysis

Distributed and Parallel Databases

Abstract

Traditional methods adapt poorly to interference from noise or abnormal data, take a long time to mine data, and achieve low mining accuracy. To address these problems, a network abnormal data stream mining method based on improved clustering analysis is proposed. First, a preprocessing model for abnormal network data flow is established to enable real-time data flow queries. A classification model for abnormal incremental network data is then constructed to reduce the interference of noisy data with data processing. The least squares method is used to further filter interference data out of the abnormal incremental data and to obtain a quantized data stream. Frequent-pattern data sets of the network abnormal data are then compiled and, on this basis, an improved clustering method completes the mining of the network abnormal data stream. The experimental results show that the highest anti-noise coefficient of the proposed method is 0.7, that its data mining time is shorter, and that its data mining accuracy is higher, which verifies the data stream mining performance of the method.
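The least squares filtering step can be illustrated with a minimal sketch. The abstract does not give the paper's actual model, so everything here is an assumption made for illustration: the stream is treated as (time, value) pairs, a straight line is fitted by ordinary least squares, and a point counts as interference when its residual exceeds a hypothetical threshold of `k` residual standard deviations. The function names and the sample data are likewise invented, and the improved clustering stage is not shown.

```python
from statistics import mean, pstdev


def least_squares_line(ts, xs):
    """Fit x = a*t + b by ordinary least squares and return (a, b)."""
    t_bar, x_bar = mean(ts), mean(xs)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (x - x_bar) for t, x in zip(ts, xs))
    a = sxy / sxx
    return a, x_bar - a * t_bar


def filter_interference(ts, xs, k=2.0):
    """Keep the points whose residual is within k residual standard deviations.

    The threshold rule is an assumption for this sketch, not the paper's rule.
    """
    a, b = least_squares_line(ts, xs)
    residuals = [x - (a * t + b) for t, x in zip(ts, xs)]
    sigma = pstdev(residuals)
    return [(t, x) for t, x, r in zip(ts, xs, residuals) if abs(r) <= k * sigma]


# Hypothetical stream: values follow x = 2*t + 1, with one spike at t = 4.
times = list(range(10))
values = [2 * t + 1 for t in times]
values[4] += 50

filtered = filter_interference(times, values)
print(filtered)
```

On this sample the spike at t = 4 is the only point removed. A single global line is the simplest possible choice; a real stream would more plausibly need a sliding window or a higher-order fit, which this sketch leaves out.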

Author information

Corresponding author

Correspondence to Xiaoqiang Jia.

About this article

Cite this article

Jia, X. Research on network abnormal data flow mining based on improved cluster analysis. Distrib Parallel Databases 40, 797–813 (2022). https://doi.org/10.1007/s10619-021-07353-y

  • DOI: https://doi.org/10.1007/s10619-021-07353-y
