
FlowSpectrum: a concrete characterization scheme of network traffic behavior for anomaly detection

World Wide Web

Abstract

As 5G rolls out around the world, many edge applications will be deployed by app vendors and accessed by massive numbers of end users, and efficient detection of malicious network behavior is receiving increasing attention. Current traffic detection work is still stuck on the analysis of high-dimensional data, which restricts improvements in threat monitoring and network governance when facing massive network flows. Characterizing network flows within simple domains is required to simplify network analysis. Traffic characterization is a key task that allows service providers to detect and intercept anomalous traffic, so that high QoS (Quality of Service) and service availability are maintained and the spread of malicious content is prevented. Unfortunately, research on the concrete characterization of network data is still lacking. By analogy with the spectrum, in this paper we propose, for the first time, the concept of FlowSpectrum to represent network flows concretely. In FlowSpectrum, a network flow is represented as a spectral line rather than as raw data or a feature vector. All flows can be mapped to spectral lines, and traffic identification is achieved by analyzing the positions of the spectral lines. FlowSpectrum can significantly reduce the complexity of network traffic behavior analysis while enhancing the interpretability of detection and facilitating cyberspace behavior management. We design a neural network structure based on a semi-supervised AutoEncoder for the decomposition and dimensionality reduction of network flows in FlowSpectrum. Thorough experiments demonstrate the characterization capability of FlowSpectrum. Moreover, we establish a preliminary correspondence between network behaviors and intervals of spectral lines. Overall, FlowSpectrum can provide new ideas for the field of network traffic analysis.


Notes

  1. The NSL-KDD dataset can be downloaded from https://www.unb.ca/cic/datasets/nsl.html.

  2. The experimental source code is hosted at https://anonymous.4open.science/r/FlowSpectrum-8DA4.

References

  1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: Tensorflow: A system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp. 265–283 (2016)

  2. Bouzida, Y., Cuppens, F., Cuppens-Boulahia, N., Gombault, S.: Efficient intrusion detection using principal component analysis. In: 3ème Conférence sur la Sécurité et Architectures Réseaux (SAR), La Londe, France, pp. 381–395 (2004)

  3. Chen, Y., Ashizawa, N., Yean, S., Yeo, C.K., Yanai, N.: Self-organizing map assisted deep autoencoding gaussian mixture model for intrusion detection. In: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6. IEEE (2021)

  4. Chen, Y., Ashizawa, N., Yeo, C.K., Yanai, N., Yean, S.: Multi-scale self-organizing map assisted deep autoencoding gaussian mixture model for unsupervised intrusion detection. Knowledge-Based Systems p. 107086 (2021)

  5. Chen, Y., Zhang, J., Yeo, C.K.: Network anomaly detection using federated deep autoencoding gaussian mixture model. In: International Conference on Machine Learning for Networking, pp. 1–14. Springer (2019)

  6. Chen, Z., He, K., Li, J., Geng, Y.: Seq2img: A sequence-to-image based approach towards ip traffic classification using convolutional neural networks. In: 2017 IEEE International Conference on Big Data (big data), pp. 1271–1276. IEEE (2017)

  7. Corchado, E., Herrero, Á.: Neural visualization of network traffic data for intrusion detection. Applied Soft Computing 11(2), 2042–2056 (2011)


  8. Draper-Gil, G., Lashkari, A.H., Mamun, M.S.I., Ghorbani, A.A.: Characterization of encrypted and vpn traffic using time-related features. In: Proceedings of the 2nd International Conference on Information Systems Security and Privacy, pp. 407–414 (2016)

  9. Elkhadir, Z., Chougdali, K., Benattou, M.: Intrusion detection system using pca and kernel pca methods. In: Proceedings of the Mediterranean Conference on Information & Communication Technologies 2015, pp. 489–497. Springer (2016)

  10. Ferreira, D.C., Vázquez, F.I., Zseby, T.: Extreme dimensionality reduction for network attack visualization with autoencoders. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–10. IEEE (2019)

  11. George, A., Vidyapeetham, A.: Anomaly detection based on machine learning dimensionality reduction using pca and classification using svm. International Journal of Computer Applications 47(21), 5–8 (2012)


  12. Haiyan, W., Haomin, Y., Xueming, L., Haijun, R.: Semi-supervised autoencoder: A joint approach of representation and classification. In: 2015 International Conference on Computational Intelligence and Communication Networks (CICN), pp. 1424–1430. IEEE (2015)

  13. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)


  14. Hyvärinen, A., Oja, E.: Independent component analysis: Algorithms and applications. Neural Networks 13(4–5), 411–430 (2000)


  15. Ikram, S.T., Cherukuri, A.K.: Improving accuracy of intrusion detection model using pca and optimized svm. Journal of Computing and Information Technology 24(2), 133–148 (2016)


  16. Imran, H.M., Abdullah, A.B., Hussain, M., Palaniappan, S., Ahmad, I.: Intrusions detection based on optimum features subset and efficient dataset selection. International Journal of Engineering and Innovative Technology 2(6), 265–270 (2012)


  17. Javaid, A., Niyaz, Q., Sun, W., Alam, M.: A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), pp. 21–26 (2016)

  18. Ji, S.Y., Jeong, B.K., Choi, S., Jeong, D.H.: A multi-level intrusion detection method for abnormal network behaviors. Journal of Network and Computer Applications 62, 9–17 (2016)


  19. Kaiser, H.F.: The varimax criterion for analytic rotation in factor analysis. Psychometrika 23(3), 187–200 (1958)


  20. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv:1412.6980 (2014)

  21. Korczyński, M., Duda, A.: Markov chain fingerprinting to classify encrypted traffic. In: IEEE INFOCOM 2014-IEEE Conference on Computer Communications, pp. 781–789. IEEE (2014)

  22. Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I., Ghorbani, A.A.: Characterization of tor traffic using time based features. In: International Conference on Information Systems Security and Privacy (ICISSP), pp. 253–262 (2017)

  23. Liu, C., Cao, Z., Xiong, G., Gou, G., Yiu, S.M., He, L.: Mampf: Encrypted traffic classification based on multi-attribute markov probability fingerprints. In: 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), pp. 1–10. IEEE (2018)

  24. Liu, C., He, L., Xiong, G., Cao, Z., Li, Z.: Fs-net: A flow sequence network for encrypted traffic classification. In: IEEE INFOCOM 2019-IEEE Conference on Computer Communications, pp. 1171–1179. IEEE (2019)

  25. McHugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory. ACM Transactions on Information and System Security (TISSEC) 3(4), 262–294 (2000)


  26. Pan, W., Cheng, G., Tang, Y.: Wenc: Https encrypted traffic classification using weighted ensemble learning and markov chain. In: 2017 IEEE Trustcom/BigDataSE/ICESS, pp. 50–57 (2017). https://doi.org/10.1109/Trustcom/BigDataSE/ICESS.2017.219

  27. Ruan, Z., Miao, Y., Pan, L., Patterson, N., Zhang, J.: Visualization of big data security: A case study on the kdd99 cup data set. Digital Communications and Networks 3(4), 250–259 (2017)


  28. Santos, A.C.F., da Silva, J.D.S., de Sá Silva, L., da Costa Sene, M.P.: Network traffic characterization based on time series analysis and computational intelligence. J. Computational Interdisciplinary Sciences 2(3), 197–205 (2011)


  29. Sathya, S.S., Ramani, R.G., Sivaselvi, K.: Discriminant analysis based feature selection in kdd intrusion dataset. International Journal of computer applications 31(11), 1–7 (2011)


  30. Shapira, T., Shavitt, Y.: Flowpic: Encrypted internet traffic classification is as easy as image recognition. In: IEEE INFOCOM 2019-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 680–687. IEEE (2019)

  31. Shen, M., Wei, M., Zhu, L., Wang, M.: Classification of encrypted traffic with second-order markov chains and application attribute bigrams. IEEE Transactions on Information Forensics and Security 12(8), 1830–1843 (2017)


  32. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. IEEE (2009)

  33. Tipping, M.E., Bishop, C.M.: Mixtures of probabilistic principal component analyzers. Neural Computation 11(2), 443–482 (1999)


  34. Wang, W., Zhu, M., Wang, J., Zeng, X., Yang, Z.: End-to-end encrypted traffic classification with one-dimensional convolution neural networks. In: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 43–48. IEEE (2017)

  35. Wang, W., Zhu, M., Zeng, X., Ye, X., Sheng, Y.: Malware traffic classification using convolutional neural network for representation learning. In: 2017 International Conference on Information Networking (ICOIN), pp. 712–717. IEEE (2017)

  36. Waskle, S., Parashar, L., Singh, U.: Intrusion detection system using pca with random forest approach. In: 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), pp. 803–808. IEEE (2020)

  37. Xu, X., Wang, X.: An adaptive network intrusion detection method based on pca and support vector machines. In: International Conference on Advanced Data Mining and Applications, pp. 696–703. Springer (2005)

  38. Yao, R., Liu, C., Zhang, L., Peng, P.: Unsupervised anomaly detection using variational auto-encoder based feature extraction. In: 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), pp. 1–7. IEEE (2019)

  39. Yousefi-Azar, M., Varadharajan, V., Hamey, L., Tupakula, U.: Autoencoder-based feature learning for cyber security applications. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861. IEEE (2017)

  40. Zong, B., Song, Q., Min, M.R., Cheng, W., Lumezanu, C., Cho, D., Chen, H.: Deep autoencoding gaussian mixture model for unsupervised anomaly detection. In: International Conference on Learning Representations (2018)

  41. Zong, W., Chow, Y.W., Susilo, W.: A 3d approach for the visualization of network intrusion detection data. In: 2018 International Conference on Cyberworlds (CW), pp. 308–315. IEEE (2018)


Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2018YFB0204301, No. 2018YFB0805004), the National Natural Science Foundation of China (No. 62072466, No. U1811462) and the NUDT Grants (No. ZK19-38). Dr Xuyun Zhang is the recipient of an ARC DECRA (project No. DE210101458) funded by the Australian Government.

Author information


Corresponding author

Correspondence to Shaojing Fu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

This article belongs to the Topical Collection: Special Issue on Resource Management at the Edge for Future Web, Mobile and IoT Applications

Guest Editors: Qiang He, Fang Dong, Chenshu Wu, and Yun Yang

Appendix A: The decomposer configuration


Our experiments are performed on the following configuration: Intel(R) Core(TM) i7-7700HQ CPU @ 2.8 GHz, NVIDIA GeForce GTX 1050 Ti, 32 GB of RAM.

All neural network structures are implemented in TensorFlow [1] (version 2.4.1) and trained with the Adam optimization algorithm [20] at a learning rate of 0.001. The number of training epochs is 100, and the batch size is 256. For the loss function in (11), the weight adjustment parameter \(\alpha\) is set to 0.5.
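The training settings above can be collected in a small sketch. The dictionary only restates values given in this appendix. The helper `combined_loss` is hypothetical: Equation (11) is not reproduced in this section, so the form used here (reconstruction loss plus \(\alpha\) times classification loss) is an assumption made for illustration, and the paper's actual weighting may differ. The sample count in the usage note is also an assumption.

```python
import math

# Hyperparameters restated from Appendix A.
TRAIN_CONFIG = {
    "optimizer": "adam",
    "learning_rate": 0.001,
    "epochs": 100,
    "batch_size": 256,
    "alpha": 0.5,  # weight adjustment parameter of the loss in (11)
}


def optimizer_steps(num_samples, batch_size, epochs):
    """Total optimizer updates, counting the final partial batch of each epoch."""
    return math.ceil(num_samples / batch_size) * epochs


def combined_loss(reconstruction_loss, classification_loss, alpha):
    """ASSUMED form of the weighted loss; Equation (11) may be defined differently."""
    return reconstruction_loss + alpha * classification_loss
```

As a usage example, if the full KDDTrain+ split of NSL-KDD (125,973 records) were used for training, which this appendix does not state, `optimizer_steps(125973, TRAIN_CONFIG["batch_size"], TRAIN_CONFIG["epochs"])` would give the number of Adam updates over the whole run.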

For the NSL-KDD dataset, we designed a decomposer based on a semi-supervised AutoEncoder for FlowSpectrum. The Encoder is structured as

$$\begin{aligned}&FC(121,64,\tanh)-FC(64,32,\tanh)-FC(32,16,\tanh)\\&-FC(16,8,\tanh)-FC(8,2)\end{aligned} \tag{A1}$$

and the Decoder is structured as

$$\begin{aligned}&FC(2,8,\tanh)-FC(8,16,\tanh)-FC(16,32,\tanh)\\&-FC(32,64,\tanh)-FC(64,121)\end{aligned} \tag{A2}$$

The classifier network is FC(1, 5, softmax). The plain AutoEncoder used in this paper has the same Encoder and Decoder as the semi-supervised AutoEncoder.
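To make the layer specifications in (A1) and (A2) easier to check, the following minimal sketch transcribes them as plain Python tuples, verifies that consecutive layer widths match, and counts trainable parameters. The helper names `check_chain` and `count_parameters` are hypothetical, and the parameter count assumes each FC layer has a bias term, which the appendix does not state. How the 1-wide classifier input is taken from the 2-wide Encoder output is not described in this appendix, so the sketch does not model that connection.

```python
# Layer specs transcribed from (A1), (A2) and the classifier:
# (input width, output width, activation or None).
ENCODER = [(121, 64, "tanh"), (64, 32, "tanh"), (32, 16, "tanh"),
           (16, 8, "tanh"), (8, 2, None)]
DECODER = [(2, 8, "tanh"), (8, 16, "tanh"), (16, 32, "tanh"),
           (32, 64, "tanh"), (64, 121, None)]
CLASSIFIER = [(1, 5, "softmax")]


def check_chain(layers):
    """Check that each layer's input width equals the previous output width.

    Returns the (input width, output width) of the whole stack.
    """
    for (_, prev_out, _), (next_in, _, _) in zip(layers, layers[1:]):
        if prev_out != next_in:
            raise ValueError(f"width mismatch: {prev_out} -> {next_in}")
    return layers[0][0], layers[-1][1]


def count_parameters(layers, use_bias=True):
    """Count weights (and biases, ASSUMED present) of a stack of FC layers."""
    return sum(n_in * n_out + (n_out if use_bias else 0)
               for n_in, n_out, _ in layers)


if __name__ == "__main__":
    for name, stack in [("encoder", ENCODER), ("decoder", DECODER),
                        ("classifier", CLASSIFIER)]:
        print(name, check_chain(stack), count_parameters(stack))
```

Under the bias assumption, the Encoder has 10,570 parameters, the Decoder 10,689, and the classifier 10, so the whole decomposer is small enough to train on the modest hardware listed above.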


About this article


Cite this article

Yang, L., Fu, S., Zhang, X. et al. FlowSpectrum: a concrete characterization scheme of network traffic behavior for anomaly detection. World Wide Web 25, 2139–2161 (2022). https://doi.org/10.1007/s11280-022-01057-8


  • DOI: https://doi.org/10.1007/s11280-022-01057-8
