FlowSpectrum: a concrete characterization scheme of network traffic behavior for anomaly detection

Yang, Luming; Fu, Shaojing; Zhang, Xuyun; Guo, Shize; Wang, Yongjun; Yang, Chi

doi:10.1007/s11280-022-01057-8


Published: 28 April 2022

World Wide Web, Volume 25, pages 2139–2161 (2022)

Luming Yang¹, Shaojing Fu¹, Xuyun Zhang², Shize Guo³, Yongjun Wang¹ & Chi Yang⁴


Abstract

As 5G rolls out around the world, app vendors will deploy many edge applications that are accessed by massive numbers of end users, and efficient detection of malicious network behavior is receiving increasing attention. Current traffic detection work still relies on the analysis of high-dimensional data, which restricts improvements in threat monitoring and network governance when facing massive network flows. Characterizing network flows within simple domains is needed to simplify network analysis. Traffic characterization is a key task that allows service providers to detect and intercept anomalous traffic, so that high QoS (Quality of Service) and service availability are maintained and the spread of malicious content is prevented. Unfortunately, research on the concrete characterization of network data is still lacking. In this paper, by analogy with a spectrum, we propose the concept of FlowSpectrum for the first time to represent network flows concretely. In FlowSpectrum, a network flow is represented as a spectral line rather than as raw data or a feature vector. All flows can be mapped to spectral lines, and traffic identification is achieved by analyzing the positions of those lines. FlowSpectrum can significantly reduce the complexity of network traffic behavior analysis while enhancing the interpretability of detection and facilitating cyberspace behavior management. We design a neural network structure based on a semi-supervised AutoEncoder for the decomposition and dimensionality reduction of network flows in FlowSpectrum. Thorough experiments demonstrate the characterization capability of FlowSpectrum. Moreover, we establish a preliminary correspondence between network behaviors and intervals of spectral lines. Overall, FlowSpectrum can provide new ideas for the field of network traffic analysis.
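To make the idea of identification by spectral-line position concrete, the following is a minimal sketch of an interval lookup. The boundary values and the order of the labels are invented placeholders for illustration, not results from the paper; the five labels are the usual NSL-KDD categories, and `identify` is a hypothetical helper name.

```python
from bisect import bisect_right

# HYPOTHETICAL interval boundaries on the spectral axis (not taken from the paper).
BOUNDARIES = [-0.5, 0.0, 0.5, 1.0]
# One label per interval, so len(LABELS) == len(BOUNDARIES) + 1.
# The assignment of labels to intervals is also a placeholder.
LABELS = ["DoS", "Probe", "Normal", "R2L", "U2R"]


def identify(position: float) -> str:
    """Return the label of the interval that contains a spectral-line position."""
    # bisect_right counts how many boundaries are <= position,
    # which is exactly the index of the enclosing interval.
    return LABELS[bisect_right(BOUNDARIES, position)]


if __name__ == "__main__":
    for pos in (-0.7, 0.2, 1.5):
        print(pos, identify(pos))
```

Once each flow is reduced to a single position, identification amounts to a sorted-boundary lookup, which is the kind of simplification the abstract points to.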




Notes

The NSL-KDD dataset can be downloaded from https://www.unb.ca/cic/datasets/nsl.html.
The experimental source code is hosted at https://anonymous.4open.science/r/FlowSpectrum-8DA4.

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: Tensorflow: A system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp. 265–283 (2016)
Bouzida, Y., Cuppens, F., Cuppens-Boulahia, N., Gombault, S.: Efficient intrusion detection using principal component analysis. In: 3ème Conférence sur la Sécurité et Architectures Réseaux (SAR), La Londe, France, pp. 381–395 (2004)
Chen, Y., Ashizawa, N., Yean, S., Yeo, C.K., Yanai, N.: Self-organizing map assisted deep autoencoding gaussian mixture model for intrusion detection. In: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6. IEEE (2021)
Chen, Y., Ashizawa, N., Yeo, C.K., Yanai, N., Yean, S.: Multi-scale self-organizing map assisted deep autoencoding gaussian mixture model for unsupervised intrusion detection. Knowledge-Based Systems p. 107086 (2021)
Chen, Y., Zhang, J., Yeo, C.K.: Network anomaly detection using federated deep autoencoding gaussian mixture model. In: International Conference on Machine Learning for Networking, pp. 1–14. Springer (2019)
Chen, Z., He, K., Li, J., Geng, Y.: Seq2img: A sequence-to-image based approach towards ip traffic classification using convolutional neural networks. In: 2017 IEEE International Conference on Big Data (big data), pp. 1271–1276. IEEE (2017)
Corchado, E., Herrero, Á.: Neural visualization of network traffic data for intrusion detection. Applied Soft Computing 11(2), 2042–2056 (2011)
Draper-Gil, G., Lashkari, A.H., Mamun, M.S.I., Ghorbani, A.A.: Characterization of encrypted and vpn traffic using time-related features. In: Proceedings of the 2nd International Conference on Information Systems Security and Privacy, pp. 407–414 (2016)
Elkhadir, Z., Chougdali, K., Benattou, M.: Intrusion detection system using pca and kernel pca methods. In: Proceedings of the Mediterranean Conference on Information & Communication Technologies 2015, pp. 489–497. Springer (2016)
Ferreira, D.C., Vázquez, F.I., Zseby, T.: Extreme dimensionality reduction for network attack visualization with autoencoders. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–10. IEEE (2019)
George, A., Vidyapeetham, A.: Anomaly detection based on machine learning dimensionality reduction using pca and classification using svm. International Journal of Computer Applications 47(21), 5–8 (2012)
Haiyan, W., Haomin, Y., Xueming, L., Haijun, R.: Semi-supervised autoencoder: A joint approach of representation and classification. In: 2015 International Conference on Computational Intelligence and Communication Networks (CICN), pp. 1424–1430. IEEE (2015)
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Hyvärinen, A., Oja, E.: Independent component analysis: Algorithms and applications. Neural Networks 13(4–5), 411–430 (2000)
Ikram, S.T., Cherukuri, A.K.: Improving accuracy of intrusion detection model using pca and optimized svm. Journal of Computing and Information Technology 24(2), 133–148 (2016)
Imran, H.M., Abdullah, A.B., Hussain, M., Palaniappan, S., Ahmad, I.: Intrusions detection based on optimum features subset and efficient dataset selection. International Journal of Engineering and Innovative Technology 2(6), 265–270 (2012)
Javaid, A., Niyaz, Q., Sun, W., Alam, M.: A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), pp. 21–26 (2016)
Ji, S.Y., Jeong, B.K., Choi, S., Jeong, D.H.: A multi-level intrusion detection method for abnormal network behaviors. Journal of Network and Computer Applications 62, 9–17 (2016)
Kaiser, H.F.: The varimax criterion for analytic rotation in factor analysis. Psychometrika 23(3), 187–200 (1958)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv:1412.6980 (2014)
Korczyński, M., Duda, A.: Markov chain fingerprinting to classify encrypted traffic. In: IEEE INFOCOM 2014-IEEE Conference on Computer Communications, pp. 781–789. IEEE (2014)
Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I., Ghorbani, A.A.: Characterization of tor traffic using time based features. In: International Conference on Information Systems Security and Privacy (ICISSP), pp. 253–262 (2017)
Liu, C., Cao, Z., Xiong, G., Gou, G., Yiu, S.M., He, L.: Mampf: Encrypted traffic classification based on multi-attribute markov probability fingerprints. In: 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), pp. 1–10. IEEE (2018)
Liu, C., He, L., Xiong, G., Cao, Z., Li, Z.: Fs-net: A flow sequence network for encrypted traffic classification. In: IEEE INFOCOM 2019-IEEE Conference on Computer Communications, pp. 1171–1179. IEEE (2019)
McHugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory. ACM Transactions on Information and System Security (TISSEC) 3(4), 262–294 (2000)
Pan, W., Cheng, G., Tang, Y.: Wenc: Https encrypted traffic classification using weighted ensemble learning and markov chain. In: 2017 IEEE Trustcom/BigDataSE/ICESS, pp. 50–57 (2017). https://doi.org/10.1109/Trustcom/BigDataSE/ICESS.2017.219
Ruan, Z., Miao, Y., Pan, L., Patterson, N., Zhang, J.: Visualization of big data security: A case study on the kdd99 cup data set. Digital Communications and Networks 3(4), 250–259 (2017)
Santos, A.C.F., da Silva, J.D.S., de Sá Silva, L., da Costa Sene, M.P.: Network traffic characterization based on time series analysis and computational intelligence. J. Computational Interdisciplinary Sciences 2(3), 197–205 (2011)
Sathya, S.S., Ramani, R.G., Sivaselvi, K.: Discriminant analysis based feature selection in kdd intrusion dataset. International Journal of computer applications 31(11), 1–7 (2011)
Shapira, T., Shavitt, Y.: Flowpic: Encrypted internet traffic classification is as easy as image recognition. In: IEEE INFOCOM 2019-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 680–687. IEEE (2019)
Shen, M., Wei, M., Zhu, L., Wang, M.: Classification of encrypted traffic with second-order markov chains and application attribute bigrams. IEEE Transactions on Information Forensics and Security 12(8), 1830–1843 (2017)
Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. IEEE (2009)
Tipping, M.E., Bishop, C.M.: Mixtures of probabilistic principal component analyzers. Neural Computation 11(2), 443–482 (1999)
Wang, W., Zhu, M., Wang, J., Zeng, X., Yang, Z.: End-to-end encrypted traffic classification with one-dimensional convolution neural networks. In: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 43–48. IEEE (2017)
Wang, W., Zhu, M., Zeng, X., Ye, X., Sheng, Y.: Malware traffic classification using convolutional neural network for representation learning. In: 2017 International Conference on Information Networking (ICOIN), pp. 712–717. IEEE (2017)
Waskle, S., Parashar, L., Singh, U.: Intrusion detection system using pca with random forest approach. In: 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), pp. 803–808. IEEE (2020)
Xu, X., Wang, X.: An adaptive network intrusion detection method based on pca and support vector machines. In: International Conference on Advanced Data Mining and Applications, pp. 696–703. Springer (2005)
Yao, R., Liu, C., Zhang, L., Peng, P.: Unsupervised anomaly detection using variational auto-encoder based feature extraction. In: 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), pp. 1–7. IEEE (2019)
Yousefi-Azar, M., Varadharajan, V., Hamey, L., Tupakula, U.: Autoencoder-based feature learning for cyber security applications. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861. IEEE (2017)
Zong, B., Song, Q., Min, M.R., Cheng, W., Lumezanu, C., Cho, D., Chen, H.: Deep autoencoding gaussian mixture model for unsupervised anomaly detection. In: International Conference on Learning Representations (2018)
Zong, W., Chow, Y.W., Susilo, W.: A 3d approach for the visualization of network intrusion detection data. In: 2018 International Conference on Cyberworlds (CW), pp. 308–315. IEEE (2018)


Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2018YFB0204301, No. 2018YFB0805004), the National Natural Science Foundation of China (No. 62072466, No. U1811462) and the NUDT Grants (No. ZK19-38). Dr Xuyun Zhang is the recipient of an ARC DECRA (project No. DE210101458) funded by the Australian Government.

Author information

Authors and Affiliations

National University of Defense Technology, Changsha, China
Luming Yang, Shaojing Fu & Yongjun Wang
Macquarie University, Sydney, Australia
Xuyun Zhang
National Research Center for Information Technology Security, Beijing, China
Shize Guo
Huazhong University of Science and Technology, Wuhan, China
Chi Yang


Corresponding author

Correspondence to Shaojing Fu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

This article belongs to the Topical Collection: Special Issue on Resource Management at the Edge for Future Web, Mobile and IoT Applications

Guest Editors: Qiang He, Fang Dong, Chenshu Wu, and Yun Yang

Appendix A: The decomposer configuration

Our experiments were performed on the following configuration: Intel(R) Core(TM) i7-7700HQ CPU @ 2.8GHz, NVIDIA GeForce GTX 1050 Ti, 32GB of RAM.

All neural network structures are implemented in TensorFlow [1] (version 2.4.1) and trained with the Adam optimization algorithm [20] with a learning rate of 0.001. The number of training epochs is 100, and the batch size is 256. For the loss function in (11), the weight adjustment parameter $\alpha$ is set to 0.5.
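Equation (11) is not reproduced in this appendix, so the following is only a sketch of how a weight such as $\alpha$ could balance the two objectives of a semi-supervised AutoEncoder. It assumes a convex combination of a mean-squared reconstruction error and a cross-entropy classification error; the actual form of (11) may weight the terms differently. The function names are hypothetical.

```python
import math

ALPHA = 0.5  # weight adjustment parameter stated in the appendix


def mse(x, x_hat):
    """Mean squared reconstruction error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)


def joint_loss(x, x_hat, probs, label, alpha=ALPHA):
    # ASSUMED form: alpha * reconstruction + (1 - alpha) * classification.
    # This is an illustration, not the paper's equation (11).
    return alpha * mse(x, x_hat) + (1.0 - alpha) * cross_entropy(probs, label)


if __name__ == "__main__":
    print(joint_loss([1.0, 0.0], [0.0, 0.0], [0.5, 0.5], 0))
```

Under this assumed form, $\alpha = 0.5$ gives the reconstruction and classification terms equal weight.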

For the NSL-KDD dataset, we designed a decomposer based on a semi-supervised AutoEncoder for FlowSpectrum. The Encoder is structured as

$$\begin{aligned}&\mathrm{FC}(121,64,\tanh)-\mathrm{FC}(64,32,\tanh)-\mathrm{FC}(32,16,\tanh)\\&-\mathrm{FC}(16,8,\tanh)-\mathrm{FC}(8,2) \end{aligned} \tag{A1}$$

and the Decoder is structured as

$$\begin{aligned}&\mathrm{FC}(2,8,\tanh)-\mathrm{FC}(8,16,\tanh)-\mathrm{FC}(16,32,\tanh)\\&-\mathrm{FC}(32,64,\tanh)-\mathrm{FC}(64,121) \end{aligned} \tag{A2}$$

The classifier network is FC(1, 5, softmax). The plain AutoEncoder used in this paper has the same Encoder and Decoder as the semi-supervised AutoEncoder.
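The layer lists in (A1) and (A2) can be checked for shape consistency with a small framework-free forward pass. This is a sketch, not the authors' TensorFlow implementation. It reads FC(a, b, f) as a fully connected layer with a inputs, b outputs and activation f, uses random untrained weights, and feeds a random placeholder vector instead of a real NSL-KDD record. Because the Encoder ends in 2 units while the classifier takes 1 input, the sketch assumes the classifier reads only the first latent coordinate; the appendix does not state this.

```python
import math
import random

# Layer specs copied from (A1), (A2) and the classifier line: (inputs, outputs, activation).
ENCODER = [(121, 64, "tanh"), (64, 32, "tanh"), (32, 16, "tanh"), (16, 8, "tanh"), (8, 2, None)]
DECODER = [(2, 8, "tanh"), (8, 16, "tanh"), (16, 32, "tanh"), (32, 64, "tanh"), (64, 121, None)]
CLASSIFIER = [(1, 5, "softmax")]


def init_stack(specs, rng):
    """Create (weights, biases, activation) triples with small random weights."""
    layers = []
    for n_in, n_out, act in specs:
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out, act))
    return layers


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, vector):
    """Apply each fully connected layer in turn to a single input vector."""
    for weights, biases, act in layers:
        vector = [sum(w * v for w, v in zip(row, vector)) + b
                  for row, b in zip(weights, biases)]
        if act == "tanh":
            vector = [math.tanh(v) for v in vector]
        elif act == "softmax":
            vector = softmax(vector)
    return vector


def count_parameters(specs):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out, _ in specs)


rng = random.Random(0)
encoder, decoder, classifier = (init_stack(s, rng) for s in (ENCODER, DECODER, CLASSIFIER))

flow = [rng.random() for _ in range(121)]  # placeholder input, not real data
latent = forward(encoder, flow)            # 2 values
reconstruction = forward(decoder, latent)  # 121 values
# ASSUMPTION: the classifier consumes only the first latent coordinate.
class_probs = forward(classifier, latent[:1])

if __name__ == "__main__":
    print(len(latent), len(reconstruction), len(class_probs))
    print(count_parameters(ENCODER), count_parameters(DECODER), count_parameters(CLASSIFIER))
```

With these specs the Encoder has 10570 trainable parameters, the Decoder 10689 and the classifier 10. A TensorFlow version would replace each tuple with a dense layer of the same width and activation.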


About this article

Cite this article

Yang, L., Fu, S., Zhang, X. et al. FlowSpectrum: a concrete characterization scheme of network traffic behavior for anomaly detection. World Wide Web 25, 2139–2161 (2022). https://doi.org/10.1007/s11280-022-01057-8


Received: 08 September 2021
Revised: 21 December 2021
Accepted: 13 April 2022
Published: 28 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11280-022-01057-8
