Abstract
In network information confrontation, the communicating parties often keep their protocols undisclosed for security reasons, so the protocol format is unknown and the communication data takes the form of a continuous, irregular bitstream. Extracting features from such data without prior knowledge is an urgent problem. This study therefore proposes a method for feature extraction from unknown protocol data frames based on the particle swarm optimization (PSO) algorithm, addressing the low adaptability and low accuracy of frequent thresholds. Based on the characteristics of bitstream data frames, the method first segments the bitstream data according to Zipf’s law. The PSO algorithm then adapts the frequent threshold to the uncertainty of the unknown protocol, and short frequent sequences are mined under the adaptive threshold. Position-continuity information is used to splice the mined short frequent sequences into the final frequent sequence set. Finally, a chi-squared test is applied to the association rules mined between frequent sequences, so that only the effective rules are retained. Simulation results show that the proposed method achieves frequent-sequence extraction with adaptive thresholds on different datasets, and that its accuracy is higher than that of the comparison algorithm. The method therefore has practical significance for theoretical research and application in this field.
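The chi-squared filtering step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the contingency table is built, so the sketch assumes each frame is a bit string, that a rule between two frequent sequences is judged by their presence or absence in each frame (a 2x2 table), and that the significance level is 0.05. The function names are hypothetical.

```python
from typing import Sequence

# Chi-squared critical value for 1 degree of freedom at alpha = 0.05 (assumed level).
CHI2_CRITICAL_DF1_ALPHA_05 = 3.841


def chi_squared_2x2(frames: Sequence[str], a: str, b: str) -> float:
    """Pearson chi-squared statistic for the co-occurrence of a and b across frames."""
    n = len(frames)
    if n == 0:
        return 0.0
    # Observed counts: presence/absence of each sequence per frame.
    both = sum(1 for f in frames if a in f and b in f)
    only_a = sum(1 for f in frames if a in f and b not in f)
    only_b = sum(1 for f in frames if a not in f and b in f)
    neither = n - both - only_a - only_b
    observed = [[both, only_a], [only_b, neither]]
    row = [both + only_a, only_b + neither]
    col = [both + only_b, only_a + neither]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected == 0:
                # Degenerate table: one sequence is always or never present.
                return 0.0
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def rule_is_significant(frames: Sequence[str], a: str, b: str) -> bool:
    """Keep the rule between a and b only if the statistic exceeds the critical value."""
    return chi_squared_2x2(frames, a, b) > CHI2_CRITICAL_DF1_ALPHA_05
```

Under these assumptions, two sequences that always appear together in the same frames yield a large statistic and the rule is kept, while sequences that appear independently of each other yield a statistic near zero and the rule is discarded.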
Data availability
The data used to support the findings of this study are included within the article.
Code Availability
The author declares that the code is available.
Acknowledgements
The authors would like to express their gratitude to EditSprings (https://www.editsprings.com/) for the expert linguistic services provided.
Funding
This work was not supported by any funding.
Author information
Contributions
All authors meet the authorship criteria.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethics approval
Neither the full text nor any part of this paper has been submitted or published elsewhere.
Consent to participate
All authors listed in the paper agree to sign and publish it.
Consent for publication
The authors agree to publish this paper in the journal Computing.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Jiaojiao Zhang, Lin Wang, Jianxin Feng, Yuanming Ding and ChangQing Ren have contributed equally to this work.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Z., Zhang, J., Wang, L. et al. PSO-based feature extraction of unknown protocol data frame. Computing 105, 131–149 (2023). https://doi.org/10.1007/s00607-022-01118-w
DOI: https://doi.org/10.1007/s00607-022-01118-w