PSO-based feature extraction of unknown protocol data frame

  • Regular Paper
Computing

Abstract

In network information confrontation, the protocols used by the communicating parties are often undisclosed for security reasons, so the protocol format is unknown and the communication data takes the form of a continuous, irregular bitstream. Extracting features from such data without prior knowledge is therefore a pressing problem. This study proposes a feature extraction method for unknown protocol data frames based on the particle swarm optimization (PSO) algorithm, addressing the low adaptability and low accuracy associated with frequent thresholds. Based on the characteristics of bitstream data frames, the method first segments the bitstream data according to Zipf’s law. The PSO algorithm then adapts the frequent threshold to the uncertainty of the unknown protocol, and short frequent sequences are mined under this adaptive threshold. Contiguous position information is used to splice the mined short frequent sequences into the final frequent sequence set. Association rules are then mined between the frequent sequences, and a chi-squared test is applied to retain only the effective rules. In simulations, the proposed method extracted frequent sequences under adaptive thresholds on different datasets and achieved higher accuracy than the comparison algorithm. The method is therefore of practical value for both theoretical research and applications in this field.
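Two of the steps named in the abstract, mining short frequent sequences under a threshold and testing an association rule with the chi-squared statistic, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the fixed 4-bit window, the toy frames, and the hand-picked threshold of 0.8 are all assumptions made for illustration. In the paper the threshold is adapted by PSO rather than fixed, and segmentation follows Zipf’s law rather than a fixed window.

```python
from collections import Counter


def frequent_ngrams(frames, n, min_support):
    """Return {ngram: support} for n-bit substrings that occur in at least
    min_support (a fraction in [0, 1]) of the frames."""
    counts = Counter()
    for frame in frames:
        # Count each n-gram once per frame, so support is a fraction of frames.
        seen = {frame[i:i + n] for i in range(len(frame) - n + 1)}
        counts.update(seen)
    total = len(frames)
    return {g: c / total for g, c in counts.items() if c / total >= min_support}


def chi_squared_2x2(frames, a, b):
    """Pearson chi-squared statistic (2x2 table) for co-occurrence of a and b."""
    n11 = n10 = n01 = n00 = 0
    for frame in frames:
        has_a, has_b = a in frame, b in frame
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0  # an empty margin: the statistic is undefined, report 0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


# Hypothetical toy frames; "1010" appears in every frame.
FRAMES = ["10100011", "10101100", "10100110", "01011010", "10101111"]

# Threshold fixed by hand here (assumption); the paper adapts it with PSO.
FREQUENT = frequent_ngrams(FRAMES, n=4, min_support=0.8)

# Statistic for the candidate rule "0101" -> "1011" on the toy frames.
RULE_STAT = chi_squared_2x2(FRAMES, "0101", "1011")
```

On these toy frames only `1010` reaches the 0.8 support threshold, and the statistic for the pair `0101`, `1011` is 5.0, above the 3.841 critical value for one degree of freedom at the 0.05 level. With so few frames the chi-squared approximation is unreliable, so the numbers only show the mechanics.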

Data availability

The data used to support the findings of this study are included within the article.

Code Availability

The authors declare that the code is available.

Acknowledgements

The authors would like to express their gratitude to EditSprings (https://www.editsprings.com/) for the expert linguistic services provided.

Funding

This work received no funding.

Author information

Contributions

All authors meet the criteria for authorship.

Corresponding author

Correspondence to Zhiguo Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Ethics approval

Neither the full text nor any part of this paper has been submitted or published elsewhere.

Consent to participate

All authors listed in the paper consent to participate and agree to its publication.

Consent for publication

All authors agree to the publication of this paper in the journal Computing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jiaojiao Zhang, Lin Wang, Jianxin Feng, Yuanming Ding and ChangQing Ren have contributed equally to this work.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Z., Zhang, J., Wang, L. et al. PSO-based feature extraction of unknown protocol data frame. Computing 105, 131–149 (2023). https://doi.org/10.1007/s00607-022-01118-w

  • DOI: https://doi.org/10.1007/s00607-022-01118-w
