MARLFuzz: industrial control protocols fuzzing based on multi-agent reinforcement learning

  • Regular Paper
Computing

Abstract

Industrial control protocols carry the communication within industrial control systems, so their security directly affects the communication security of the whole system. Traditional fuzzing methods rely on static test cases: they cannot adapt their test-case generation strategy to a changing environment, and they struggle to jointly extract the structural and temporal characteristics of industrial control protocols. As a result, many of their test cases are ineffective, which limits their ability to discover protocol vulnerabilities. We propose MARLFuzz, a fuzzing method for industrial control protocols based on multi-agent reinforcement learning. MARLFuzz uses a cooperative multi-agent reinforcement learning mechanism to guide an array of fuzzing agents, with the aim of efficient and scalable fuzzing of the target protocol. The method begins with message sampling and data preprocessing. It then constructs the reinforcement learning-based array of fuzzing agents together with its action set. A policy network based on recurrent neural networks learns the temporal and spatial features of messages, while a value network, also based on recurrent neural networks, assists the centralized training of the agent array. Finally, the array of fuzzing agents carries out decentralized fuzzing. Experiments on the Modbus-TCP and EtherCAT protocols show that the approach generates effective test cases and triggers exceptions efficiently, that the framework can be customized for different target protocols, and that it scales well.
Compared with the best-performing methods in the control group, the test cases of MARLFuzz achieved increases of 10.39% in effective identification rate, 38.68% in the number of anomaly triggers, and 61.87% in anomaly trigger efficiency, together with a reduction of 37.96% in the average interval between anomaly triggers.
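
To make the idea of an agent array with an action set more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one agent per field of a Modbus-TCP "Read Holding Registers" request; the field split, the five mutation actions, and the names `FieldAgent` and `fuzz_step` are all hypothetical. A uniform random choice stands in for the learned recurrent policy network, and the centralized value network is omitted.

```python
import random
import struct
from dataclasses import dataclass
from typing import Callable, Dict, List

# MBAP header (7 bytes) followed by a function-0x03 PDU (5 bytes).
FIELDS = ["transaction_id", "protocol_id", "length", "unit_id",
          "function_code", "start_addr", "quantity"]
FORMAT = ">HHHBBHH"  # big-endian, 12 bytes in total
LIMITS = {"transaction_id": 0xFFFF, "protocol_id": 0xFFFF, "length": 0xFFFF,
          "unit_id": 0xFF, "function_code": 0xFF,
          "start_addr": 0xFFFF, "quantity": 0xFFFF}


def seed_message() -> Dict[str, int]:
    """A well-formed request: read 2 holding registers starting at address 0."""
    return {"transaction_id": 1, "protocol_id": 0, "length": 6, "unit_id": 1,
            "function_code": 3, "start_addr": 0, "quantity": 2}


def encode(msg: Dict[str, int]) -> bytes:
    """Serialize the field dictionary into a Modbus-TCP frame."""
    return struct.pack(FORMAT, *(msg[name] for name in FIELDS))


# Hypothetical action set: each action maps (value, field maximum, rng)
# to a new field value that still fits in the field.
ACTIONS: List[Callable[[int, int, random.Random], int]] = [
    lambda v, hi, rng: v,                                          # keep
    lambda v, hi, rng: v ^ (1 << rng.randrange(hi.bit_length())),  # flip one bit
    lambda v, hi, rng: hi,                                         # maximum value
    lambda v, hi, rng: 0,                                          # zero
    lambda v, hi, rng: rng.randint(0, hi),                         # random value
]


@dataclass
class FieldAgent:
    """One agent per protocol field (an assumption made for this sketch)."""
    name: str
    rng: random.Random

    def act(self, msg: Dict[str, int]) -> int:
        # Placeholder policy: pick an action uniformly at random.
        # In the paper this choice is made by a learned RNN policy network.
        action = self.rng.randrange(len(ACTIONS))
        return ACTIONS[action](msg[self.name], LIMITS[self.name], self.rng)


def fuzz_step(agents: List[FieldAgent], msg: Dict[str, int]) -> bytes:
    """Decentralized execution: every agent mutates only its own field."""
    mutated = dict(msg)
    for agent in agents:
        mutated[agent.name] = agent.act(msg)
    return encode(mutated)


rng = random.Random(0)
agents = [FieldAgent(name, rng) for name in FIELDS]
print(encode(seed_message()).hex())  # 000100000006010300000002
frame = fuzz_step(agents, seed_message())
print(frame.hex())                   # one mutated 12-byte test case
```

In a learned variant, the reward observed after sending `frame` to the target (for example, whether it was accepted or triggered an anomaly) would be fed back to update each agent's policy during centralized training.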

Funding

This work is supported by the Science and Technology Project of State Grid Corporation of China under Grant No. 5700-202219198A-1-1-ZN.

Author information

Corresponding author

Correspondence to Yubo Song.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Si, X., Song, Y., Sun, X. et al. MARLFuzz: industrial control protocols fuzzing based on multi-agent reinforcement learning. Computing 107, 61 (2025). https://doi.org/10.1007/s00607-025-01421-2

  • DOI: https://doi.org/10.1007/s00607-025-01421-2
