Abstract
Industrial control protocols carry the communication inside industrial control systems, so their security is directly tied to the communication security of the system as a whole. Traditional fuzzing methods rely on static test cases and cannot adjust their test-case generation strategy as the environment changes. They also struggle to extract the structural and temporal characteristics of industrial control protocols jointly. As a result, many of their test cases are ineffective, which limits their ability to discover protocol vulnerabilities. We propose MARLFuzz, a fuzzing method for industrial control protocols based on multi-agent reinforcement learning. MARLFuzz uses a cooperative multi-agent reinforcement learning mechanism to guide an array of fuzzing agents, with the aim of fuzzing the target protocol efficiently and scalably. The method begins with message sampling and data preprocessing. It then constructs a reinforcement learning-based array of fuzzing agents together with the corresponding action set. A policy network based on recurrent neural networks learns the temporal and spatial features of messages, while a value network, also based on recurrent neural networks, assists in the centralized training of the agent array. Finally, the array of fuzzing agents carries out decentralized fuzzing. Experiments on the Modbus-TCP and EtherCAT protocols show that the approach generates effective test cases and triggers exceptions efficiently. The framework can be customized for different target protocols and scales well.
Compared with the best methods in the control group, the test cases of MARLFuzz increased the effective identification rate by 10.39%, the number of anomaly triggers by 38.68%, and the anomaly trigger efficiency by 61.87%, and reduced the average interval between anomaly triggers by 37.96%.
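The centralized-training, decentralized-execution idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and it makes several assumptions. The recurrent policy network is replaced by a tabular softmax policy per agent, and the recurrent value network is replaced by a single running-average baseline. The field layout `FIELDS`, the action set `ACTIONS`, and the target `stub_target` are hypothetical stand-ins. Only the Modbus-TCP frame layout (MBAP header followed by the PDU) follows the protocol itself.

```python
import math
import random
import struct

# Modbus-TCP "read holding registers" request: MBAP header + PDU.
# Fields: transaction id, protocol id, length, unit id,
#         function code, start address, register quantity.
SEED_FRAME = struct.pack(">HHHBBHH", 1, 0, 6, 1, 0x03, 0, 10)

# Hypothetical assignment of one agent per field, as (offset, size) in bytes.
FIELDS = {"length": (4, 2), "function": (7, 1), "quantity": (10, 2)}

# Hypothetical action set; the abstract does not enumerate the real one.
ACTIONS = ("keep", "bit_flip", "max_value", "random")


def mutate(chunk: bytes, action: str, rng: random.Random) -> bytes:
    """Apply one mutation action to the bytes of a single field."""
    if action == "keep":
        return chunk
    if action == "bit_flip":
        i = rng.randrange(len(chunk) * 8)
        out = bytearray(chunk)
        out[i // 8] ^= 1 << (i % 8)
        return bytes(out)
    if action == "max_value":
        return b"\xff" * len(chunk)
    return bytes(rng.randrange(256) for _ in chunk)


class FieldAgent:
    """Softmax policy over ACTIONS for one field (stand-in for an RNN policy)."""

    def __init__(self, offset: int, size: int):
        self.offset, self.size = offset, size
        self.prefs = [0.0] * len(ACTIONS)

    def probs(self):
        m = max(self.prefs)
        exps = [math.exp(p - m) for p in self.prefs]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, rng: random.Random) -> int:
        return rng.choices(range(len(ACTIONS)), weights=self.probs())[0]

    def update(self, action: int, advantage: float, lr: float = 0.1):
        # REINFORCE: gradient of log softmax is 1[k == action] - pi(k).
        for k, p in enumerate(self.probs()):
            self.prefs[k] += lr * advantage * ((1.0 if k == action else 0.0) - p)


def stub_target(frame: bytes) -> float:
    """Toy device under test: reward 1.0 when the protocol id is still 0
    but the declared length no longer matches the bytes that follow it."""
    _, proto, length = struct.unpack(">HHH", frame[:6])
    return 1.0 if proto == 0 and length != len(frame) - 6 else 0.0


def train(episodes: int = 1000, seed: int = 0):
    rng = random.Random(seed)
    agents = {name: FieldAgent(*span) for name, span in FIELDS.items()}
    baseline = 0.0  # scalar stand-in for the centralized value network
    for _ in range(episodes):
        frame = bytearray(SEED_FRAME)
        chosen = {}
        for name, agent in agents.items():  # each agent acts on its own policy
            a = agent.act(rng)
            chosen[name] = a
            end = agent.offset + agent.size
            frame[agent.offset:end] = mutate(
                bytes(frame[agent.offset:end]), ACTIONS[a], rng)
        reward = stub_target(bytes(frame))  # one shared team reward
        advantage = reward - baseline
        for name, agent in agents.items():  # shared training signal
            agent.update(chosen[name], advantage)
        baseline += 0.05 * (reward - baseline)
    return agents


if __name__ == "__main__":
    for name, agent in train().items():
        print(name, [round(p, 3) for p in agent.probs()])
```

In this toy setup only the agent that owns the length field can influence the reward, so its policy should drift away from the `keep` action, while the other agents receive the same team signal without affecting the outcome. A real fuzzer would replace `stub_target` with feedback from the device or simulator under test.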

Funding
This work is supported by the Science and Technology Project of State Grid Corporation of China under Grant No. 5700-202219198A-1-1-ZN.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Si, X., Song, Y., Sun, X. et al. MARLFuzz: industrial control protocols fuzzing based on multi-agent reinforcement learning. Computing 107, 61 (2025). https://doi.org/10.1007/s00607-025-01421-2
DOI: https://doi.org/10.1007/s00607-025-01421-2