MARLFuzz: industrial control protocols fuzzing based on multi-agent reinforcement learning

Si, Xiaokai; Song, Yubo; Sun, Xin; Wang, Wen; Qin, Zhongyuan

doi:10.1007/s00607-025-01421-2

Regular Paper
Published: 23 January 2025

Computing, Volume 107, article number 61 (2025)

Xiaokai Si^1,2, Yubo Song^1,2 (ORCID: orcid.org/0000-0002-1347-6126), Xin Sun^3, Wen Wang^4 & Zhongyuan Qin^1,2

Abstract

Industrial control protocols carry the communication within industrial control systems, so their security directly affects the security of the system as a whole. Traditional fuzzing methods that rely on static test cases cannot adapt their test-case generation strategy to a changing environment, and they struggle to jointly extract the structural and temporal characteristics of industrial control protocols. As a result, many of their test cases are ineffective, which limits their ability to discover protocol vulnerabilities. We propose MARLFuzz, a fuzzing method for industrial control protocols based on multi-agent reinforcement learning. MARLFuzz uses a cooperative multi-agent reinforcement learning mechanism to guide an array of fuzzing agents, with the aim of efficient and scalable fuzzing of the target protocol. The method begins with message sampling and data preprocessing. A reinforcement learning-based array of fuzzing agents is then constructed, together with its action set. A policy network based on recurrent neural networks learns the temporal and spatial features of messages, while a value network, also based on recurrent neural networks, assists in the centralized training of the multi-agent array. Finally, the array of fuzzing agents carries out decentralized fuzzing. Experiments on the Modbus-TCP and EtherCAT protocols show that the approach generates effective test cases and triggers exceptions efficiently, that the framework can be customized for different target protocols, and that it scales well. Compared with the best methods in the control group, the test cases of MARLFuzz achieved increases of 10.39% in effective identification rate, 38.68% in the number of anomaly triggers, and 61.87% in anomaly trigger efficiency, along with a 37.96% reduction in the average interval between anomaly triggers.
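
The centralized-training, decentralized-execution idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes one agent per Modbus-TCP field, replaces the recurrent policy networks with tabular softmax policies, replaces the value network with a running mean of the joint reward, and uses a simulated target instead of a real device. The action names, the `FieldAgent` class, the seed message, and the reward rule are all hypothetical.

```python
import math
import random
import struct

# Hypothetical action set: each action mutates one unsigned field.
ACTIONS = ("keep", "bit_flip", "boundary", "random")

# Seed message: Modbus-TCP Read Holding Registers request (MBAP header + PDU).
SEED = {"tid": 1, "pid": 0, "length": 6, "unit": 1,
        "func": 3, "addr": 0, "count": 10}
FRAME_FORMAT = ">HHHBBHH"  # big-endian, 12 bytes in total


def mutate(value, action, bits, rng):
    """Apply one mutation action to an unsigned field of the given width."""
    mask = (1 << bits) - 1
    if action == "bit_flip":
        return value ^ (1 << rng.randrange(bits))
    if action == "boundary":
        return rng.choice((0, mask))
    if action == "random":
        return rng.randrange(mask + 1)
    return value  # "keep"


def build_frame(fields):
    """Pack the field dictionary into the bytes of a request frame."""
    return struct.pack(FRAME_FORMAT, *(fields[k] for k in SEED))


def simulated_target(frame):
    """Hypothetical stand-in for a device under test; returns a joint reward.

    Assumed rule: the frame counts as accepted if the protocol identifier
    and length are intact, and as an anomaly trigger if the register count
    lies outside 1..125.
    """
    _tid, pid, length, _unit, _func, _addr, count = struct.unpack(FRAME_FORMAT, frame)
    accepted = pid == 0 and length == 6
    anomaly = count == 0 or count > 125
    return 1.0 if accepted and anomaly else 0.0


class FieldAgent:
    """One agent per field, with a tabular softmax policy (not an RNN)."""

    def __init__(self, bits):
        self.bits = bits
        self.prefs = [0.0] * len(ACTIONS)

    def probs(self):
        top = max(self.prefs)
        exps = [math.exp(p - top) for p in self.prefs]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, rng):
        return rng.choices(range(len(ACTIONS)), weights=self.probs())[0]

    def update(self, action, advantage, lr=0.2):
        # Softmax policy gradient: d log pi(a) / d pref_i = 1[i == a] - pi(i)
        p = self.probs()
        for i in range(len(ACTIONS)):
            indicator = 1.0 if i == action else 0.0
            self.prefs[i] += lr * advantage * (indicator - p[i])


def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    agents = {name: FieldAgent(16) for name in ("pid", "length", "count")}
    baseline = 0.0  # shared critic: running mean of the joint reward
    for _ in range(episodes):
        fields = dict(SEED)
        chosen = {}
        # Decentralized execution: each agent mutates only its own field.
        for name, agent in agents.items():
            chosen[name] = agent.act(rng)
            fields[name] = mutate(SEED[name], ACTIONS[chosen[name]], agent.bits, rng)
        reward = simulated_target(build_frame(fields))
        # Centralized training: one shared advantage drives every agent.
        advantage = reward - baseline
        for name, agent in agents.items():
            agent.update(chosen[name], advantage)
        baseline += 0.05 * (reward - baseline)
    return agents, baseline


if __name__ == "__main__":
    trained, mean_reward = train()
    for name, agent in trained.items():
        print(name, [round(p, 2) for p in agent.probs()])
    print("running mean reward:", round(mean_reward, 2))
```

In this toy setup the agents for the protocol identifier and length fields tend to learn to leave their fields intact so that the frame is still accepted, while the agent for the register count tends to move away from `keep`. The shared advantage is what makes the agents cooperative: no agent is rewarded unless the jointly produced frame is both accepted and anomalous.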

Funding

This work is supported by the Science and Technology Project of State Grid Corporation of China under Grant No. 5700-202219198A-1-1-ZN.

Author information

Authors and Affiliations

School of Cyber Science and Engineering, Southeast University, Nanjing, 211189, China
Xiaokai Si, Yubo Song & Zhongyuan Qin
Purple Mountain Laboratories, Nanjing, 211189, China
Xiaokai Si, Yubo Song & Zhongyuan Qin
Internet Technology Center, State Grid Zhejiang Electric Power Co., Ltd., Hangzhou, China
Xin Sun
Internet Department, State Grid Zhejiang Electric Power Co., Ltd., Hangzhou, China
Wen Wang

Corresponding author

Correspondence to Yubo Song.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Si, X., Song, Y., Sun, X. et al. MARLFuzz: industrial control protocols fuzzing based on multi-agent reinforcement learning. Computing 107, 61 (2025). https://doi.org/10.1007/s00607-025-01421-2

Received: 21 June 2023
Accepted: 16 January 2025
Published: 23 January 2025
DOI: https://doi.org/10.1007/s00607-025-01421-2
