Abstract
Machine learning models have been widely deployed to classify software. These models can generalize over static features of Windows Portable Executable (PE) files. While highly accurate classifiers, they still exhibit weaknesses that can be exploited by applying subtle transformations to the input object. Although such transformations are intended to preserve semantics, they can render the file corrupt. Hence, unlike in the computer vision domain, integrity verification is vital when generating adversarial malware examples. Many approaches have been explored in the literature; however, most of them either overestimate how well their transformations preserve semantics or achieve only modest evasion rates across general files. We therefore present AIMED-RL: Automatic Intelligent Malware modifications to Evade Detection using Reinforcement Learning. Our approach generates adversarial examples that lead machine learning models to misclassify malware files without compromising their functionality. We implement it with a Distributional Double Deep Q-Network agent, adding a penalty to improve the diversity of transformations. We thereby achieve results competitive with previous reinforcement-learning-based research while minimizing the required sequence of transformations.
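The diversity penalty mentioned above can be illustrated with a minimal, hypothetical sketch: the agent's base reward (e.g. a change in the detector's score) is discounted when the chosen sequence of transformations keeps repeating the same action. The function name, `penalty_weight` parameter, and action labels below are illustrative assumptions, not the paper's actual reward design.

```python
from collections import Counter

def diversity_penalized_reward(base_reward, actions_taken, penalty_weight=0.1):
    """Hypothetical sketch of a diversity-aware reward.

    Discounts the evasion reward in proportion to how often the agent
    repeats transformations it has already applied, nudging it toward
    shorter, more varied modification sequences.
    """
    counts = Counter(actions_taken)
    # Number of actions that merely repeat an earlier choice.
    repeats = sum(c - 1 for c in counts.values())
    penalty = penalty_weight * repeats / max(len(actions_taken), 1)
    return base_reward - penalty

# Example: three of five transformations are identical, so the
# reward is discounted relative to a fully diverse sequence.
r = diversity_penalized_reward(
    1.0,
    ["upx_pack", "upx_pack", "upx_pack", "add_section", "rename_section"],
)
```

A repeated-action penalty of this shape leaves fully diverse sequences untouched while monotonically lowering the reward as repetitions accumulate.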
S. Franz—Work done at Research Institute CODE while a student at LMU Munich.
This research is partially supported by EC H2020 Project CONCORDIA GA 830927.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Labaca-Castro, R., Franz, S., Rodosek, G.D. (2021). AIMED-RL: Exploring Adversarial Malware Examples with Reinforcement Learning. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science(), vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86513-9
Online ISBN: 978-3-030-86514-6