Abstract
Machine learning models have been widely deployed to classify software. These models can generalize over static features of Windows Portable Executable (PE) files. While highly accurate classifiers, they still exhibit weaknesses that can be exploited by applying subtle transformations to the input object. Although such transformations are intended to preserve semantics, they can render the file corrupt. Hence, unlike in the computer vision domain, integrity verification is vital when generating adversarial malware examples. Many approaches have been explored in the literature; however, most of them either overestimate how well their transformations preserve semantics or achieve only modest evasion rates across general files. We therefore present AIMED-RL: Automatic Intelligent Malware modifications to Evade Detection using Reinforcement Learning. Our approach generates adversarial examples that lead machine learning models to misclassify malware files without compromising their functionality. We implement it with a Distributional Double Deep Q-Network agent, adding a penalty to improve the diversity of transformations. We thereby achieve results competitive with previous reinforcement-learning-based research while minimizing the required sequence of transformations.
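The diversity penalty mentioned above can be illustrated with a minimal, hypothetical sketch: the agent's base reward (e.g. a change in the detector's score) is discounted when the chosen sequence of transformations keeps repeating the same action. The function name, `penalty_weight` parameter, and action labels below are illustrative assumptions, not the paper's actual reward design.

```python
from collections import Counter

def diversity_penalized_reward(base_reward, actions_taken, penalty_weight=0.1):
    """Hypothetical sketch of a diversity-aware reward.

    Discounts the evasion reward in proportion to how often the agent
    repeats transformations it has already applied, nudging it toward
    shorter, more varied modification sequences.
    """
    counts = Counter(actions_taken)
    # Number of actions that merely repeat an earlier choice.
    repeats = sum(c - 1 for c in counts.values())
    penalty = penalty_weight * repeats / max(len(actions_taken), 1)
    return base_reward - penalty

# Example: three of five transformations are identical, so the
# reward is discounted relative to a fully diverse sequence.
r = diversity_penalized_reward(
    1.0,
    ["upx_pack", "upx_pack", "upx_pack", "add_section", "rename_section"],
)
```

A repeated-action penalty of this shape leaves fully diverse sequences untouched while monotonically lowering the reward as repetitions accumulate.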
S. Franz—Work done at Research Institute CODE while a student at LMU Munich.
This research is partially supported by EC H2020 Project CONCORDIA GA 830927.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Labaca-Castro, R., Franz, S., Rodosek, G.D. (2021). AIMED-RL: Exploring Adversarial Malware Examples with Reinforcement Learning. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science(), vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86513-9
Online ISBN: 978-3-030-86514-6