DOI: 10.1145/3627106.3627178
research-article
Results Reproduced / v1.1

PSP-Mal: Evading Malware Detection via Prioritized Experience-based Reinforcement Learning with Shapley Prior

Published: 04 December 2023

ABSTRACT

With the widespread application of machine learning techniques in malware detection, researchers have proposed various adversarial attack methods that generate adversarial examples (AEs) of malware in order to evade detection. Previous studies have shown that a reinforcement learning (RL) framework can enable black-box attacks by performing a sequence of function-preserving operations, which produces functional, evasive malware samples. However, in the black-box scenario it is difficult to obtain useful guidance and feedback from the environment for agent training, so the RL framework is unable to learn an effective evasion policy. In this paper, we propose the Shapley prior and establish a prior-guided RL framework, namely PSP-Mal, to generate AEs against Portable Executable (PE) malware detectors. Our framework improves on existing methods in three aspects: 1) We explore the feature effects of the black-box model by computing Shapley values and further propose the Shapley prior to represent the expected impact of operations. 2) We establish a novel prioritized experience utilization mechanism in the RL framework, guided by the Shapley prior. 3) We expand the actions into item-content pairs and use Thompson sampling to choose effective content, which helps to reduce randomness and ensure repeatability. We compare the attack performance of our framework with that of other methods, and experimental results demonstrate that our algorithm is more effective. The evasion rates of PSP-Mal against the LightGBM models trained on EMBER and SOREL-20M reach 76.88% and 72.03%, respectively.
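
As an illustration of the third aspect, the following is a minimal sketch of Thompson sampling over the candidate contents of a single action item. It is not the authors' implementation: the Beta-Bernoulli model, the class name `ThompsonContentPicker`, the helper `simulate`, the content names, and the simulated evasion rates are all assumptions made for this example.

```python
import random


class ThompsonContentPicker:
    """Beta-Bernoulli Thompson sampling over the contents of one action item.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, contents, seed=None):
        self.contents = list(contents)
        # Beta(1, 1) prior: one pseudo-success and one pseudo-failure each.
        self.successes = {c: 1 for c in self.contents}
        self.failures = {c: 1 for c in self.contents}
        self.rng = random.Random(seed)

    def pick(self):
        # Draw one sample from each content's Beta posterior and
        # return the content with the largest draw.
        draws = {
            c: self.rng.betavariate(self.successes[c], self.failures[c])
            for c in self.contents
        }
        return max(draws, key=draws.get)

    def update(self, content, evaded):
        # Record whether applying this content led to an evasion.
        if evaded:
            self.successes[content] += 1
        else:
            self.failures[content] += 1


def simulate(true_rates, rounds=2000, seed=0):
    """Run the picker against a simulated detector.

    `true_rates` maps each content to an assumed evasion probability;
    a real attack would query the black-box detector instead.
    """
    picker = ThompsonContentPicker(true_rates, seed=seed)
    env = random.Random(seed + 1)
    counts = {c: 0 for c in true_rates}
    for _ in range(rounds):
        content = picker.pick()
        counts[content] += 1
        picker.update(content, env.random() < true_rates[content])
    return counts


if __name__ == "__main__":
    # Made-up contents and rates for a hypothetical "append overlay" item.
    rates = {"benign_strings": 0.7, "random_bytes": 0.2, "zero_padding": 0.1}
    print(simulate(rates))
```

Under these assumed rates, the picker should concentrate most of its choices on the content with the highest simulated evasion rate as evidence accumulates, while still occasionally trying the others. Fixing the seed makes a run repeatable, which is in the spirit of the repeatability goal stated above.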

        Published in

        ACSAC '23: Proceedings of the 39th Annual Computer Security Applications Conference
        December 2023
        836 pages
        ISBN: 9798400708862
        DOI: 10.1145/3627106

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 104 of 497 submissions, 21%