ABSTRACT
With the widespread application of machine learning to malware detection, researchers have proposed various adversarial attack methods that generate adversarial examples (AEs) of malware in order to evade detection. Previous studies have shown that a reinforcement learning (RL) framework can enable black-box attacks by performing a sequence of function-preserving operations, producing functional, evasive malware samples. However, in the black-box scenario it is difficult to obtain useful guidance and feedback from the environment for agent training, so the RL framework fails to learn an effective evasion policy. In this paper, we propose the Shapley prior and establish a prior-guided RL framework, namely PSP-Mal, to generate AEs against Portable Executable (PE) malware detectors. Our framework improves on existing methods in three aspects: 1) We explore the feature effects of the black-box model by computing Shapley values, and further propose the Shapley prior to represent the expected impact of operations. 2) We establish a novel prioritized experience utilization mechanism in the RL framework, guided by the Shapley prior. 3) We expand the actions into item-content pairs and use Thompson sampling to choose effective content, which helps to reduce randomness and ensure repeatability. We compare the attack performance of our framework with that of other methods, and the experimental results demonstrate that our algorithm is more effective. The evasion rates of PSP-Mal against the LightGBM models trained on EMBER and SOREL-20M reach 76.88% and 72.03%, respectively.
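The third aspect, choosing content by Thompson sampling, can be illustrated with a minimal sketch. It assumes a Beta-Bernoulli formulation in which each candidate content keeps a count of evasion successes and failures; the abstract does not specify the reward model, so this formulation, the class name `BetaThompsonSampler`, the helper `simulate`, and the content labels are all hypothetical and are not taken from PSP-Mal.

```python
import random


class BetaThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of candidate contents.

    Illustrative assumption only: the abstract does not describe how PSP-Mal
    models the reward of each content.
    """

    def __init__(self, contents, seed=None):
        self.contents = list(contents)
        # Beta(1, 1) prior: one pseudo-success and one pseudo-failure each.
        self.successes = {c: 1 for c in self.contents}
        self.failures = {c: 1 for c in self.contents}
        self.rng = random.Random(seed)

    def choose(self):
        # Draw one sample from each posterior and pick the largest draw.
        draws = {
            c: self.rng.betavariate(self.successes[c], self.failures[c])
            for c in self.contents
        }
        return max(draws, key=draws.get)

    def update(self, content, evaded):
        # A binary outcome updates the matching posterior count.
        if evaded:
            self.successes[content] += 1
        else:
            self.failures[content] += 1


def simulate(true_rates, rounds=2000, seed=0):
    """Run the sampler against simulated (made-up) evasion probabilities."""
    sampler = BetaThompsonSampler(true_rates.keys(), seed=seed)
    env = random.Random(seed + 1)
    counts = {c: 0 for c in true_rates}
    for _ in range(rounds):
        content = sampler.choose()
        counts[content] += 1
        sampler.update(content, env.random() < true_rates[content])
    return counts


if __name__ == "__main__":
    # Hypothetical content labels and success rates, for illustration only.
    rates = {"random_bytes": 0.1, "benign_strings": 0.2, "benign_section": 0.8}
    print(simulate(rates))
```

In this toy setting the sampler concentrates its choices on the content with the highest simulated success rate as evidence accumulates, which is the sense in which posterior sampling can reduce the randomness of content selection compared with picking content uniformly.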
Index Terms
- PSP-Mal: Evading Malware Detection via Prioritized Experience-based Reinforcement Learning with Shapley Prior