Abstract
Malware designers have grown increasingly sophisticated, crafting polymorphic and metamorphic malware that employs obfuscation tricks such as packing and encryption to evade signature-based malware detection systems. Security professionals therefore harden their defenses with machine learning-based systems built on malware's dynamic behavioral features. These systems, however, are susceptible to adversarial inputs, and some malware designers exploit this vulnerability to bypass detection. In this work, we develop two approaches to evade machine learning-based classifiers. First, we create a Generative Adversarial Network (GAN)-based method, which we call Malware Evasion using GAN (MEGAN), along with an extended version, Malware Evasion using GAN with Reduced Perturbation (MEGAN-RP). Second, we develop a novel reinforcement learning-based approach called Malware Evasion using Reinforcement Agent (MERA). We generate adversarial malware that simultaneously minimizes the recall of a target classifier and the amount of perturbation applied to the actual malware to evade detection. We evaluate our work against 13 different black-box detection models, all of which use the dynamic presence or absence of API calls as features. Our approaches reduce the recall of almost all black-box models to zero. Further, MERA outperforms all the other models, reducing the True Positive Rate (TPR) to zero against every target model except the Decision Tree (DT), with minimum perturbation in 6 of the 13 target models. We also present experimental results on an adversarial-retraining defense and its evasion for the GAN-based strategies.
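The abstract's evasion setting can be sketched in miniature: malware is represented as a binary vector of API-call presence/absence, perturbations only *add* calls (a common assumption in this line of work, since removing calls could break the malware's functionality), and success is measured by how far the target model's recall (TPR) drops versus how many features were flipped. The sketch below is an illustrative toy, not the paper's MEGAN/MERA implementation; the black-box detection rule and all names here are hypothetical.

```python
import numpy as np

def perturb_additive(x, mask):
    """Add-only perturbation: union of original API calls and candidate
    additions, so every call the malware needs is preserved."""
    return np.maximum(x, mask)

def recall(model, X_mal):
    """Fraction of malware samples the target model still flags (its TPR)."""
    return float(np.mean(model(X_mal) == 1))

def perturbation_size(X, X_adv):
    """API-call features flipped 0 -> 1 per sample (L1 distance)."""
    return (X_adv - X).sum(axis=1)

# Hypothetical black box: flags a sample when a "suspicious" API (feature 0)
# appears without a "benign-looking" API (feature 2).
blackbox = lambda X: ((X[:, 0] == 1) & (X[:, 2] == 0)).astype(int)

X = np.array([[1, 0, 0, 1],
              [1, 1, 0, 0]])        # two malware feature vectors
noise = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0]])    # candidate additions (e.g. generator output)
X_adv = perturb_additive(X, noise)

print(recall(blackbox, X))                 # 1.0 before evasion
print(recall(blackbox, X_adv))             # 0.0 after adding feature 2
print(perturbation_size(X, X_adv).tolist())  # [2, 1] flipped features
```

In the paper's setting, the candidate additions come from a trained generator (MEGAN) or an RL agent (MERA) rather than being hand-picked, and the objective trades the recall drop against the perturbation count.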
Partially supported by SERB, Government of India.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Hariom, Handa, A., Kumar, N., Kumar Shukla, S. (2021). Adversaries Strike Hard: Adversarial Attacks Against Malware Classifiers Using Dynamic API Calls as Features. In: Dolev, S., Margalit, O., Pinkas, B., Schwarzmann, A. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2021. Lecture Notes in Computer Science(), vol 12716. Springer, Cham. https://doi.org/10.1007/978-3-030-78086-9_2
DOI: https://doi.org/10.1007/978-3-030-78086-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78085-2
Online ISBN: 978-3-030-78086-9
eBook Packages: Computer Science; Computer Science (R0)