Evading PDF Malware Classifiers with Generative Adversarial Network

Wang, Yaxiao; Li, Yuanzhang; Zhang, Quanxin; Hu, Jingjing; Kuang, Xiaohui

doi:10.1007/978-3-030-37337-5_30

Yaxiao Wang¹ (ORCID: orcid.org/0000-0002-7799-4715),
Yuanzhang Li¹,
Quanxin Zhang¹,
Jingjing Hu¹ &
Xiaohui Kuang²

Conference paper
First Online: 03 January 2020

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11982)

Abstract

Generative adversarial networks (GANs) have become one of the most popular research topics in deep learning. They are widely used in the image domain: through constant competition between a generator and a discriminator, a GAN can produce images so realistic that humans cannot distinguish them from real ones. However, although GANs have achieved great success in generating images, their use for generating adversarial malware examples is still in its infancy. In this paper, we propose a PDF malware evasion method that uses a GAN to generate adversarial PDF malware examples, and we evaluate it against four local machine-learning-based PDF malware classifiers. All evaluations are conducted on the same dataset of 100 malicious PDF files. The experimental results show that the proposed evasion attacks are effective: attacks against three of the classifiers attain a 100% evasion rate, and the attack against the fourth attains a 95% evasion rate on the evaluation dataset.
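The abstract does not define the evasion rate. As a minimal sketch, assuming it is the fraction of adversarial samples that a classifier labels as benign, the metric could be computed as below. The names `evasion_rate` and `toy_classifier` and the toy data are hypothetical illustrations, not taken from the paper.

```python
from typing import Callable, Sequence


def evasion_rate(samples: Sequence[bytes],
                 classify: Callable[[bytes], bool]) -> float:
    """Return the fraction of samples that the classifier fails to flag.

    Assumption: `classify` returns True when a sample is flagged as
    malicious, and every sample in `samples` is an adversarial variant
    of a malicious file.
    """
    if not samples:
        raise ValueError("need at least one sample")
    evaded = sum(1 for sample in samples if not classify(sample))
    return evaded / len(samples)


# Toy stand-in data: 100 placeholder "files", 5 of which are still detected.
samples = [f"sample-{i}".encode() for i in range(100)]
detected = set(samples[:5])


def toy_classifier(pdf: bytes) -> bool:
    # Hypothetical classifier that flags only the 5 samples in `detected`.
    return pdf in detected


rate = evasion_rate(samples, toy_classifier)
print(f"evasion rate: {rate:.0%}")
```

With 100 samples, each file that is still detected lowers the rate by one percentage point, so a 95% rate under this assumed definition corresponds to 5 detected files.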

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Nos. 61876019 and U1636213).

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
Yaxiao Wang, Yuanzhang Li, Quanxin Zhang & Jingjing Hu
National Key Laboratory of Science and Technology on Information System Security, Beijing, 100081, China
Xiaohui Kuang

Corresponding author

Correspondence to Yaxiao Wang.

Editor information

Editors and Affiliations

Rutgers University, Newark, NJ, USA
Jaideep Vaidya
Beihang University, Beijing, China
Xiao Zhang
Guangzhou University, Guangzhou, China
Jin Li

About this paper

Cite this paper

Wang, Y., Li, Y., Zhang, Q., Hu, J., Kuang, X. (2019). Evading PDF Malware Classifiers with Generative Adversarial Network. In: Vaidya, J., Zhang, X., Li, J. (eds) Cyberspace Safety and Security. CSS 2019. Lecture Notes in Computer Science, vol 11982. Springer, Cham. https://doi.org/10.1007/978-3-030-37337-5_30

DOI: https://doi.org/10.1007/978-3-030-37337-5_30
Published: 03 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37336-8
Online ISBN: 978-3-030-37337-5
eBook Packages: Computer Science, Computer Science (R0)
