Abstract
Dynamic web pages are widely used in web applications to provide a better user experience. Meanwhile, web applications have become a primary target for cybercriminals, who inject malware, especially JavaScript, to perform malicious activities through impersonation. Thus, to protect users from attacks, it is necessary to detect malicious code before it is executed. Because the variety of malicious code grows quickly, traditional static and dynamic approaches struggle to detect new styles of malicious code. In recent years, machine learning has been applied to malicious code identification. However, good performance requires a large number of labeled samples, which are difficult to acquire. This paper proposes an efficient method, based on Generative Adversarial Networks (GAN), for improving classifiers’ recognition rate in detecting malicious JavaScript. The output from the GAN is used to train classifiers. Experimental results show that our method achieves better accuracy with a limited set of labeled samples.
The work described in this paper is supported by the National Natural Science Foundation of China under Grant No. 61702029, No. 61872026 and No. 61672085.
Notes
- 1. The API features used in this paper are listed at https://github.com/shi13san/proxy/blob/master/Features-javascript%20apis.pdf.
- 2. We share at https://github.com/shi13san/proxy.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, J., Cao, Q., Zhao, R., Li, Z. (2020). Improving Detection Accuracy for Malicious JavaScript Using GAN. In: Bielikova, M., Mikkonen, T., Pautasso, C. (eds) Web Engineering. ICWE 2020. Lecture Notes in Computer Science, vol 12128. Springer, Cham. https://doi.org/10.1007/978-3-030-50578-3_12
DOI: https://doi.org/10.1007/978-3-030-50578-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50577-6
Online ISBN: 978-3-030-50578-3
eBook Packages: Computer Science (R0)