Abstract
Malicious URL, a.k.a. malicious website, pose a great threat to Web security. In particular, concept drift caused by variants of malicious URL degrades the performance of existing detection methods based on the available attack patterns. In this paper, We conduct an extensive measurement study of the realistic URL and find that the hierarchical semantics feature is suitable for identifying malicious URL. Therefore, we propose URLGAN, a deep neural network model equipped with the hierarchical semantics features, to detect distinguish between malicious and normal URL. Firstly, we embed the entire URL into a hierarchical semantics structure. Secondly, hierarchical semantics features are extracted from the hierarchical semantics structure through BERT. Then, the extracted features are combined with features generated by the generator, similar but slightly different, to enable the condition discriminator to extract the essential difference between normal and malicious URL. Notably, with the features generated by the generator, we enhance the robustness of the system to detect malicious URL variants. Extensive experiments on the public dataset and our data collected from specific targets demonstrate that our method achieves superior performance to other methods and protects specific targets from the susceptibility of malicious URL.
This work is supported by the National Key Research and Development Program of China (Grant No. 2018YFB0804704), and the National Natural Science Foundation of China (Grant No. U1736218).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Survey shows phishing attacks are up and few are spared. https://mypage.webroot.com/rs/557-FSI-195/images/IDG_Report_Increased_Phising_Attacks.pdf. Accessed 11 Feb 2022
Afzal, S., Asim, M., Javed, A.R., Beg, M.O., Baker, T.: URLdeepDetect: a deep learning approach for detecting malicious URLs using semantic vector models. J. Netw. Syst. Manage. 29(3), 1–27 (2021). https://doi.org/10.1007/s10922-021-09587-8
Anand, A., Gorde, K., Moniz, J.R.A., Park, N., Chakraborty, T., Chu, B.T.: Phishing URL detection with oversampling based on text generative adversarial networks. In: 2018 IEEE International Conference on Big Data (Big Data), pp. 1168–1177. IEEE (2018)
Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 1–37 (2014)
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27 (2014)
Kolari, P., Finin, T., Joshi, A., et al.: SVMs for the blogosphere: blog identification and splog detection. In: AAAI Spring Symposium on Computational Approaches to Analysing Weblogs (2006)
Le, A., Markopoulou, A., Faloutsos, M.: PhishDef: URL names say it all. In: 2011 Proceedings IEEE INFOCOM, pp. 191–195. IEEE (2011)
Le, H., Pham, Q., Sahoo, D., Hoi, S.C.: URLNet: learning a URL representation with deep learning for malicious URL detection. arXiv preprint arXiv:1802.03162 (2018)
Liu, Z., Li, S., Zhang, Y., Yun, X., Cheng, Z.: Efficient malware originated traffic classification by using generative adversarial networks. In: 2020 IEEE Symposium on Computers and Communications (ISCC), pp. 1–7. IEEE (2020)
Ma, J., Saul, L.K., Savage, S., Voelker, G.M.: Identifying suspicious URLs: an application of large-scale online learning. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 681–688 (2009)
Mamun, M.S.I., Rathore, M.A., Lashkari, A.H., Stakhanova, N., Ghorbani, A.A.: Detecting malicious URLs using lexical analysis. In: Chen, J., Piuri, V., Su, C., Yung, M. (eds.) NSS 2016. LNCS, vol. 9955, pp. 467–482. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46298-1_30
Marchal, S., François, J., State, R., Engel, T.: PhishStorm: detecting phishing with streaming analytics. IEEE Trans. Netw. Serv. Manage. 11(4), 458–471 (2014)
Patil, D.R., Patil, J.: Survey on malicious web pages detection techniques. Int. J. u- e-Serv. Sci. Technol. 8(5), 195–206 (2015)
Patil, P., Rane, R., Bhalekar, M.: Detecting spam and phishing mails using SVM and obfuscation URL detection algorithm. In: 2017 International Conference on Inventive Systems and Control (ICISC), pp. 1–4. IEEE (2017)
Prakash, P., Kumar, M., Kompella, R.R., Gupta, M.: PhishNet: predictive blacklisting to detect phishing attacks. In: 2010 Proceedings IEEE INFOCOM, pp. 1–5. IEEE (2010)
Sahoo, D., Liu, C., Hoi, S.C.: Malicious URL detection using machine learning: a survey. arXiv preprint arXiv:1701.07179 (2017)
Yun, X., Huang, J., Wang, Y., Zang, T., Zhou, Y., Zhang, Y.: Khaos: an adversarial neural network DGA with high anti-detection ability. IEEE Trans. Inf. Forensics Secur. 15, 2225–2240 (2019)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Geng, J., Li, S., Liu, Z., Cheng, Z., Fan, L. (2022). Effective Malicious URL Detection by Using Generative Adversarial Networks. In: Di Noia, T., Ko, IY., Schedl, M., Ardito, C. (eds) Web Engineering. ICWE 2022. Lecture Notes in Computer Science, vol 13362. Springer, Cham. https://doi.org/10.1007/978-3-031-09917-5_23
Download citation
DOI: https://doi.org/10.1007/978-3-031-09917-5_23
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09916-8
Online ISBN: 978-3-031-09917-5
eBook Packages: Computer ScienceComputer Science (R0)