Effective Malicious URL Detection by Using Generative Adversarial Networks

  • Conference paper
Web Engineering (ICWE 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13362))

Abstract

Malicious URLs, a.k.a. malicious websites, pose a great threat to Web security. In particular, concept drift caused by variants of malicious URLs degrades the performance of existing detection methods, which rely on the available attack patterns. In this paper, we conduct an extensive measurement study of real-world URLs and find that hierarchical semantics features are suitable for identifying malicious URLs. We therefore propose URLGAN, a deep neural network model equipped with hierarchical semantics features, to distinguish between malicious and normal URLs. First, we embed the entire URL into a hierarchical semantics structure. Second, hierarchical semantics features are extracted from this structure with BERT. Then, the extracted features are combined with similar but slightly different features produced by the generator, which enables the condition discriminator to extract the essential difference between normal and malicious URLs. Notably, the features produced by the generator make the system more robust in detecting malicious URL variants. Extensive experiments on a public dataset and on our own data collected from specific targets demonstrate that our method outperforms other methods and protects specific targets from malicious URLs.
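
The abstract does not spell out which levels make up the hierarchical semantics structure. As a rough illustration of the first step only, the sketch below splits a URL into coarse levels (scheme, host, path, query, fragment) and tokenizes each level using the Python standard library. The function name `split_url_hierarchy`, the choice of levels, and the example URL are assumptions made for illustration; they are not taken from the paper and do not reproduce URLGAN's embedding.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url_hierarchy(url: str) -> dict:
    """Split a URL into coarse levels and tokenize each level.

    Hypothetical helper: the levels chosen here are an assumption,
    not the hierarchy defined by the URLGAN authors.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "scheme": [parts.scheme] if parts.scheme else [],
        # Host labels ordered from the top-level domain down to the subdomain.
        "host": list(reversed(host.split("."))) if host else [],
        # Non-empty path segments, in order.
        "path": [seg for seg in parts.path.split("/") if seg],
        # Query parameters as "key=value" tokens.
        "query": [f"{k}={v}" for k, v in parse_qsl(parts.query, keep_blank_values=True)],
        "fragment": [parts.fragment] if parts.fragment else [],
    }


if __name__ == "__main__":
    example = "http://login.example.com/account/verify.php?id=42&next=home"
    for level, tokens in split_url_hierarchy(example).items():
        print(level, tokens)
```

In a pipeline like the one the abstract describes, per-level token lists of this kind could be passed to a BERT tokenizer and encoder to obtain features. That step is omitted here because the abstract gives no details about it. Note that `urlsplit` only recognizes the host when the URL includes a scheme or a leading `//`.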

This work is supported by the National Key Research and Development Program of China (Grant No. 2018YFB0804704), and the National Natural Science Foundation of China (Grant No. U1736218).

Corresponding author

Correspondence to Shuhao Li.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Geng, J., Li, S., Liu, Z., Cheng, Z., Fan, L. (2022). Effective Malicious URL Detection by Using Generative Adversarial Networks. In: Di Noia, T., Ko, IY., Schedl, M., Ardito, C. (eds) Web Engineering. ICWE 2022. Lecture Notes in Computer Science, vol 13362. Springer, Cham. https://doi.org/10.1007/978-3-031-09917-5_23

  • DOI: https://doi.org/10.1007/978-3-031-09917-5_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09916-8

  • Online ISBN: 978-3-031-09917-5

  • eBook Packages: Computer Science, Computer Science (R0)
