Deep Neural Network Based Phishing Classification on a High-Risk URL Dataset

  • Conference paper
Proceedings of the 12th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2020)

Abstract

Due to the growing trend of Internetization, the number of connected computers is increasing day by day. Almost all companies are moving their main operations from the real world to the cyberworld. Although this expands the firms' marketplace, it also introduces many vulnerabilities to cyber-attacks, especially given the anonymous structure of the Internet. Phishing is a popular attack type that exploits vulnerabilities arising from user unawareness. Some works in the literature rely on rule-based detection systems as a static prevention mechanism and on machine learning-based systems as dynamic prevention mechanisms. In this work, we implemented a deep neural network (DNN) based phishing detection system that analyzes the URLs of suspicious websites. In almost all previous research, the datasets were collected from different sources in which legitimate and phishing websites are clearly separated; in this research, we first create a high-risk dataset that contains only suspicious websites reported to the PhishTank website. Experiments showed that the proposed system performs very well in terms of both accuracy and execution time.
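To make the idea of URL-based DNN classification concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the features, layer sizes, or training procedure, so everything here is an assumption: the six lexical features in `url_features`, the two hidden layers of eight units, and the randomly initialised (untrained) weights are all hypothetical and chosen only for illustration. The example URL is likewise made up.

```python
import math
import random
from typing import List, Tuple
from urllib.parse import urlparse

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)

def url_features(url: str) -> List[float]:
    """Hypothetical lexical features; the paper's actual feature set is not given in the abstract."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return [
        len(url) / 100.0,                          # scaled URL length
        url.count(".") / 10.0,                     # scaled number of dots
        sum(ch.isdigit() for ch in url) / 20.0,    # scaled digit count
        1.0 if "@" in url else 0.0,                # '@' anywhere in the URL
        1.0 if "-" in host else 0.0,               # hyphen in the host name
        1.0 if parsed.scheme == "https" else 0.0,  # HTTPS scheme
    ]

def relu(z: float) -> float:
    return max(0.0, z)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def dense(x: List[float], layer: Layer, activation) -> List[float]:
    """One fully connected layer: activation(W x + b)."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Random, untrained weights (assumption: a real system would learn these from labelled URLs)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def phishing_score(url: str, layers: List[Layer]) -> float:
    """Forward pass: ReLU hidden layers, then a sigmoid output in (0, 1)."""
    x = url_features(url)
    for layer in layers[:-1]:
        x = dense(x, layer, relu)
    return dense(x, layers[-1], sigmoid)[0]

rng = random.Random(0)
layers = [init_layer(6, 8, rng), init_layer(8, 8, rng), init_layer(8, 1, rng)]
score = phishing_score("http://example.com/login", layers)
print(f"score = {score:.3f}")
```

Because the weights are untrained, the printed score carries no meaning; the sketch only shows the data flow from a URL string through feature extraction to a single probability-like output. In practice the weights would be fitted on labelled data such as the high-risk dataset the abstract describes.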

Author information

Corresponding author

Correspondence to Mehmet Korkmaz.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Korkmaz, M., Kocyigit, E., Sahingoz, O.K., Diri, B. (2021). Deep Neural Network Based Phishing Classification on a High-Risk URL Dataset. In: Abraham, A., et al. Proceedings of the 12th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2020). SoCPaR 2020. Advances in Intelligent Systems and Computing, vol 1383. Springer, Cham. https://doi.org/10.1007/978-3-030-73689-7_62
