A Review of Phishing URL Detection Using Machine Learning Classifiers

  • Conference paper
Intelligent Systems and Applications (IntelliSys 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1251)

Abstract

Phishing is a rapidly growing threat on the modern internet, in which a phisher mimics a legitimate web page to lure users into a trap. The phisher's aim is to obtain sensitive information about the user, such as credit card details, email addresses and passwords. The Anti-Phishing Working Group recently reported that 86,276 unique phishing URLs were detected in September 2019. To counter this threat, different methods and techniques have been proposed, including blacklists, whitelists, URL heuristics, content-based analysis, image processing and Machine Learning. Machine Learning (ML) is a modern technique whose algorithms offer better efficiency, accuracy and performance. This paper therefore critically reviews and evaluates ML-based classifiers on the basis of the datasets used, the feature extraction techniques and the performance measures used for the detection of phishing URLs. The literature review reveals that every ML-based approach has its advantages and limitations; hence, recommending one classifier over another is challenging. However, Random Forest (RF) and Support Vector Machine (SVM) are the most commonly used classifiers in the literature covered by this study: RF achieved the highest accuracy using a larger dataset, whereas SVM achieved the second highest accuracy using a smaller dataset. This research concludes that RF is an efficient approach for the detection of phishing URLs. Finally, insights are provided in the critical evaluation section, and this research proposes extracted features that can extend the literature and support the development of an effective phishing URL detection method.
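To make the two evaluation axes named in the abstract more concrete (feature extraction from URLs and performance measures), the following is a minimal Python sketch using only the standard library. The feature set and the function names `extract_url_features` and `binary_metrics` are illustrative assumptions; they are not the features proposed in the paper, which are not listed in this abstract.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url):
    """Return a small, hypothetical set of lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc

    # A raw IP address in place of a domain name is a common heuristic signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
    }


def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall, with 1 = phishing and 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero on empty or one-sided inputs.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}
```

In a pipeline of the kind the abstract reviews, dictionaries like the one returned by `extract_url_features` would be converted to numeric vectors and passed to a classifier such as RF or SVM, and the classifier's predictions on held-out URLs would then be scored with measures like those in `binary_metrics`.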

Author information

Corresponding author

Correspondence to Sajjad Jalil.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Jalil, S., Usman, M. (2021). A Review of Phishing URL Detection Using Machine Learning Classifiers. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_47
