Recent Research on Phishing Detection Through Machine Learning Algorithm

  • Conference paper
Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices (IEA/AIE 2021)

Abstract

The rapid growth of emerging technologies, smart devices, and 5G communication has contributed to the accumulation of data, ushering in the big data era. Big data poses a variety of challenges for machine learning, especially in phishing detection. This paper therefore analyzes and summarizes current research on phishing detection through machine learning for big data. To this end, the study adopted a systematic literature review (SLR) technique and critically analyzed a total of 30 papers from various journals and conference proceedings. These papers were selected from five different databases and were published between 2018 and January 2021. The results reveal that only a limited number of research works have comprehensively reviewed the feasibility of applying both machine learning and big data technologies in the context of phishing detection.

Acknowledgement

This work was supported by the Ministry of Higher Education under the Fundamental Research Grant Scheme (FRGS/1/2018/ICT04/UTM/01/1). The authors sincerely thank Universiti Teknologi Malaysia (UTM) for supporting the completion of this research under Research University Grant Vot-20H04 and Malaysia Research University Network (MRUN) Vot 4L876.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Quang, D.N., Selamat, A., Krejcar, O. (2021). Recent Research on Phishing Detection Through Machine Learning Algorithm. In: Fujita, H., Selamat, A., Lin, J.CW., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12798. Springer, Cham. https://doi.org/10.1007/978-3-030-79457-6_42

  • DOI: https://doi.org/10.1007/978-3-030-79457-6_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79456-9

  • Online ISBN: 978-3-030-79457-6

  • eBook Packages: Computer Science, Computer Science (R0)
