Recent Research on Phishing Detection Through Machine Learning Algorithm

Quang, Do Nguyet; Selamat, Ali; Krejcar, Ondrej

doi:10.1007/978-3-030-79457-6_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12798)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

The rapid growth of emerging technologies, smart devices, 5G communication, and similar advances has contributed to the accumulation of data and ushered in the big data era. Big data poses a variety of challenges for machine learning, especially in phishing detection. This paper therefore analyzes and summarizes current research on phishing detection through machine learning for big data. To this end, the study adopted a systematic literature review (SLR) technique and critically analyzed 30 papers from various journals and conference proceedings. The papers were selected from previous studies indexed in five databases and published between 2018 and January 2021. The results reveal that only a limited number of research works have comprehensively reviewed the feasibility of applying both machine learning and big data technologies to phishing detection.

Acknowledgement

This work was supported by the Ministry of Higher Education under the Fundamental Research Grant Scheme (FRGS/1/2018/ICT04/UTM/01/1). The authors sincerely thank Universiti Teknologi Malaysia (UTM) for its support under Research University Grant Vot-20H04 and Malaysia Research University Network (MRUN) Vot 4L876 in completing this research.

Author information

Authors and Affiliations

Malaysia-Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Do Nguyet Quang & Ali Selamat
School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia and Media and Games Center of Excellence (MagicX), Universiti Teknologi Malaysia, Johor Bahru, Malaysia
Ali Selamat
Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Rokitanskeho 62, 500 03, Hradec Kralove, Czech Republic
Ali Selamat & Ondrej Krejcar

Editor information

Editors and Affiliations

i-SOMET Incorporate Association, Morioka, Japan
Hamido Fujita
Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Ali Selamat
Western Norway University of Applied Sciences, Bergen, Norway
Jerry Chun-Wei Lin
Texas State University San Marcos, San Marcos, TX, USA
Moonis Ali

About this paper

Cite this paper

Quang, D.N., Selamat, A., Krejcar, O. (2021). Recent Research on Phishing Detection Through Machine Learning Algorithm. In: Fujita, H., Selamat, A., Lin, J.CW., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12798. Springer, Cham. https://doi.org/10.1007/978-3-030-79457-6_42

DOI: https://doi.org/10.1007/978-3-030-79457-6_42
Published: 19 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79456-9
Online ISBN: 978-3-030-79457-6
eBook Packages: Computer Science, Computer Science (R0)
