Evolutionary Optimization of Neuro-Symbolic Integration for Phishing URL Detection

Park, Kyoung-Won; Bu, Seok-Jun; Cho, Sung-Bae

doi:10.1007/978-3-030-86271-8_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12886)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

A phishing attack is a type of cybersecurity attack that uses URLs leading to phishing sites in order to steal credentials and personal information. Because traditional deep learning is limited in detecting phishing URLs from the linguistic features of URLs alone, attempts have been made to catch the misclassified URLs by integrating security expert knowledge with deep learning. In this paper, a genetic algorithm is proposed to find the optimal combination of logic-programmed constraints and deep learning from 13 given components: 12 rule-based symbolic components and one neural component. The genetic algorithm explores the large search space of combinations of the 12 rules with deep learning to obtain an optimal combination of the components. Experiments with 10-fold cross-validation on three different real-world datasets show that the proposed method outperforms the state-of-the-art \(\beta\)-discrepancy integration approach, improving accuracy by 1.47% and recall by 2.82%. In addition, a post-analysis of the proposed method is performed to justify the feasibility of phishing URL detection by analyzing URLs that are misclassified by either the neural or the symbolic network.
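
The search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over 13-bit component masks (12 rule bits plus one neural bit). Only the component count comes from the abstract. The selection scheme (tournament with elitism), one-point crossover, mutation rate, population size, and the `toy_fitness` function are all assumptions made for illustration; in practice the fitness would presumably be a validation metric of the integrated model, which is not reproduced here.

```python
import random
from typing import Callable, List, Tuple

# 12 rule-based components + 1 neural component, as stated in the abstract.
N_COMPONENTS = 13

Mask = Tuple[int, ...]  # 1 = component included, 0 = component excluded


def evolve(fitness: Callable[[Mask], float],
           pop_size: int = 20,
           generations: int = 30,
           mutation_rate: float = 0.05,
           seed: int = 0) -> Tuple[Mask, List[float]]:
    """Search component masks with a simple elitist genetic algorithm.

    Returns the best mask found and the best fitness per generation.
    All hyperparameters are illustrative assumptions, not the paper's.
    """
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(N_COMPONENTS))
                  for _ in range(pop_size)]
    history: List[float] = []

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = [ranked[0]]  # elitism: carry the best mask over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three randomly drawn masks.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, N_COMPONENTS)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b
                          for b in child)
            next_gen.append(child)
        population = next_gen

    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(mask: Mask) -> float:
    """Hypothetical stand-in for validation accuracy.

    Rewards agreement with a made-up target mask; it has no connection
    to the rules or datasets used in the paper.
    """
    target = (1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1)
    return sum(int(a == b) for a, b in zip(mask, target)) / N_COMPONENTS


if __name__ == "__main__":
    best, history = evolve(toy_fitness)
    print("best mask:", best)
    print("best fitness per generation:", history)
```

Because the best mask is copied unchanged into each new generation, the recorded best fitness never decreases. Replacing `toy_fitness` with a function that trains or evaluates the selected components would turn the sketch into an actual component search, at a much higher cost per evaluation.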

Acknowledgement

This work was supported by an IITP grant funded by the Korean MSIT (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and a grant funded by the Air Force Research Laboratory, USA.

Author information

Authors and Affiliations

Department of Artificial Intelligence, Yonsei University, Seoul, 03722, Korea
Kyoung-Won Park & Sung-Bae Cho
Department of Computer Science, Yonsei University, Seoul, 03722, Korea
Seok-Jun Bu & Sung-Bae Cho

Corresponding authors

Correspondence to Kyoung-Won Park, Seok-Jun Bu or Sung-Bae Cho.

Editor information

Editors and Affiliations

University of Deusto, Bilbao, Spain
Hugo Sanjurjo González
University of Deusto, Bilbao, Spain
Iker Pastor López
University of Deusto, Bilbao, Spain
Pablo García Bringas
University of A Coruña, A Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Park, KW., Bu, SJ., Cho, SB. (2021). Evolutionary Optimization of Neuro-Symbolic Integration for Phishing URL Detection. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science, vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_8

DOI: https://doi.org/10.1007/978-3-030-86271-8_8
Published: 15 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86270-1
Online ISBN: 978-3-030-86271-8
eBook Packages: Computer Science, Computer Science (R0)