Abstract
A phishing attack is a type of cybersecurity attack that uses URLs leading to phishing sites to steal credentials and personal information. Because traditional deep learning is limited when it must detect phishing URLs from their linguistic features alone, attempts have been made to catch the misclassified URLs by integrating security expert knowledge with deep learning. This paper proposes a genetic algorithm that searches for the optimal combination of logic-programmed constraints and deep learning over 13 given components: 12 rule-based symbolic components and one neural component. The genetic algorithm explores the large search space of combinations of the 12 rules with deep learning to find an optimal combination of the components. Experiments and 10-fold cross-validation on three different real-world datasets show that the proposed method outperforms the state-of-the-art \(\beta\)-discrepancy integration approach, improving accuracy by 1.47% and recall by 2.82%. In addition, a post-analysis of the URLs misclassified by either the neural or the symbolic network is performed to justify the feasibility of the proposed method for phishing URL detection.
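As an illustration of the search described above, the following is a minimal sketch of a genetic algorithm over 12-bit rule masks, where each bit switches one rule on or off alongside a fixed neural score. It is not the authors' implementation: the additive `combine` rule (a stand-in for the \(\beta\)-discrepancy integration), the `weight` value, the toy data generator, and all function names are assumptions introduced here for illustration only.

```python
import random

N_RULES = 12  # the 12 rule-based components named in the abstract


def combine(neural_score, rule_hits, mask, weight=0.1):
    """Hypothetical integration: each selected rule that fires adds a fixed bonus."""
    bonus = weight * sum(h for h, m in zip(rule_hits, mask) if m)
    return 1 if neural_score + bonus >= 0.5 else 0


def fitness(mask, samples):
    """Accuracy of the combined classifier on (neural_score, rule_hits, label) samples."""
    correct = sum(combine(s, hits, mask) == y for s, hits, y in samples)
    return correct / len(samples)


def make_toy_samples(n=200, seed=1):
    """Synthetic data (an assumption): rules 0-5 are informative, rules 6-11 are noise."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        score = min(1.0, max(0.0, rng.gauss(0.6 if label else 0.4, 0.15)))
        hits = [
            int(rng.random() < (0.7 if label else 0.1)) if i < 6
            else int(rng.random() < 0.3)
            for i in range(N_RULES)
        ]
        samples.append((score, hits, label))
    return samples


def evolve(samples, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Elitist GA: keep the top half, refill with one-point crossover plus bit-flip mutation."""
    rng = random.Random(seed)
    # Start from the neural-only baseline (no rules) plus random masks.
    pop = [[0] * N_RULES]
    pop += [[rng.randint(0, 1) for _ in range(N_RULES)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, samples), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, N_RULES)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, samples))


if __name__ == "__main__":
    data = make_toy_samples()
    best = evolve(data)
    print("selected rules:", [i for i, bit in enumerate(best) if bit])
    print("neural-only accuracy:", fitness([0] * N_RULES, data))
    print("evolved accuracy:", fitness(best, data))
```

Because the neural-only mask is placed in the initial population and the top half of each generation is carried over unchanged, the evolved combination in this sketch can never score below the neural-only baseline on the data used for fitness. Held-out evaluation, such as the 10-fold cross-validation mentioned above, would still be needed to measure generalization.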
Acknowledgement
This work was supported by an IITP grant funded by the Korean MSIT (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and a grant funded by Air Force Research Laboratory, USA.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Park, KW., Bu, SJ., Cho, SB. (2021). Evolutionary Optimization of Neuro-Symbolic Integration for Phishing URL Detection. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science, vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_8
DOI: https://doi.org/10.1007/978-3-030-86271-8_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86270-1
Online ISBN: 978-3-030-86271-8
eBook Packages: Computer Science, Computer Science (R0)