An Improved Ensemble Deep Learning Model Based on CNN for Malicious Website Detection

Do, Nguyet Quang; Selamat, Ali; Lim, Kok Cheng; Krejcar, Ondrej

doi:10.1007/978-3-031-08530-7_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13343)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Malicious websites, also known as phishing websites, remain a major concern in cybersecurity. Among the many deep learning-based solutions for phishing website detection, the Convolutional Neural Network (CNN) is one of the most popular. However, when used as a stand-alone classifier, a CNN still falls short in accuracy. The main objective of this paper is therefore to explore hybridizing CNN with another deep learning algorithm to address this problem. In this study, CNN was combined with a Bidirectional Gated Recurrent Unit (BiGRU) to construct an ensemble model for malicious webpage classification. The performance of the proposed CNN-BiGRU model was evaluated against several deep learning approaches on the same dataset. The results indicated that the proposed CNN-BiGRU is a promising solution for malicious website detection. In addition, ensemble architectures outperformed single models because they combined the advantages and offset the disadvantages of the individual deep learning algorithms.
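To make the idea of combining a CNN with a BiGRU concrete, the following is a minimal PyTorch sketch of one plausible arrangement: a convolutional layer extracts local patterns, and a bidirectional GRU then reads the resulting feature sequence in both directions. It is not the authors' architecture. The character-level URL input, the class name `CNNBiGRU`, the helper `encode_url`, the example URLs, and every layer size are assumptions made for illustration, since this record does not describe the model's inputs or hyperparameters.

```python
import torch
import torch.nn as nn


class CNNBiGRU(nn.Module):
    """Hypothetical CNN-BiGRU binary classifier (all sizes are assumptions)."""

    def __init__(self, vocab_size=128, embed_dim=32, conv_channels=64,
                 kernel_size=3, gru_hidden=64):
        super().__init__()
        # Index 0 is reserved for padding.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # 1-D convolution over the character sequence captures local n-gram patterns.
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size,
                              padding=kernel_size // 2)
        self.pool = nn.MaxPool1d(2)
        # Bidirectional GRU reads the pooled feature sequence in both directions.
        self.gru = nn.GRU(conv_channels, gru_hidden,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * gru_hidden, 1)

    def forward(self, x):
        e = self.embed(x)                              # (batch, seq, embed)
        c = torch.relu(self.conv(e.transpose(1, 2)))   # (batch, channels, seq)
        p = self.pool(c).transpose(1, 2)               # (batch, seq // 2, channels)
        _, h = self.gru(p)                             # h: (2, batch, hidden)
        feat = torch.cat([h[0], h[1]], dim=1)          # forward + backward final states
        return self.out(feat).squeeze(1)               # one logit per input


def encode_url(url, max_len=64):
    """Assumed encoding: clamp code points to 127, truncate, pad with 0."""
    ids = [min(ord(ch), 127) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


model = CNNBiGRU()
batch = torch.tensor([encode_url("http://example.com/login"),
                      encode_url("https://example.org")])
logits = model(batch)
probs = torch.sigmoid(logits)  # untrained, so these values are not meaningful
```

The model here is untrained, so the probabilities only demonstrate the data flow. In practice it would be fitted with a binary cross-entropy loss such as `nn.BCEWithLogitsLoss` on labelled samples.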

Acknowledgement

The authors sincerely thank Universiti Teknologi Malaysia (UTM) for supporting the completion of this research under Malaysia Research University Network (MRUN) Vot 4L876. This work was also partially supported/funded by the Ministry of Higher Education under the Fundamental Research Grant Scheme (FRGS/1/2018/ICT04/UTM/01/1) and Universiti Tenaga Nasional (UNITEN). The work and the contribution were also supported by the SPEV project “Smart Solutions in Ubiquitous Computing Environments”, University of Hradec Kralove, Faculty of Informatics and Management, Czech Republic (under ID: UHK-FIM-SPEV-2022–2102). We are also grateful for the support of student Michal Dobrovolny in consultations regarding application aspects.

Author information

Authors and Affiliations

Malaysia-Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Nguyet Quang Do, Ali Selamat & Ondrej Krejcar
School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia and Media and Games Center of Excellence (MagicX), Universiti Teknologi Malaysia, Johor Bahru, Malaysia
Ali Selamat
Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Rokitanskeho 62, 500 03, Hradec Kralove, Czech Republic
Ali Selamat & Ondrej Krejcar
College of Computing and Informatics (CCI), Universiti Tenaga Nasional (UNITEN), Kajang, Malaysia
Kok Cheng Lim

Corresponding author

Correspondence to Ali Selamat.

Editor information

Editors and Affiliations

i-SOMET, Inc., Morioka-shi, Iwate, Japan
Hamido Fujita
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, China
Philippe Fournier-Viger
Texas State University, San Marcos, TX, USA
Moonis Ali
Shanghai University of Finance and Economics, Shanghai, China
Yinglin Wang

About this paper

Cite this paper

Do, N.Q., Selamat, A., Lim, K.C., Krejcar, O. (2022). An Improved Ensemble Deep Learning Model Based on CNN for Malicious Website Detection. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_42

DOI: https://doi.org/10.1007/978-3-031-08530-7_42
Published: 30 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science, Computer Science (R0)
