A CNN-Based SIA Screenshot Method to Visually Identify Phishing Websites

Liu, Dong-Jie; Lee, Jong-Hyouk

doi:10.1007/s10922-023-09784-7

Published: 21 November 2023

Journal of Network and Systems Management, volume 32, article number 8 (2024)

Dong-Jie Liu¹ & Jong-Hyouk Lee²

Abstract

Phishing is evolving rapidly and causes substantial damage to finances, brand reputation, and privacy. Many phishing detection methods have been proposed as phishing has grown, but open research issues remain. Phishing websites steal users’ information mainly through visual deception, and deep learning methods have proved very effective in computer vision applications, yet there is little research on visual analysis of phishing sites using deep learning algorithms. Moreover, most studies use balanced datasets, which do not reflect a real Web environment. This paper therefore proposes a security indicator area (SIA), which contains most of the security indicators designed to help users identify phishing sites. The proposed method takes screenshots of the SIA and uses a convolutional neural network (CNN) as a classifier. To demonstrate the efficiency of the proposed method, this paper carries out several comparative experiments on an unbalanced dataset with far fewer phishing sites than legitimate ones, which makes detection harder but also closer to reality. The results show that the proposed method achieves the highest F1-score among the compared methods, while offering advantages in detection efficiency and data expansibility.
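
The abstract evaluates on an unbalanced dataset and reports F1-score. The following minimal Python sketch illustrates why F1 is more informative than accuracy in that setting. The class sizes (50 phishing sites among 1000) and both sets of predictions are hypothetical values chosen for illustration; they are not taken from the paper's dataset or results.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 (phishing) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero (or undefined)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical unbalanced dataset: 50 phishing sites (1) and 950 legitimate sites (0).
y_true = [1] * 50 + [0] * 950

# Baseline that labels every site as legitimate.
always_legit = [0] * 1000

# Hypothetical detector: finds 40 of the 50 phishing sites and raises 10 false alarms.
detector = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 940

for name, pred in [("always legitimate", always_legit), ("detector", detector)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.3f}, F1={f1_score(y_true, pred):.3f}")
```

In this made-up example the trivial baseline reaches an accuracy of 0.95 without detecting a single phishing site, while its F1-score is 0. The hypothetical detector scores 0.98 on accuracy and 0.8 on F1, so F1 separates the two far more clearly than accuracy does.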

Funding

This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00796, Research on Foundational Technologies for 6G Autonomous Security-by-Design to Guarantee Constant Quality of Security).

Author information

Dong-Jie Liu and Jong-Hyouk Lee have contributed equally to this work.

Authors and Affiliations

College of Cyber Security, Jinan University, Guangzhou, China
Dong-Jie Liu
Sejong University, Seoul, Republic of Korea
Jong-Hyouk Lee

Contributions

Liu analyzed visual counterfeiting of phishing websites, and proposed the Security Indicator Area (SIA) as an input, which utilizes visual analysis and makes the input interpretable. Liu and Lee carried out several comparative experiments on a constructed unbalanced dataset. Liu and Lee reviewed the manuscript. Lee is the corresponding author of this paper.

Corresponding author

Correspondence to Jong-Hyouk Lee.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, DJ., Lee, JH. A CNN-Based SIA Screenshot Method to Visually Identify Phishing Websites. J Netw Syst Manage 32, 8 (2024). https://doi.org/10.1007/s10922-023-09784-7

Received: 30 June 2022
Revised: 12 February 2023
Accepted: 19 October 2023
Published: 21 November 2023
DOI: https://doi.org/10.1007/s10922-023-09784-7
