Abstract
Phishing is evolving rapidly and causes substantial damage to finances, brand reputation, and privacy. Many phishing detection methods have been proposed as phishing has grown, but open research issues remain. Phishing websites steal users’ information mainly through visual deception, and deep learning methods have proved very effective in computer vision applications, yet little research applies deep learning to visual analysis of phishing sites. Moreover, most studies use balanced datasets, which does not reflect a real Web environment. This paper therefore proposes a security indicator area (SIA), a region that contains most of the security indicators designed to help users identify phishing sites. The proposed method takes screenshots of the SIA and uses a convolutional neural network (CNN) as the classifier. To evaluate the proposed method, this paper carries out several comparative experiments on an unbalanced dataset containing far fewer phishing sites than legitimate ones, which makes detection harder but brings the evaluation closer to reality. The results show that the proposed method achieves the highest F1-score among the compared methods, while also offering advantages in detection efficiency and data expansibility.
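As an illustration of why the F1-score is a reasonable metric for an unbalanced dataset, the following minimal sketch compares accuracy and F1 on toy labels. The class ratio (95 legitimate, 5 phishing), the two example predictors, and the helper names `accuracy` and `f1_score` are hypothetical and chosen only for illustration; they are not the dataset, models, or code used in the paper.

```python
# Toy comparison of accuracy and F1 on an unbalanced label set.
# All data below is hypothetical; 1 = phishing, 0 = legitimate.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Assumed class ratio: 95 legitimate sites followed by 5 phishing sites.
y_true = [0] * 95 + [1] * 5

# Predictor A: labels every site as legitimate.
always_legit = [0] * 100

# Predictor B: raises 2 false alarms and catches 4 of the 5 phishing sites.
detector = [0] * 93 + [1] * 2 + [1] * 4 + [0] * 1

if __name__ == "__main__":
    for name, pred in [("always_legit", always_legit), ("detector", detector)]:
        print(name, "accuracy:", accuracy(y_true, pred), "F1:", f1_score(y_true, pred))
```

In this toy setting, the predictor that never flags a phishing site still reaches high accuracy because legitimate sites dominate, while its F1-score for the phishing class is zero. That gap is the reason an F1-based comparison is more informative when phishing sites are rare.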
Funding
This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00796, Research on Foundational Technologies for 6G Autonomous Security-by-Design to Guarantee Constant Quality of Security).
Author information
Contributions
Liu analyzed the visual counterfeiting used by phishing websites and proposed the Security Indicator Area (SIA) as the model input, which enables visual analysis and makes the input interpretable. Liu and Lee carried out several comparative experiments on a constructed unbalanced dataset. Liu and Lee reviewed the manuscript. Lee is the corresponding author of this paper.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, DJ., Lee, JH. A CNN-Based SIA Screenshot Method to Visually Identify Phishing Websites. J Netw Syst Manage 32, 8 (2024). https://doi.org/10.1007/s10922-023-09784-7