Identification of phishing websites through hyperlink analysis and rule extraction

Chaoqun Wang (School of Information Management, Wuhan University, Wuhan, China)
Zhongyi Hu (School of Information Management, Wuhan University, Wuhan, China and Center for E-commerce Research and Development, Wuhan University, Wuhan, China)
Raymond Chiong (School of Electrical Engineering and Computing, The University of Newcastle, Callaghan, Australia)
Yukun Bao (School of Management, Huazhong University of Science and Technology, Wuhan, China)
Jiang Wu (School of Information Management, Wuhan University, Wuhan, China and Center for E-commerce Research and Development, Wuhan University, Wuhan, China)

The Electronic Library

ISSN: 0264-0473

Article publication date: 27 November 2020

Issue publication date: 12 December 2020

Abstract

Purpose

This study proposes an efficient rule extraction and integration approach for identifying phishing websites. The approach both elucidates the patterns that characterise phishing websites and identifies such websites accurately.

Design/methodology/approach

Hyperlink indicators and URL-based features are used to build the identification model. First, very simple rules are extracted from individual features, so that each rule is meaningful and easy to understand. Next, the F-measure is used to select high-quality rules for identifying phishing websites. Finally, the selected rules are integrated using a simple neural network to produce a reliable phishing website identification model.
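The three steps described above (single-feature rule extraction, F-measure-based selection, integration by a simple neural network) can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold-style rules, the use of F1 as the F-measure, the cut-off `min_f`, the single sigmoid unit standing in for the "simple neural network", and the toy feature values (share of external links, share of empty links, URL length) are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Rule = Tuple[int, float, bool]  # (feature index, threshold, fires_if_greater)

def rule_fires(rule: Rule, x: List[float]) -> int:
    idx, thr, greater = rule
    return int(x[idx] > thr) if greater else int(x[idx] <= thr)

def f_measure(preds: List[int], labels: List[int]) -> float:
    # F1 = 2TP / (2TP + FP + FN); assumed here as the F-measure variant.
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def extract_rules(X, y) -> List[Tuple[Rule, float]]:
    """Step 1: keep the best single-feature threshold rule per feature."""
    scored = []
    for idx in range(len(X[0])):
        values = sorted({row[idx] for row in X})
        best = None
        for thr in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for greater in (True, False):
                rule = (idx, thr, greater)
                score = f_measure([rule_fires(rule, row) for row in X], y)
                if best is None or score > best[1]:
                    best = (rule, score)
        if best is not None:
            scored.append(best)
    return scored

def select_rules(scored, min_f: float) -> List[Rule]:
    """Step 2: keep rules whose F-measure reaches the (hypothetical) cut-off."""
    return [rule for rule, score in scored if score >= min_f]

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_integrator(rules, X, y, epochs: int = 500, lr: float = 0.5):
    """Step 3: one sigmoid unit over the rule outputs, trained by SGD."""
    w, b = [0.0] * len(rules), 0.0
    fired = [[rule_fires(r, row) for r in rules] for row in X]
    for _ in range(epochs):
        for z, target in zip(fired, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - target
            w = [wi - lr * err * zi for wi, zi in zip(w, z)]
            b -= lr * err
    return w, b

def predict(rules, w, b, x) -> int:
    z = [rule_fires(r, x) for r in rules]
    return int(sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) >= 0.5)

# Toy data (invented): [external-link share, empty-link share, URL length]
X = [[0.9, 0.6, 75], [0.8, 0.5, 30], [0.7, 0.7, 82], [0.85, 0.4, 40],
     [0.2, 0.0, 28], [0.3, 0.1, 66], [0.1, 0.05, 35], [0.25, 0.0, 50]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = legitimate

rules = select_rules(extract_rules(X, y), min_f=0.9)
w, b = train_integrator(rules, X, y)
for idx, thr, greater in rules:
    print(f"IF feature[{idx}] {'>' if greater else '<='} {thr:.3f} THEN phishing")
print(predict(rules, w, b, [0.75, 0.55, 90]))
```

Because every input to the integrating unit is the output of a readable single-feature rule, the learned weights indicate how much each rule contributes to the final decision, which is one way the interpretability claimed for the approach could be preserved.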

Findings

Experiments conducted using self-collected and benchmark data sets show that the proposed approach outperforms 16 commonly used classifiers (including seven non–rule-based and four rule-based classifiers as well as five deep learning models) in terms of interpretability and identification performance.

Originality/value

The study is novel in using an efficient rule-based approach to investigate patterns of phishing websites based on hyperlink indicators. The approach not only helps to identify phishing websites but also yields simple and understandable rules.

Acknowledgements

This work was supported by the Natural Science Foundation of China (Grant Numbers 71601147 and 71874131) and the China Postdoctoral Science Foundation (Grant Numbers 2019T120690 and 2015M582280).

Citation

Wang, C., Hu, Z., Chiong, R., Bao, Y. and Wu, J. (2020), "Identification of phishing websites through hyperlink analysis and rule extraction", The Electronic Library, Vol. 38 No. 5/6, pp. 1073-1093. https://doi.org/10.1108/EL-01-2020-0016

Publisher

Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
