A Character-Level BiGRU-Attention for Phishing Classification

Yuan, Lijuan; Zeng, Zhiyong; Lu, Yikang; Ou, Xiaofeng; Feng, Tao

doi:10.1007/978-3-030-41579-2_43

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11999)

Included in the following conference series:

International Conference on Information and Communications Security

Abstract

Online phishing typically tricks victims by presenting fake information that closely resembles legitimate content, allowing phishers to elevate their privileges. To protect users from fraudulent information and minimize the losses caused by visiting phishing websites, a variety of methods have been developed to filter out such sites. Several phishing detection methods are continually being updated, but their experimental results are still not satisfactory. To fill this gap, this paper introduces an improved model that combines a bidirectional gated recurrent unit with an attention mechanism, named the BiGRU-Attention model. The model uses the BiGRU to capture the characters before and after a given character, and then uses the attention mechanism to compute a score for that character. Because the final score depends on the composition of the input, the more similar a phishing website is to a legitimate one, the harder the two are to distinguish. With this model, most phishing URLs can be detected. An explanation of why phishing and legitimate websites can be distinguished is also given. In the experiments, the BiGRU-Attention model achieves an accuracy of 99.55% and an F1-score of 99.54%. The results also demonstrate the effectiveness of deep neural networks in anti-phishing applications and cybersecurity.

Keywords: Phishing Detection, BiGRU-Attention Model, Important Characters, The Difference Between Similar URLs.
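The mechanism described in the abstract (a BiGRU that reads the characters on both sides of each position, followed by attention that assigns each character a score) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the architecture details, so the embedding size, hidden size, the context-vector form of attention, the sigmoid output layer, and every name below (`GRUCell`, `bigru_attention`, the sample URL) are assumptions made for illustration. The weights are random and untrained, so the output probability carries no meaning; only the data flow is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """One GRU cell with untrained random weights (illustration only)."""

    def __init__(self, rng, in_dim, hid_dim):
        self.hid_dim = hid_dim
        # One input matrix W and one recurrent matrix U for each of:
        # update gate (z), reset gate (r), candidate state (h).
        self.W = {g: rand_matrix(rng, hid_dim, in_dim) for g in "zrh"}
        self.U = {g: rand_matrix(rng, hid_dim, hid_dim) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        # One common convention: z interpolates between old state and candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, xs):
        h = [0.0] * self.hid_dim
        states = []
        for x in xs:
            h = self.step(x, h)
            states.append(h)
        return states


def bigru_attention(url, emb_dim=8, hid_dim=6, seed=0):
    """Return (per-character attention weights, sigmoid output) for a URL."""
    if not url:
        raise ValueError("url must be non-empty")
    rng = random.Random(seed)
    # Hypothetical character embedding: one random vector per distinct character.
    table = {c: [rng.uniform(-0.5, 0.5) for _ in range(emb_dim)] for c in sorted(set(url))}
    xs = [table[c] for c in url]

    # Forward pass sees the characters before position t; the backward pass
    # sees the characters after it. Concatenating both gives each character
    # a representation of its two-sided context.
    fwd = GRUCell(rng, emb_dim, hid_dim).run(xs)
    bwd = GRUCell(rng, emb_dim, hid_dim).run(xs[::-1])[::-1]
    hs = [f + b for f, b in zip(fwd, bwd)]

    # Attention (assumed form): score_t = context . tanh(W_a h_t), then softmax.
    dim = 2 * hid_dim
    w_a = rand_matrix(rng, dim, dim)
    context = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    scores = [sum(c * math.tanh(v) for c, v in zip(context, matvec(w_a, h))) for h in hs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Weighted sum of character states, then a single sigmoid output unit.
    summary = [sum(a * h[i] for a, h in zip(alphas, hs)) for i in range(dim)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    prob = sigmoid(sum(w * s for w, s in zip(w_out, summary)))
    return alphas, prob


if __name__ == "__main__":
    sample = "http://paypa1-login.example.com"  # made-up URL for illustration
    weights, p = bigru_attention(sample)
    for ch, a in zip(sample, weights):
        print(f"{ch!r}: {a:.4f}")
    print(f"untrained output: {p:.4f}")
```

In a trained model, the attention weights returned here would be the quantity to inspect when asking which characters drive a decision; with random weights they only show that each character receives a positive score and that the scores sum to one.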

Author information

Authors and Affiliations

School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, 650221, Yunnan, China
Lijuan Yuan & Yikang Lu
School of Information, Yunnan University of Finance and Economics, Kunming, 650221, Yunnan, China
Zhiyong Zeng & Tao Feng
Shanghai Jiao Tong University Yunnan Research Institute, Dali, 671000, Yunnan, China
Xiaofeng Ou

Corresponding author

Correspondence to Tao Feng.

Editor information

Editors and Affiliations

Singapore University of Technology and Design, Singapore, Singapore
Jianying Zhou
The Hong Kong Polytechnic University, Kowloon, Hong Kong
Xiapu Luo
Peking University, Beijing, China
Qingni Shen
Institute of Information Engineering, Beijing, China
Zhen Xu

About this paper

Cite this paper

Yuan, L., Zeng, Z., Lu, Y., Ou, X., Feng, T. (2020). A Character-Level BiGRU-Attention for Phishing Classification. In: Zhou, J., Luo, X., Shen, Q., Xu, Z. (eds) Information and Communications Security. ICICS 2019. Lecture Notes in Computer Science, vol 11999. Springer, Cham. https://doi.org/10.1007/978-3-030-41579-2_43

DOI: https://doi.org/10.1007/978-3-030-41579-2_43
Published: 18 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41578-5
Online ISBN: 978-3-030-41579-2
eBook Packages: Computer Science, Computer Science (R0)
