Improving Convolutional Neural Network-Based Webshell Detection Through Reinforcement Learning

Wu, Yalun; Song, Minglu; Li, Yike; Tian, Yunzhe; Tong, Endong; Niu, Wenjia; Jia, Bowei; Huang, Haixiang; Li, Qiong; Liu, Jiqiang

doi:10.1007/978-3-030-86890-1_21

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12918)

Included in the following conference series:

International Conference on Information and Communications Security

Abstract

Webshell detection is essential to network security. Conventional methods are based on keyword matching, which relies heavily on the experience of domain experts when facing emerging malicious webshells of various kinds. Recently, machine learning, especially supervised learning, has been introduced for webshell detection and has proved highly successful. Among state-of-the-art approaches, neural networks (NNs) are designed to take a large number of features as input and enable deep learning. How to properly combine the advantages of automatic feature selection with those of expert knowledge-based methods has therefore become a key issue. Considering that the special features that indicate unexpected webshell behaviors in a target business system are usually simple but effective, we propose a novel approach that improves convolutional neural network (CNN)-based webshell detection through reinforcement learning. We use the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm for automatic feature selection, aiming to maximize the expected accuracy of the CNN classifier on a validation dataset by sequentially interacting with the feature space. Moreover, considering the sparseness of feature values, we build the CNN classifier with two convolutional layers and a global pooling layer. Extensive experiments and analysis demonstrate the effectiveness of the proposed method.
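
The feature-selection loop described in the abstract can be read as a sequential decision problem: the state is the current set of selected features, an action changes that set, and the reward reflects validation accuracy. The following is a minimal sketch of that framing only, not the authors' implementation. Everything in it is an assumption made for illustration: the names `FeatureSelectionEnv`, `nearest_centroid_accuracy`, and `make_toy_data` are hypothetical, a nearest-centroid classifier stands in for the CNN, a greedy one-step policy stands in for the A3C agent, and the data are synthetic.

```python
import random


def nearest_centroid_accuracy(train, val, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked features.

    Assumption: this simple classifier is a stand-in for the paper's CNN.
    """
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # no features selected, so nothing to classify with
    # Per-class mean over the selected features.
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(idx))
        for j, i in enumerate(idx):
            acc[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((x[i] - c) ** 2 for i, c in zip(idx, centroids[y])),
        )

    return sum(predict(x) == y for x, y in val) / len(val)


class FeatureSelectionEnv:
    """State: binary feature mask. Action: index of a feature to toggle.
    Reward: change in validation accuracy caused by the toggle."""

    def __init__(self, train, val, n_features):
        self.train, self.val, self.n = train, val, n_features
        self.reset()

    def reset(self):
        self.mask = [0] * self.n
        self.score = nearest_centroid_accuracy(self.train, self.val, self.mask)
        return tuple(self.mask)

    def step(self, action):
        self.mask[action] ^= 1
        new_score = nearest_centroid_accuracy(self.train, self.val, self.mask)
        reward = new_score - self.score
        self.score = new_score
        return tuple(self.mask), reward


def make_toy_data(n, rng):
    """Synthetic data: feature 0 carries the label, features 1..3 are noise."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [y + rng.gauss(0, 0.1)] + [rng.gauss(0, 1.0) for _ in range(3)]
        data.append((x, y))
    return data


rng = random.Random(0)
train, val = make_toy_data(200, rng), make_toy_data(100, rng)
env = FeatureSelectionEnv(train, val, n_features=4)
env.reset()

# Greedy one-step policy, used here as a placeholder for a learned A3C actor.
for _ in range(env.n):
    best_action, best_reward = None, 0.0
    for a in range(env.n):
        _, r = env.step(a)  # try the toggle
        env.step(a)         # undo it
        if r > best_reward:
            best_action, best_reward = a, r
    if best_action is None:
        break  # no toggle improves validation accuracy
    env.step(best_action)

selected = [i for i, keep in enumerate(env.mask) if keep]
final_accuracy = env.score
```

In this sketch the reward is the accuracy difference between consecutive masks, so an agent is paid only for toggles that actually help on the validation set. A learned actor-critic policy would replace the exhaustive try-and-undo loop, which is affordable here only because the toy feature space has four dimensions.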

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61972025, 61802389, 61672092, U1811264, and 61966009, and by the National Key R&D Program of China under Grant Nos. 2020YFB1005604 and 2020YFB2103802.

Author information

Authors and Affiliations

Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, Beijing, 100044, China
Yalun Wu, Minglu Song, Yike Li, Yunzhe Tian, Endong Tong, Wenjia Niu, Bowei Jia, Haixiang Huang, Qiong Li & Jiqiang Liu

Corresponding authors

Correspondence to Endong Tong or Wenjia Niu.

Editor information

Editors and Affiliations

Singapore Management University, Singapore, Singapore
Debin Gao
Tsinghua University, Beijing, China
Qi Li
Xi'an Jiaotong University, Xi'an, China
Xiaohong Guan
Chongqing University, Chongqing, China
Xiaofeng Liao

About this paper

Cite this paper

Wu, Y. et al. (2021). Improving Convolutional Neural Network-Based Webshell Detection Through Reinforcement Learning. In: Gao, D., Li, Q., Guan, X., Liao, X. (eds) Information and Communications Security. ICICS 2021. Lecture Notes in Computer Science, vol 12918. Springer, Cham. https://doi.org/10.1007/978-3-030-86890-1_21

DOI: https://doi.org/10.1007/978-3-030-86890-1_21
Published: 17 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86889-5
Online ISBN: 978-3-030-86890-1
eBook Packages: Computer Science, Computer Science (R0)
