Abstract
Cyber threat intelligence (CTI) refers to essential knowledge used by organizations to prevent or mitigate cyber attacks. Vulnerability databases such as CVE and NVD are crucial to cyber threat intelligence, and they also provide information leveraged in hundreds of security products worldwide. However, previous studies have shown that these vulnerability databases sometimes contain errors and inconsistencies, which have to be manually checked by security professionals. Such inconsistencies could threaten the integrity of security products and hamper attack mitigation efforts. Hence, to assist the security community with more accurate and time-saving validation of vulnerability data, we propose an automated vulnerability classification system based on deep learning. Our proposed system utilizes a self-attention deep neural network (SA-DNN) model and a text mining approach to identify the vulnerability category from the description text contained within a report. The performance of the SA-DNN-based vulnerability classification system is evaluated using 134,091 vulnerability reports from the CVE Details website. The experiments demonstrate the effectiveness of our approach and show that the SA-DNN model outperforms SVM and other deep learning methods, namely CNN-LSTM and graph convolutional neural networks.
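To make the idea of classifying a description with self-attention more concrete, the following is a minimal sketch, not the authors' SA-DNN. It assumes standard scaled dot-product self-attention over token embeddings, mean pooling, and a softmax output layer. The category names, the toy description, the embedding size, and all weights (random and untrained) are illustrative assumptions, so the printed probabilities carry no meaning.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Small random matrix standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # one row of scores per query token
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def classify(tokens, vocab, emb, w_q, w_k, w_v, w_out):
    """Embed tokens, apply self-attention, mean-pool, and score each category."""
    x = [emb[vocab[t]] for t in tokens if t in vocab]
    if not x:
        raise ValueError("no known tokens in description")
    context, weights = self_attention(x, w_q, w_k, w_v)
    pooled = [sum(col) / len(context) for col in zip(*context)]
    logits = matmul([pooled], w_out)[0]
    return softmax(logits), weights

# Hypothetical categories and description, for illustration only.
CATEGORIES = ["overflow", "sql_injection", "xss"]
description = "buffer overflow in the parser allows remote attackers to execute code"

rng = random.Random(0)
tokens = description.lower().split()
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
dim = 8  # assumed embedding size
emb = random_matrix(len(vocab), dim, rng)
w_q = random_matrix(dim, dim, rng)
w_k = random_matrix(dim, dim, rng)
w_v = random_matrix(dim, dim, rng)
w_out = random_matrix(dim, len(CATEGORIES), rng)

probs, weights = classify(tokens, vocab, emb, w_q, w_k, w_v, w_out)
print(dict(zip(CATEGORIES, probs)))
```

In a trained model the embeddings and projection matrices would be learned from labeled vulnerability descriptions, and the attention weights would indicate which words in a description most influence the predicted category.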
Cite this article
Vishnu, P.R., Vinod, P. & Yerima, S.Y. A Deep Learning Approach for Classifying Vulnerability Descriptions Using Self Attention Based Neural Network. J Netw Syst Manage 30, 9 (2022). https://doi.org/10.1007/s10922-021-09624-6
DOI: https://doi.org/10.1007/s10922-021-09624-6