Abstract
Traditional multi-category text classification algorithms generally suffer from high-dimensional text vector representations, ignore the importance of individual words to the overall text, and extract semantic feature information only weakly. To address these problems, a multi-category text classification model based on weighted Word2vec, BiLSTM and an Attention mechanism (Weight-Text-Classification-Model, WTCM) is proposed. First, the text is vectorized by the Word2vec model; then the weight of each word is calculated by the TF-IDF algorithm and multiplied with its word vector to construct a weighted text vector representation; next, semantic feature information is extracted using the context-dependent capability of BiLSTM; after the BiLSTM layer, an Attention mechanism layer is incorporated to assign weights to the sequence information output at each time step; finally, the result is input to a softmax classifier for multi-category text classification. The experimental results show that the classification accuracy, recall and F-value of the WTCM model reach 91.26%, 90.98% and 91.12%, respectively, indicating that the model can effectively solve the multi-category text classification problem.
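The two weighting steps in the pipeline above, TF-IDF scaling of word vectors and attention over per-step outputs, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The smoothed IDF formula, the additive attention form, the function names, and the random vectors standing in for Word2vec embeddings are all assumptions made for illustration. The BiLSTM layer is omitted, so attention is applied directly to the weighted vectors here.

```python
import math
from collections import Counter

import numpy as np


def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per tokenized document.

    Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1; the paper's
    exact TF-IDF variant is not stated in the abstract.
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            w: (c / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1)
            for w, c in counts.items()
        })
    return weights


def weighted_vectors(doc, weights, embeddings):
    """Scale each word's vector by its TF-IDF weight; shape (len(doc), dim)."""
    return np.stack([weights[w] * embeddings[w] for w in doc])


def attention_pool(hidden, w, b, u):
    """Additive attention over time steps (assumed form).

    hidden: (T, d) sequence outputs; w: (d, a); b: (a,); u: (a,).
    Returns the weighted sum of hidden states and the attention weights.
    """
    scores = np.tanh(hidden @ w + b) @ u          # one score per time step
    scores = scores - scores.max()                # numerical stability
    alphas = np.exp(scores) / np.exp(scores).sum()  # softmax over time steps
    return alphas @ hidden, alphas


# Toy corpus; random vectors stand in for trained Word2vec embeddings.
rng = np.random.default_rng(0)
docs = [["stock", "market", "falls"],
        ["team", "wins", "match"],
        ["market", "rally"]]
vocab = sorted({w for d in docs for w in d})
embeddings = {w: rng.normal(size=4) for w in vocab}

weights = tfidf_weights(docs)
x = weighted_vectors(docs[0], weights[0], embeddings)

# In the full model, x would first pass through a BiLSTM; skipped here.
context, alphas = attention_pool(
    x, rng.normal(size=(4, 3)), np.zeros(3), rng.normal(size=3)
)
```

In this toy corpus "market" occurs in two documents and "stock" in one, so "stock" receives the larger weight and its vector is scaled up more. The pooled `context` vector is what a softmax classifier would consume in a complete model.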
Acknowledgement
This work was supported in part by Major Research and Development Projects of Anhui Province Key R&D Program Project (201904e01020015); Qinghai Province Key R&D and Transformation Program (2020-QY-213); Qinghai Province Basic Research Program Project (2020-ZJ-913); National Key R&D Program Subproject (2017YFD0301303).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, H., He, Z., Zhang, W., Hu, Y., Wu, Y., Yue, Y. (2021). Multi-class Text Classification Model Based on Weighted Word Vector and BiLSTM-Attention Optimization. In: Huang, DS., Jo, KH., Li, J., Gribova, V., Bevilacqua, V. (eds) Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12836. Springer, Cham. https://doi.org/10.1007/978-3-030-84522-3_32
DOI: https://doi.org/10.1007/978-3-030-84522-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84521-6
Online ISBN: 978-3-030-84522-3
eBook Packages: Computer Science, Computer Science (R0)