Multi-class Text Classification Model Based on Weighted Word Vector and BiLSTM-Attention Optimization

Wu, Hao; He, Zhuangzhuang; Zhang, Weitao; Hu, Yunsheng; Wu, Yunzhi; Yue, Yi

doi:10.1007/978-3-030-84522-3_32

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12836)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Traditional multi-category text classification algorithms generally produce high-dimensional text vector representations, ignore the importance of individual words to the overall text, and extract semantic feature information weakly. To address these problems, a multi-category text classification model based on weighted Word2vec, BiLSTM and an Attention mechanism (Weight-Text-Classification-Model, WTCM) is proposed. First, the text is vectorized by the Word2vec model; then the weight of each word is calculated by the TF-IDF algorithm and multiplied with the word vector to construct a weighted text vector representation; next, semantic feature information is extracted using the context-dependent capability of BiLSTM; after the BiLSTM layer, an Attention mechanism layer is incorporated to assign weights to the sequence information output at each time step; finally, the result is input to a softmax classifier for multi-category text classification. The experimental results show that the classification accuracy, recall and F-value of the WTCM model reach 91.26%, 90.98% and 91.12%, respectively, indicating that the model can effectively handle the multi-category text classification problem.
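The weighting and attention steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant or attention scoring function WTCM uses, so the unsmoothed `tf * log(N / df)` weight, the dot-product attention, the toy corpus, the 2-dimensional embeddings and the `query` vector are all assumptions made for illustration. The BiLSTM is omitted; `states` stands in for its per-time-step outputs.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per tokenized document.

    Assumption: tf = count / doc length, idf = log(N / df), no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {w: (tf[w] / len(doc)) * math.log(n_docs / df[w]) for w in tf}
        )
    return weights

def weighted_sequence(doc, doc_weights, embeddings):
    """Scale each word vector by that word's TF-IDF weight in this document."""
    return [[doc_weights[w] * x for x in embeddings[w]] for w in doc]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, query):
    """Dot-product attention: score each time step, return weights and weighted sum."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in states]
    alphas = softmax(scores)
    dim = len(states[0])
    context = [sum(a * h[d] for a, h in zip(alphas, states)) for d in range(dim)]
    return alphas, context

# Toy corpus and hypothetical 2-d "Word2vec" vectors (illustrative values only).
docs = [
    ["stock", "market", "rises"],
    ["team", "wins", "match"],
    ["market", "match", "report"],
]
embeddings = {
    "stock": [1.0, 0.0], "market": [0.5, 0.5], "rises": [0.0, 1.0],
    "team": [0.2, 0.8], "wins": [0.9, 0.1], "match": [0.4, 0.6],
    "report": [0.3, 0.3],
}

weights = tfidf_weights(docs)
seq = weighted_sequence(docs[0], weights[0], embeddings)
# In the full model, `seq` would pass through a BiLSTM before this step.
alphas, context = attention_pool(seq, query=[1.0, 0.0])
```

With the unsmoothed IDF assumed here, a word that occurs in every document receives weight 0 and its vector is zeroed out; a smoothed variant avoids this if it is undesirable. In the sketch, "stock" (one document) is weighted more heavily than "market" (two documents), which is the effect the weighted representation is meant to capture.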

Acknowledgement

This work was supported in part by Major Research and Development Projects of Anhui Province Key R&D Program Project (201904e01020015); Qinghai Province Key R&D and Transformation Program (2020-QY-213); Qinghai Province Basic Research Program Project (2020-ZJ-913); National Key R&D Program Subproject (2017YFD0301303).

Author information

Authors and Affiliations

Anhui Provincial Engineering Laboratory of Beidou Precision Agriculture, Anhui Agricultural University, Hefei, China
Hao Wu, Zhuangzhuang He, Weitao Zhang, Yunsheng Hu, Yunzhi Wu & Yi Yue
School of Information and Computer, Anhui Agricultural University, Hefei, 230036, Anhui, China
Hao Wu, Zhuangzhuang He, Weitao Zhang, Yunsheng Hu, Yunzhi Wu & Yi Yue

Corresponding author

Correspondence to Yunzhi Wu.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Shenzhen University, Shenzhen, China
Jianqiang Li
Far Eastern Branch of the Russian Academy of Sciences, Vladivostok, Russia
Valeriya Gribova
Polytechnic University of Bari, Bari, Italy
Vitoantonio Bevilacqua

About this paper

Cite this paper

Wu, H., He, Z., Zhang, W., Hu, Y., Wu, Y., Yue, Y. (2021). Multi-class Text Classification Model Based on Weighted Word Vector and BiLSTM-Attention Optimization. In: Huang, DS., Jo, KH., Li, J., Gribova, V., Bevilacqua, V. (eds) Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12836. Springer, Cham. https://doi.org/10.1007/978-3-030-84522-3_32

DOI: https://doi.org/10.1007/978-3-030-84522-3_32
Published: 09 August 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84521-6
Online ISBN: 978-3-030-84522-3
eBook Packages: Computer Science, Computer Science (R0)
