
Multi-class Text Classification Model Based on Weighted Word Vector and BiLSTM-Attention Optimization

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12836)


Abstract

Traditional multi-class text classification algorithms generally suffer from high-dimensional text vector representations, ignore the importance of individual words to the text as a whole, and extract semantic feature information only weakly. To address these problems, a multi-class text classification model based on weighted Word2vec, BiLSTM, and an Attention mechanism (Weight-Text-Classification-Model, WTCM) is proposed. First, the text is vectorized by the Word2vec model; the weight of each word is then calculated by the TF-IDF algorithm and multiplied with the corresponding word vector to construct a weighted text vector representation. Next, semantic feature information is extracted by the context-capturing capability of BiLSTM, and an Attention mechanism layer incorporated after the BiLSTM layer assigns weights to the sequence information output at each time step. Finally, the result is fed into a softmax classifier for multi-class text classification. The experimental results show that the classification accuracy, recall, and F-value of the WTCM model reach 91.26%, 90.98%, and 91.12%, respectively, demonstrating that it can effectively solve the multi-class text classification problem.
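As a rough illustration of the pipeline described above, the sketch below combines TF-IDF-weighted word vectors with a BiLSTM-Attention-softmax classifier. It is a minimal sketch only: the toy corpus, the randomly initialized stand-in for trained Word2vec embeddings, the layer dimensions, and the single-linear-layer attention scoring are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the WTCM pipeline (weighted vectors -> BiLSTM -> Attention -> softmax).
# Assumptions: toy corpus, random stand-in embeddings, illustrative dimensions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import TfidfVectorizer

# --- Step 1: TF-IDF weighted word vectors --------------------------------
corpus = ["cheap flights booked online", "new phone released today"]   # toy documents
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)            # document x vocabulary weight matrix
vocab = vectorizer.vocabulary_                      # word -> column index

embed_dim, max_len = 50, 10
rng = np.random.default_rng(0)
word2vec = {w: rng.normal(size=embed_dim) for w in vocab}   # stand-in for trained Word2vec vectors

def weighted_sequence(doc_idx, doc):
    """Multiply each word's vector by its TF-IDF weight and pad the sequence to max_len."""
    seq = np.zeros((max_len, embed_dim), dtype=np.float32)
    for i, w in enumerate(doc.split()[:max_len]):
        if w in vocab:
            seq[i] = word2vec[w] * tfidf[doc_idx, vocab[w]]
    return seq

X = torch.tensor(np.stack([weighted_sequence(i, d) for i, d in enumerate(corpus)]))

# --- Step 2: BiLSTM + Attention + softmax classifier ----------------------
class WTCM(nn.Module):
    def __init__(self, embed_dim, hidden, n_classes):
        super().__init__()
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)        # scores the BiLSTM output at each time step
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        h, _ = self.bilstm(x)                        # (batch, time, 2*hidden)
        alpha = torch.softmax(self.attn(h), dim=1)   # attention weights over time steps
        context = (alpha * h).sum(dim=1)             # weighted sum of BiLSTM outputs
        return self.fc(context)                      # class logits

model = WTCM(embed_dim, hidden=64, n_classes=3)
probs = torch.softmax(model(X), dim=-1)              # class probabilities per document
print(probs.shape)                                   # torch.Size([2, 3])
```

In the paper's setting, the random stand-in embeddings would be replaced by vectors actually trained with Word2vec on the corpus, and the logits would be trained with a cross-entropy loss, which applies the softmax internally.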



Acknowledgement

This work was supported in part by a Major Research and Development Project of the Anhui Province Key R&D Program (201904e01020015), the Qinghai Province Key R&D and Transformation Program (2020-QY-213), the Qinghai Province Basic Research Program (2020-ZJ-913), and a subproject of the National Key R&D Program (2017YFD0301303).

Author information


Corresponding author

Correspondence to Yunzhi Wu.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, H., He, Z., Zhang, W., Hu, Y., Wu, Y., Yue, Y. (2021). Multi-class Text Classification Model Based on Weighted Word Vector and BiLSTM-Attention Optimization. In: Huang, DS., Jo, KH., Li, J., Gribova, V., Bevilacqua, V. (eds) Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12836. Springer, Cham. https://doi.org/10.1007/978-3-030-84522-3_32


  • DOI: https://doi.org/10.1007/978-3-030-84522-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84521-6

  • Online ISBN: 978-3-030-84522-3

  • eBook Packages: Computer Science, Computer Science (R0)
