ABSTRACT
Although the Word2Vec model alleviates the feature sparsity and high dimensionality of text representations, it cannot handle polysemous words in Chinese vocabulary, since it assigns each word a single static vector. This paper therefore proposes EBGM, a text classification model that combines ERNIE, a bidirectional gated recurrent unit (BiGRU), and max-pooling. First, the ERNIE pre-trained model produces context-sensitive semantic representations of Chinese text; next, a BiGRU extracts contextual semantic information to strengthen the links between contexts; finally, max-pooling selects the most salient features of the text. Results on the experimental dataset show that the model performs well on the Chinese news headline classification task, demonstrating its feasibility.
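The ERNIE → BiGRU → max-pooling pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the ERNIE encoder is stubbed with a plain embedding layer (a real system would load the pre-trained model, e.g. via PaddleNLP or Hugging Face), and all dimensions (vocabulary size 21128, hidden size 768, GRU hidden size 256, 10 classes) are assumptions chosen only for the example.

```python
# Hedged sketch of the EBGM pipeline (assumed dimensions, stubbed encoder).
import torch
import torch.nn as nn

class EBGM(nn.Module):
    def __init__(self, vocab_size=21128, embed_dim=768,
                 gru_hidden=256, num_classes=10):
        super().__init__()
        # Stand-in for the ERNIE pre-trained encoder; a real
        # implementation would load pretrained weights instead.
        self.encoder = nn.Embedding(vocab_size, embed_dim)
        # BiGRU extracts contextual semantics in both directions.
        self.bigru = nn.GRU(embed_dim, gru_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * gru_hidden, num_classes)

    def forward(self, token_ids):            # (batch, seq_len)
        x = self.encoder(token_ids)          # (batch, seq_len, embed_dim)
        h, _ = self.bigru(x)                 # (batch, seq_len, 2*gru_hidden)
        # Max-pooling over the time axis keeps the most salient features.
        pooled, _ = h.max(dim=1)             # (batch, 2*gru_hidden)
        return self.classifier(pooled)       # (batch, num_classes)

model = EBGM()
logits = model(torch.randint(0, 21128, (2, 32)))
print(tuple(logits.shape))  # -> (2, 10)
```

Max-pooling over the sequence axis, rather than taking only the final hidden state, lets the classifier pick up the strongest feature activation at any position in the headline.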