Abstract
Chinese short text classification approaches based on lexicon information and pretrained language models have yielded state-of-the-art results. However, they simply use the pretrained language model as an embedding layer and fuse lexicon features into it, without fully exploiting the advantages of either. In this paper, we propose a new model, the bidirectional lattice graph attention network (BiLGAT). It enhances character representations by aggregating features from different hidden states of BERT. A graph attention network then fuses the lexicon features in the lattice graph into the character features, which at the same time addresses the problem of word segmentation error propagation. Experimental results on three Chinese short text classification datasets demonstrate the superior performance of this method: it achieves 94.75% accuracy on THUCNEWS, 70.71% on TNEWS, and 86.49% on CNT.
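The two ideas named in the abstract, aggregating several BERT hidden states per character and fusing lexicon nodes into character nodes with graph attention, can be illustrated with a toy sketch. This is not the authors' implementation: the softmax-weighted layer sum, the omission of the usual linear projection, the single attention head, and all names (`aggregate_layers`, `graph_attention`, `layer_logits`, `attn_vec`) are assumptions made for illustration, and the bidirectional aspect of BiLGAT is not modelled.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_layers(layer_states, layer_logits):
    """Softmax-weighted sum of per-layer character vectors (assumed scheme).

    layer_states[l][t] is the vector for character t at layer l;
    layer_logits holds one hypothetical learnable scalar per layer.
    """
    weights = softmax(layer_logits)
    n_chars, dim = len(layer_states[0]), len(layer_states[0][0])
    out = [[0.0] * dim for _ in range(n_chars)]
    for w, layer in zip(weights, layer_states):
        for t in range(n_chars):
            for d in range(dim):
                out[t][d] += w * layer[t][d]
    return out

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def graph_attention(node_feats, neighbors, attn_vec):
    """Single-head GAT-style update, without the linear projection.

    neighbors[i] lists the nodes that node i attends to (itself included);
    attn_vec has length 2 * dim and scores the concatenation [h_i || h_j].
    """
    updated = []
    for i, h_i in enumerate(node_feats):
        scores = []
        for j in neighbors[i]:
            pair = h_i + node_feats[j]  # list concatenation
            scores.append(leaky_relu(sum(a * x for a, x in zip(attn_vec, pair))))
        alphas = softmax(scores)
        new_h = [0.0] * len(h_i)
        for alpha, j in zip(alphas, neighbors[i]):
            for d in range(len(h_i)):
                new_h[d] += alpha * node_feats[j][d]
        updated.append(new_h)
    return updated

# Toy input: two characters, two "BERT layers", one lexicon word covering both.
layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer A: character 0, character 1
    [[3.0, 0.0], [0.0, 3.0]],  # layer B
]
chars = aggregate_layers(layers, layer_logits=[0.0, 0.0])  # equal layer weights

word = [1.0, 1.0]                  # hypothetical lexicon embedding
nodes = chars + [word]             # node 2 is the word node
neighbors = [[0, 2], [1, 2], [2]]  # each character attends to itself and the word
fused = graph_attention(nodes, neighbors, attn_vec=[0.0] * 4)  # uniform attention
```

With zero attention parameters every neighbour receives equal weight, so each character vector becomes the average of itself and the covering word. In a trained model the weights would differ per neighbour, which lets a character draw on every lexicon word that covers it rather than committing to a single segmentation.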
Acknowledgements
This research was partially funded by the National Natural Science Foundation of China (NSFC), Nos. 61832014 and 61373165. The authors thank the anonymous reviewers for their valuable comments and suggestions.
Additional information
Penghao Lyu and Qing Cong contributed equally to this work.
About this article
Cite this article
Lyu, P., Rao, G., Zhang, L. et al. BiLGAT: Bidirectional lattice graph attention network for Chinese short text classification. Appl Intell 53, 22405–22414 (2023). https://doi.org/10.1007/s10489-023-04700-7
DOI: https://doi.org/10.1007/s10489-023-04700-7