Abstract
Text classification underpins many downstream tasks, such as fake news detection, sentiment analysis, and question answering. In recent years, graph-based methods have achieved excellent results on text classification. Instead of treating a text as a sequence, these methods represent it as a co-occurrence graph of words and classify it by aggregating information from neighboring nodes with a graph neural network. However, existing corpus-level graph models struggle to incorporate local semantic information and to classify newly arriving texts. To address these issues, we propose a Global–Local Text Classification (GLTC) model, based on KL constraints, that realizes inductive learning for text classification. First, a global structural feature extractor and a local semantic feature extractor are designed to comprehensively capture the structural and semantic information of a text. Then, KL divergence is introduced as a regularization term in the loss, so that the global structural feature extractor constrains the learning of the local semantic feature extractor, enabling inductive inference on unseen texts. Comprehensive experiments on benchmark datasets show that GLTC outperforms baseline methods in terms of accuracy.
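The abstract's key mechanism can be sketched concretely: the local (inductive) branch is trained with cross-entropy, plus a KL-divergence term that pulls its predictive distribution toward the global structural branch. The paper's exact loss and weighting are not reproduced here; the function names, the weight `lam`, and the branch inputs below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q), summed over classes and averaged over the batch.
    return float(np.mean(np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)))

def gltc_style_loss(global_logits, local_logits, labels, lam=0.5):
    """Hypothetical GLTC-style objective: cross-entropy on the local
    semantic branch, regularized by KL divergence from the global
    structural branch's distribution (weight `lam` is an assumption)."""
    p_global = softmax(global_logits)
    p_local = softmax(local_logits)
    ce = -np.mean(np.log(p_local[np.arange(len(labels)), labels] + 1e-12))
    return ce + lam * kl_divergence(p_global, p_local)
```

At test time only the local branch is needed, which is what makes the scheme inductive: the corpus-level graph constrains training but is not required for new documents.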
Data Availability
The datasets generated during and/or analyzed during the current study are available in the Experiments section.
Acknowledgements
This research was supported by the Key Program for International Science and Technology Cooperation Projects of China (No. 2022YFE0112300), the National Natural Science Foundation of China (Nos. 61976181, 62271411, 62073263), Natural Science Basic Research Plan in Shaanxi Province of China (No. 2022JM-325), the Technological Innovation Team of Shaanxi Province (No. 2020TD-013), the Natural Science Foundation of Hebei Province (Grant No. U22B2036), the Fundamental Research Funds for the Central Universities (D5000230112), the Tencent Foundation and XPLORER PRIZE.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yin, S., Zhu, P., Wu, X. et al. Integrating information by Kullback–Leibler constraint for text classification. Neural Comput & Applic 35, 17521–17535 (2023). https://doi.org/10.1007/s00521-023-08602-0