Integrating information by Kullback–Leibler constraint for text classification

  • Original Article
  • Neural Computing and Applications

Abstract

Text classification is an important task underpinning various text-related downstream applications, such as fake news detection, sentiment analysis, and question answering. In recent years, graph-based methods have achieved excellent results on text classification tasks. Instead of treating a text as a sequence, these methods treat it as a set of co-occurring words and accomplish classification by aggregating information from neighboring nodes with a graph neural network. However, existing corpus-level graph models struggle to incorporate local semantic information and to classify newly arriving texts. To address these issues, we propose a Global–Local Text Classification (GLTC) model based on a Kullback–Leibler (KL) constraint that realizes inductive learning for text classification. First, a global structural feature extractor and a local semantic feature extractor are designed to comprehensively capture the structural and semantic information of a text. Then, the KL divergence is introduced as a regularization term in the loss so that the global structural feature extractor constrains the learning of the local semantic feature extractor, thereby achieving inductive learning. Comprehensive experiments on benchmark datasets show that GLTC outperforms baseline methods in terms of accuracy.


Data Availability

The datasets generated and/or analyzed during the current study are available in the Experiments section.

Notes

  1. https://nlp.stanford.edu/sentiment/code.html.

  2. http://www.cs.cornell.edu/people/pabo/movie-review-data/.

  3. http://disi.unitn.it/moschitti/corpora.htm.

  4. https://www.cs.umb.edu/~smimarog/textmining/datasets/.

  5. http://nlp.stanford.edu/data/glove.6B.zip.


Acknowledgements

This research was supported by the Key Program for International Science and Technology Cooperation Projects of China (No. 2022YFE0112300), the National Natural Science Foundation of China (Nos. 61976181, 62271411, 62073263), Natural Science Basic Research Plan in Shaanxi Province of China (No. 2022JM-325), the Technological Innovation Team of Shaanxi Province (No. 2020TD-013), the Natural Science Foundation of Hebei Province (Grant No. U22B2036), the Fundamental Research Funds for the Central Universities (D5000230112), the Tencent Foundation and XPLORER PRIZE.

Author information

Corresponding author

Correspondence to Chao Gao.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Yin, S., Zhu, P., Wu, X. et al. Integrating information by Kullback–Leibler constraint for text classification. Neural Comput & Applic 35, 17521–17535 (2023). https://doi.org/10.1007/s00521-023-08602-0

