Integrating information by Kullback–Leibler constraint for text classification

  • Original Article
  • Neural Computing and Applications

Abstract

Text classification is an important task underpinning various text-related downstream applications, such as fake news detection, sentiment analysis, and question answering. In recent years, graph-based methods have achieved excellent results on text classification tasks. Instead of treating a text as a sequence, these methods treat it as a set of co-occurring words and accomplish classification by aggregating information from neighboring nodes with a graph neural network. However, existing corpus-level graph models struggle to incorporate local semantic information and to classify newly arriving texts. To address these issues, we propose a Global–Local Text Classification (GLTC) model based on a Kullback–Leibler (KL) constraint that realizes inductive learning for text classification. First, a global structural feature extractor and a local semantic feature extractor are designed to comprehensively capture the structural and semantic information of a text. Then, the KL divergence is introduced as a regularization term in the loss so that the global structural feature extractor constrains the learning of the local semantic feature extractor, thereby achieving inductive learning. Comprehensive experiments on benchmark datasets show that GLTC outperforms baseline methods in terms of accuracy.


Data Availability

The datasets generated and/or analyzed during the current study are available in the Experiments section.

Notes

  1. https://nlp.stanford.edu/sentiment/code.html.

  2. http://www.cs.cornell.edu/people/pabo/movie-review-data/.

  3. http://disi.unitn.it/moschitti/corpora.htm.

  4. https://www.cs.umb.edu/~smimarog/textmining/datasets/.

  5. http://nlp.stanford.edu/data/glove.6B.zip.


Acknowledgements

This research was supported by the Key Program for International Science and Technology Cooperation Projects of China (No. 2022YFE0112300), the National Natural Science Foundation of China (Nos. 61976181, 62271411, 62073263), Natural Science Basic Research Plan in Shaanxi Province of China (No. 2022JM-325), the Technological Innovation Team of Shaanxi Province (No. 2020TD-013), the Natural Science Foundation of Hebei Province (Grant No. U22B2036), the Fundamental Research Funds for the Central Universities (D5000230112), the Tencent Foundation and XPLORER PRIZE.

Author information

Corresponding author

Correspondence to Chao Gao.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Yin, S., Zhu, P., Wu, X. et al. Integrating information by Kullback–Leibler constraint for text classification. Neural Comput & Applic 35, 17521–17535 (2023). https://doi.org/10.1007/s00521-023-08602-0

