
Text classification on heterogeneous information network via enhanced GCN and knowledge

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Graph convolutional network (GCN)-based text classification methods have shown impressive success in improving classification results by considering the structural relationships between words and texts. However, existing GCN-based methods tend to ignore the semantic representation of nodes and the global structural information among them. In addition, they represent a text using only the word-granularity information within the text itself, i.e., the endogenous source. Moreover, existing graph convolutional network approaches face major challenges on large and dense graphs, namely neighbor explosion and noisy inputs. To address these shortcomings, this paper proposes an inductive learning-based text classification method that combines representation learning on heterogeneous information networks with exogenous knowledge. First, a weighted heterogeneous information network for text (HINT) is constructed by introducing exogenous knowledge, with node types covering texts, entities, and words. Each unstructured text is thus represented as a structured heterogeneous information network, which broadens the granularity of text features and exploits exogenous structural information and explicit semantic information to improve the interpretability of the text representation. Second, we enhance the graph neural network against the neighbor explosion and noisy inputs that arise from HINT using two strategies, graph sampling and DropEdge, to enable semi-supervised learning with improved classification performance. The effectiveness of our model is demonstrated on four publicly available text classification datasets, on which our approach achieves state-of-the-art performance.
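To make the HINT construction above concrete, the following is a minimal sketch of a weighted heterogeneous text graph with the three node types named in the abstract (text, entity, word). Everything here, including the class name, the count-based edge weights, and the assumption that entities come from an external entity linker over a knowledge base, is an illustrative assumption rather than the paper's actual construction:

```python
from collections import defaultdict

class HeteroTextGraph:
    """Toy weighted heterogeneous graph with typed nodes (text/word/entity)."""

    def __init__(self):
        self.nodes = {}                  # node id -> node type
        self.edges = defaultdict(float)  # (src, dst) -> accumulated weight

    def add_node(self, node_id: str, node_type: str):
        assert node_type in {"text", "word", "entity"}
        self.nodes[node_id] = node_type

    def add_edge(self, src: str, dst: str, weight: float = 1.0):
        # Undirected weighted edge; repeated additions accumulate weight.
        self.edges[(src, dst)] += weight
        self.edges[(dst, src)] += weight

def build_graph(doc_id, words, entities):
    """Link a text node to its words (endogenous) and to entities produced
    by a hypothetical entity linker against exogenous knowledge."""
    g = HeteroTextGraph()
    g.add_node(doc_id, "text")
    for w in words:
        g.add_node(w, "word")
        g.add_edge(doc_id, w)   # text-word edge, count-weighted
    for e in entities:
        g.add_node(e, "entity")
        g.add_edge(doc_id, e)   # text-entity edge from exogenous knowledge
    return g
```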
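Likewise, the two robustness strategies mentioned above can be sketched in a few lines. The snippet below shows DropEdge-style random edge removal (Rong et al., 2019) and a uniform node sampler as a simplified stand-in for the graph samplers of GraphSAINT (Zeng et al., 2019); the tensor layout, function names, and the 0.2 drop rate are assumptions, not the authors' implementation:

```python
import torch

def drop_edge(edge_index: torch.Tensor, drop_rate: float = 0.2) -> torch.Tensor:
    """DropEdge: independently keep each edge with probability 1 - drop_rate.

    edge_index is a [2, num_edges] COO tensor; a thinned copy is returned and
    re-sampled each epoch, curbing over-smoothing and noisy-edge influence.
    """
    keep = torch.rand(edge_index.size(1), device=edge_index.device) >= drop_rate
    return edge_index[:, keep]

def sample_node_subgraph(edge_index: torch.Tensor, num_nodes: int,
                         sample_size: int):
    """Uniform node sampling: pick a random node subset and return the
    induced subgraph, so each step trains on a small graph instead of the
    full one (mitigating neighbor explosion)."""
    nodes = torch.randperm(num_nodes)[:sample_size]
    in_sample = torch.zeros(num_nodes, dtype=torch.bool)
    in_sample[nodes] = True
    # Keep only edges whose endpoints both fall in the sampled node set.
    mask = in_sample[edge_index[0]] & in_sample[edge_index[1]]
    return nodes, edge_index[:, mask]
```

In training, one would typically draw a fresh subgraph per mini-batch and re-apply drop_edge before each forward pass, so the model never sees the full, noisy graph at once.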


Data Availability

The R8 and R52 data that support the findings of this study are available from the Reuters-21578 corpus, http://www.daviddlewis.com/. The Ohsumed data are available from the Text Categorization corpora, http://disi.unitn.it/moschitti/corpora.htm. The TREC data are available from the TREC corpus, https://trec.nist.gov/data.html. The 20NG data are available from 20Newsgroups, http://qwone.com/~jason/20Newsgroups/. The MR data are available from the movie review corpus, https://www.cs.cornell.edu/people/pabo/movie-review-data/.



Funding

This research was funded in part by the Innovation Program of the Chinese Academy of Agricultural Sciences (Grant No. CAAS-ASTIP-2021-AII-06); the Central Public-interest Scientific Institution Basal Research Fund (key laboratory open subject of the Agricultural Information Research Institute, Grant No. 22); the Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing, China; and the Digital Agriculture Technology System Beijing Innovation Team (Grant No. BAIC10-2022-E10).

Author information


Corresponding authors

Correspondence to Yan Yan, Shuo Wang or Juan Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, H., Yan, Y., Wang, S. et al. Text classification on heterogeneous information network via enhanced GCN and knowledge. Neural Comput & Applic 35, 14911–14927 (2023). https://doi.org/10.1007/s00521-023-08494-0

