
Graph topology enhancement for text classification

Published in Applied Intelligence

Abstract

Graph neural networks (GNNs) can model complex network structures, including the syntactic structure of natural language, which makes them well suited to text classification. However, most GNN approaches do not fully exploit the topological information available in document graphs. In this paper, we propose a topologically enhanced text classification method that makes full use of the structural features of both a corpus graph and sentence graphs. Specifically, we construct two kinds of graphs from contextual information: sentence graphs and a corpus graph. We extract topological features of words from the corpus graph and inject them into a GNN model that classifies sentence graphs. To integrate the topological features effectively, we propose an asynchronous weighted propagation scheme that selectively fuses them with the original features of the word nodes, and we integrate document-level features to predict the final result. Extensive experiments on eight datasets demonstrate the effectiveness of our method.
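The page does not reproduce the authors' graph construction in detail. As an illustration only, a minimal pure-Python sketch of the first step described above, building a word co-occurrence corpus graph and extracting per-word topological features such as degree and local clustering coefficient, might look like the following (the window size, tokenisation, and choice of features are assumptions, not the paper's exact design):

```python
import itertools


def build_corpus_graph(documents, window=2):
    """Build an undirected word co-occurrence graph over the whole corpus.

    Returns an adjacency map: word -> set of words that co-occur with it
    within a sliding window of the given size.
    """
    adj = {}
    for doc in documents:
        tokens = doc.lower().split()
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + window + 1]:
                if w != v:
                    adj.setdefault(w, set()).add(v)
                    adj.setdefault(v, set()).add(w)
    return adj


def topological_features(adj):
    """Per-word topological features: (degree, local clustering coefficient)."""
    feats = {}
    for w, nbrs in adj.items():
        deg = len(nbrs)
        # Count edges among the neighbours of w.
        links = sum(1 for a, b in itertools.combinations(nbrs, 2) if b in adj[a])
        clustering = 2 * links / (deg * (deg - 1)) if deg > 1 else 0.0
        feats[w] = (deg, clustering)
    return feats


corpus = ["the cat sat on the mat", "the dog sat"]
features = topological_features(build_corpus_graph(corpus))
```

In the pipeline the abstract describes, per-word feature vectors of this kind would then be fused with the word nodes' original embeddings in the sentence-graph GNN via the proposed asynchronous weighted propagation scheme.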




Acknowledgements

This research is supported by the National Natural Science Foundation of China (62077027), the Ministry of Science and Technology of the People’s Republic of China (2018YFC2002500), the Jilin Province Development and Reform Commission, China (2019C053-1), the Education Department of Jilin Province, China (JJKH20200993K), the Department of Science and Technology of Jilin Province, China (20200801002GH), and the European Union’s Horizon 2020 FET Proactive project “WeNet-The Internet of us” (No. 823783).

Author information

Corresponding author

Correspondence to Hao Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Song, R., Giunchiglia, F., Zhao, K. et al. Graph topology enhancement for text classification. Appl Intell 52, 15091–15104 (2022). https://doi.org/10.1007/s10489-021-03113-8
