Graph-based Text Classification by Contrastive Learning with Text-level Graph Augmentation

Published: 13 February 2024

Abstract

Text Classification (TC) is a fundamental task in the information retrieval community. Nowadays, mainstream TC methods are built on deep neural networks, which learn far more discriminative text features than traditional shallow learning methods. Among existing deep TC methods, those based on Graph Neural Networks (GNNs) have attracted particular attention due to their superior performance. Technically, GNN-based TC methods typically transform the full training dataset into a single graph of texts; however, they often neglect the dependencies between words and thus miss potential semantic information that may be significant for representing texts precisely. To solve this problem, we instead generate graphs of words, so as to capture word dependency information. Specifically, each text is translated into a graph of words in which neighboring words are linked. We learn the node features of words by a GNN-like procedure and then aggregate them into a graph feature that represents the text. To further improve the text representations, we introduce a contrastive learning regularization term: for each original text graph, we generate two augmented text graphs, and we constrain the representations of the two augmented graphs from the same text to be close and those from different texts to be far apart. We propose various techniques for generating the augmented graphs. Building on these ideas, we develop a novel deep TC model, namely Text-level Graph Networks with Contrastive Learning (TGNcl). We conduct a number of experiments to evaluate the proposed TGNcl model. The empirical results demonstrate that TGNcl outperforms existing state-of-the-art TC models.
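The pipeline described in the abstract — build a word graph per text, propagate node features, pool a graph-level representation, and apply a contrastive loss over two augmented views — can be sketched roughly as follows. This is an illustrative approximation, not the authors' TGNcl implementation: the sliding-window size, the mean-aggregation update, the edge-dropping augmentation, and the NT-Xent-style contrastive loss are all assumptions standing in for the paper's specific choices.

```python
import numpy as np

def build_text_graph(tokens, window=2):
    """Turn one text into a graph of words: nodes are unique words,
    and words within `window` positions of each other are linked."""
    vocab = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
    n = len(vocab)
    adj = np.zeros((n, n))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            a, b = vocab[w], vocab[tokens[j]]
            adj[a, b] = adj[b, a] = 1.0
    return vocab, adj

def graph_feature(adj, feats, hops=2):
    """GNN-like propagation (mean aggregation over neighbors plus self),
    followed by mean pooling into a single graph-level vector."""
    deg = adj.sum(axis=1, keepdims=True) + 1.0  # +1 for the self term
    h = feats
    for _ in range(hops):
        h = (h + adj @ h) / deg
    return h.mean(axis=0)

def drop_edges(adj, rate, rng):
    """One possible graph augmentation: randomly drop edges,
    keeping the adjacency matrix symmetric."""
    keep = rng.random(adj.shape) >= rate
    keep = np.triu(keep, 1)
    return adj * (keep | keep.T)

def nt_xent(z1, z2, tau=0.5):
    """Contrastive loss: pull the two views of the same text together,
    push views of different texts apart. z1, z2: (B, d) view batches."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)          # exclude self-similarity
    B = len(z1)
    pos = np.concatenate([np.arange(B, 2 * B), np.arange(B)])
    logits = sim - sim.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[np.arange(2 * B), pos].mean()
```

In a full model, `graph_feature` would be a trainable GNN and the contrastive term would be added to the supervised classification loss; here the pieces only demonstrate the data flow of one text through graph construction, augmentation, pooling, and the contrastive objective.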


Cited By

  • (2025) Few-shot Hierarchical Text Classification with Bidirectional Path Constraint by label weighting. Pattern Recognition Letters 190 (Apr 2025), 81–88. DOI: 10.1016/j.patrec.2025.01.025
  • (2024) Escaping the neutralization effect of modality features fusion in multimodal Fake News Detection. Information Fusion 111 (Nov 2024), 102500. DOI: 10.1016/j.inffus.2024.102500
  • (2024) A criteria-based classification model using augmentation and contrastive learning for analyzing imbalanced statement data. Heliyon 10, 12 (Jun 2024), e32929. DOI: 10.1016/j.heliyon.2024.e32929

Published In

ACM Transactions on Knowledge Discovery from Data  Volume 18, Issue 4
May 2024
707 pages
EISSN: 1556-472X
DOI: 10.1145/3613622

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 February 2024
Online AM: 22 December 2023
Accepted: 15 December 2023
Received: 14 October 2022
Published in TKDD Volume 18, Issue 4


Author Tags

  1. Multi-label classification
  2. graph representation
  3. label correlation
  4. contrastive learning
  5. graph augmentation

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China (NSFC)

Article Metrics

  • Downloads (last 12 months): 694
  • Downloads (last 6 weeks): 52
Reflects downloads up to 28 Feb 2025

