Abstract
Graph representation learning has attracted considerable attention in recent years. Existing graph neural networks that consume the complete graph are not scalable, because their computation and memory costs grow with graph size; capturing the rich information in large-scale graph data therefore remains a great challenge. Moreover, most of these methods focus on supervised learning and depend heavily on node labels, which are expensive to obtain in the real world. Unsupervised network embedding approaches, in contrast, overemphasize node proximity, so their learned representations can hardly be used directly in downstream tasks. Emerging self-supervised learning offers a potential solution to these problems, but existing self-supervised methods also operate on the complete graph and, in defining their mutual-information-based loss terms, are biased toward either global or very local (1-hop neighborhood) graph structure. In this paper, we propose Subg-Con, a novel self-supervised representation learning method based on sub-graph contrast, which exploits the strong correlation between central nodes and their sampled subgraphs to capture regional structure information. Instead of learning on the complete input graph, Subg-Con learns node representations through a contrastive loss defined on subgraphs sampled from the original graph with a novel data augmentation strategy. We further enhance subgraph representation learning via mutual information maximization to preserve more topology and feature information. Compared with existing graph representation learning approaches, Subg-Con offers prominent advantages in weaker supervision requirements, model learning scalability, and parallelization. Extensive experiments on multiple real-world large-scale benchmark datasets from different domains, covering various downstream tasks, verify both the effectiveness and the efficiency of our method against classic and state-of-the-art graph representation learning approaches.
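The core idea described above, sampling a subgraph around each central node and contrasting the node's embedding with its own subgraph summary versus another node's, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a GNN encoder and importance-based subgraph sampling, which are replaced here by plain breadth-first sampling, a mean readout over raw features, and a margin triplet loss; all function names are illustrative.

```python
import numpy as np


def sample_subgraph(adj, center, size):
    """Collect up to `size` nodes around `center` by breadth-first expansion.

    Stand-in for the paper's importance-score-based context sampler.
    `adj` is a dense 0/1 adjacency matrix.
    """
    visited, frontier = [center], [center]
    while frontier and len(visited) < size:
        nxt = []
        for u in frontier:
            for v in np.nonzero(adj[u])[0]:
                if v not in visited and len(visited) < size:
                    visited.append(int(v))
                    nxt.append(int(v))
        frontier = nxt
    return visited


def margin_contrastive_loss(h_center, s_pos, s_neg, margin=0.5):
    """Triplet-style contrastive loss: pull the central node's embedding
    toward its own subgraph summary (positive) and push it away from
    another node's subgraph summary (negative)."""
    pos = float(h_center @ s_pos)
    neg = float(h_center @ s_neg)
    return max(0.0, neg - pos + margin)


# Tiny demo on a 4-node path graph 0-1-2-3 with one-hot features.
adj = np.zeros((4, 4), dtype=int)
for i in range(3):
    adj[i, i + 1] = adj[i + 1, i] = 1
feats = np.eye(4)
sub = sample_subgraph(adj, 0, 3)          # BFS subgraph around node 0
summary = feats[sub].mean(axis=0)         # mean readout as subgraph summary
```

In the actual method the summary would come from a pooled GNN encoding of the sampled subgraph, and the negatives are the subgraph summaries of other central nodes in the same batch, which is what makes training parallelizable over independently sampled subgraphs.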
Acknowledgements
We appreciate the comments from the anonymous reviewers, which helped further improve our work. This work is funded in part by National Natural Science Foundation of China Projects No. U1636207 and No. U1936213, and is also partially supported by NSF grants IIS-1763365 and IIS-2106972 and by UC Davis.
Cite this article
Jiao, Y., Xiong, Y., Zhang, J. et al. Scalable self-supervised graph representation learning via enhancing and contrasting subgraphs. Knowl Inf Syst 64, 235–260 (2022). https://doi.org/10.1007/s10115-021-01635-8