Abstract
Graph representation learning has attracted considerable attention in recent years. Existing graph neural networks that consume the complete graph are not scalable, because their computation and memory costs grow with graph size; capturing the rich information in large-scale graph data therefore remains a great challenge. Moreover, most of these methods focus on supervised learning and depend heavily on node labels, which are expensive to obtain in the real world. Unsupervised network embedding approaches, in contrast, overemphasize node proximity, so their learned representations can hardly be used directly in downstream tasks. Emerging self-supervised learning offers a potential solution to these problems, but existing self-supervised methods also operate on the complete graph and, in defining their mutual-information-based loss terms, are biased toward either global or very local (1-hop neighborhood) graph structure. In this paper, we propose Subg-Con, a novel self-supervised representation learning method based on sub-graph contrast, which exploits the strong correlation between central nodes and their sampled subgraphs to capture regional structure information. Instead of learning on the complete input graph, Subg-Con learns node representations through a contrastive loss defined on subgraphs sampled from the original graph with a novel data augmentation strategy. We further enhance subgraph representation learning via mutual information maximization to preserve more topology and feature information. Compared with existing graph representation learning approaches, Subg-Con offers prominent advantages in weaker supervision requirements, model learning scalability, and parallelization. Extensive experiments on multiple real-world large-scale benchmark datasets from different domains, covering various downstream tasks, verify both the effectiveness and the efficiency of our method against classic and state-of-the-art graph representation learning approaches.
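The core idea described above, sampling a subgraph around each central node and contrasting the node's embedding with its own subgraph summary versus another node's, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a GNN encoder and importance-based subgraph sampling, which are replaced here by plain breadth-first sampling, a mean readout over raw features, and a margin triplet loss; all function names are illustrative.

```python
import numpy as np


def sample_subgraph(adj, center, size):
    """Collect up to `size` nodes around `center` by breadth-first expansion.

    Stand-in for the paper's importance-score-based context sampler.
    `adj` is a dense 0/1 adjacency matrix.
    """
    visited, frontier = [center], [center]
    while frontier and len(visited) < size:
        nxt = []
        for u in frontier:
            for v in np.nonzero(adj[u])[0]:
                if v not in visited and len(visited) < size:
                    visited.append(int(v))
                    nxt.append(int(v))
        frontier = nxt
    return visited


def margin_contrastive_loss(h_center, s_pos, s_neg, margin=0.5):
    """Triplet-style contrastive loss: pull the central node's embedding
    toward its own subgraph summary (positive) and push it away from
    another node's subgraph summary (negative)."""
    pos = float(h_center @ s_pos)
    neg = float(h_center @ s_neg)
    return max(0.0, neg - pos + margin)


# Tiny demo on a 4-node path graph 0-1-2-3 with one-hot features.
adj = np.zeros((4, 4), dtype=int)
for i in range(3):
    adj[i, i + 1] = adj[i + 1, i] = 1
feats = np.eye(4)
sub = sample_subgraph(adj, 0, 3)          # BFS subgraph around node 0
summary = feats[sub].mean(axis=0)         # mean readout as subgraph summary
```

In the actual method the summary would come from a pooled GNN encoding of the sampled subgraph, and the negatives are the subgraph summaries of other central nodes in the same batch, which is what makes training parallelizable over independently sampled subgraphs.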
Acknowledgements
We appreciate the comments from the anonymous reviewers, which helped further improve our work. This work is funded in part by National Natural Science Foundation of China Projects No. U1636207 and No. U1936213, and is also partially supported by NSF grants IIS-1763365 and IIS-2106972 and by UC Davis.
Cite this article
Jiao, Y., Xiong, Y., Zhang, J. et al. Scalable self-supervised graph representation learning via enhancing and contrasting subgraphs. Knowl Inf Syst 64, 235–260 (2022). https://doi.org/10.1007/s10115-021-01635-8