ABSTRACT
While Graph Neural Networks (GNNs) have become the de facto standard in graph representation learning, they still suffer from label scarcity and poor generalization. To alleviate these issues, graph pre-training has been proposed to learn universal patterns from unlabeled data through self-supervised tasks. Most existing graph pre-training methods rely on a single self-supervised task, which leads to insufficient knowledge mining. Recently, some works have attempted to use multiple self-supervised tasks; however, we argue that these methods still suffer from a serious problem, which we call graph structure impairment. That is, structural gaps exist among the tasks due to the divergence of their optimization objectives, meaning that customized graph structures should be provided for different self-supervised tasks. Graph structure impairment not only significantly hurts the generalizability of pre-trained GNNs but also leads to suboptimal solutions, and no study so far addresses it well. Motivated by meta-cognitive theory, we propose a novel model named Core Graph Cognizing and Differentiating (CORE) to deal with this problem effectively. Specifically, CORE consists of a cognizing network and a differentiating process: the former cognizes a core graph that captures the essential structure of the input graph, and the latter allows it to differentiate into several task-specific graphs for different self-supervised tasks. Furthermore, this is the first study to combine graph pre-training with cognitive theory to build a cognition-aware model. Extensive experiments demonstrate the effectiveness of CORE.
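To make the two-stage idea in the abstract concrete, the sketch below shows one plausible way a cognizing network could produce a shared core graph (as soft edge weights) and per-task differentiating heads could adapt it into task-specific graphs. This is a minimal illustration in plain PyTorch under our own assumptions; all module and variable names (CognizingNetwork, DifferentiatingHead, core_weight, etc.) are hypothetical and do not reflect the authors' actual implementation.

```python
# Hypothetical sketch of "cognize a core graph, then differentiate per task".
# Not the authors' code; names and architecture choices are assumptions.
import torch
import torch.nn as nn

class CognizingNetwork(nn.Module):
    """Scores each edge of the input graph to cognize a shared core graph."""
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, x, edge_index):
        # edge_index: [2, E]; concatenate endpoint features and score each edge.
        src, dst = edge_index
        logits = self.scorer(torch.cat([x[src], x[dst]], dim=-1)).squeeze(-1)
        return torch.sigmoid(logits)  # soft edge weights in [0, 1] = core graph

class DifferentiatingHead(nn.Module):
    """Adapts the core graph into a task-specific graph for one SSL task."""
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.adjust = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, x, edge_index, core_weight):
        src, dst = edge_index
        delta = self.adjust(torch.cat([x[src], x[dst]], dim=-1)).squeeze(-1)
        # Task-specific edge weights stay anchored to the shared core graph.
        return torch.clamp(core_weight + torch.tanh(delta), 0.0, 1.0)

# Usage: one core graph, one differentiated graph per self-supervised task.
x = torch.randn(5, 16)                                  # 5 nodes, 16-dim features
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])  # toy edge list
cognize = CognizingNetwork(16, 32)
heads = nn.ModuleList(DifferentiatingHead(16, 32) for _ in range(3))  # 3 SSL tasks

core_w = cognize(x, edge_index)
task_graphs = [head(x, edge_index, core_w) for head in heads]  # per-task edge weights
```

In such a design, each self-supervised objective would be computed on its own reweighted graph while gradients still flow back into the shared core graph, which is how the structural gap between tasks could be absorbed without fragmenting the learned essential structure.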