Abstract
Graph neural networks (GNNs) are expressive models for graph data. However, their large storage requirements and high computational complexity make these cumbersome models difficult to deploy in resource-constrained environments. As a representative model compression strategy, knowledge distillation (KD) has been introduced into graph analysis research to address this problem. Nevertheless, existing graph knowledge distillation algorithms still face crucial challenges, such as the effectiveness of knowledge transfer and the design of the student model. To address these problems, a new graph distillation model is proposed in this paper. Specifically, a layer-wise mapping strategy is designed to distill knowledge for training the student model, in which the staged knowledge learned by intermediate layers of the teacher GNN is captured to form supervision signals. In addition, an adaptive weight mechanism is developed to evaluate the importance of the distilled knowledge. On this basis, a structure-perception MLP is constructed as the student model, which captures prior information of the input graph from the perspectives of node features and topology structure. In this way, the proposed model shares the prediction advantage of GNNs and the low inference latency of MLPs. Node classification experiments on five benchmark datasets demonstrate the validity and superiority of our model over baseline algorithms.
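To make the layer-wise distillation idea concrete, the sketch below shows one plausible PyTorch formulation: a structure-aware MLP student that mixes raw node features with a normalized-adjacency view of them, and a distillation loss that matches the student's intermediate representations to the teacher's under learnable, softmax-normalized layer weights. All names (`StructureAwareMLP`, `layerwise_distill_loss`, `alpha_logits`) and the exact loss form are illustrative assumptions under this reading of the abstract, not the authors' released implementation.

```python
# Minimal PyTorch-style sketch of layer-wise distillation into a structure-aware
# MLP student. Names, loss form, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StructureAwareMLP(nn.Module):
    """MLP student that injects one-hop structural context into node features."""

    def __init__(self, in_dim, hid_dim, out_dim, num_layers=3):
        super().__init__()
        dims = [in_dim] + [hid_dim] * (num_layers - 1) + [out_dim]
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)]
        )

    def forward(self, x, adj_norm):
        # adj_norm: row-normalized sparse adjacency; it only adds a topological
        # view of the features, so inference stays close to a plain MLP.
        h = x + torch.sparse.mm(adj_norm, x)
        hidden = []
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if i < len(self.layers) - 1:
                h = F.relu(h)
                hidden.append(h)          # intermediate states for distillation
        return h, hidden                  # (logits, list of hidden states)


def layerwise_distill_loss(student_hidden, teacher_hidden, alpha_logits):
    """Adaptively weighted sum of per-layer feature-matching losses.

    Assumes teacher and student hidden sizes already match; in practice a
    per-layer linear projection would align them.
    """
    weights = torch.softmax(alpha_logits, dim=0)   # learned layer importance
    loss = 0.0
    for w, hs, ht in zip(weights, student_hidden, teacher_hidden):
        loss = loss + w * F.mse_loss(hs, ht.detach())
    return loss


def training_loss(student, teacher_logits, teacher_hidden, x, adj_norm, labels,
                  alpha_logits, temperature=2.0, lam=1.0):
    """Hard-label loss + soft-label KD + weighted layer-wise matching."""
    logits, student_hidden = student(x, adj_norm)
    soft = F.kl_div(
        F.log_softmax(logits / temperature, dim=-1),
        F.softmax(teacher_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(logits, labels)  # masking to labeled nodes omitted
    return hard + soft + lam * layerwise_distill_loss(
        student_hidden, teacher_hidden, alpha_logits
    )
```

In such a setup, `alpha_logits` would be created as `nn.Parameter(torch.zeros(len(teacher_hidden)))` and optimized jointly with the student, so teacher layers whose staged knowledge transfers well receive larger weights, while inference requires only the MLP forward pass plus a single sparse product.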
Data availability
The experimental datasets used in the paper can be accessed through the following link: https://github.com/BUPT-GAMMA/CPF/tree/master/data. We have strictly followed the rules governing the use of these datasets and have ensured that the data are legally available and legally used.
Acknowledgements
The authors are very grateful to the reviewers and editors for their suggestions. This work is supported by the National Natural Science Foundation of China (U21A20513, 62076154, 62276159, 62276161, T2122020), the Key R&D Program of Shanxi Province (202202020101003, 202302010101007), and the Fundamental Research Program of Shanxi Province (202303021221055).
Author information
Contributions
Hangyuan Du: proposing the method, designing experiments, revising the original draft, editing. Rong Yu: designing the model, designing experiments, writing the original draft, revising the original draft. Liang Bai: improving the model, revising the original draft. Lu Bai: experimental analysis, improving the language. Wenjian Wang: preparing tables and figures, editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Du, H., Yu, R., Bai, L. et al. Learning structure perception MLPs on graphs: a layer-wise graph knowledge distillation framework. Int. J. Mach. Learn. & Cyber. 15, 4357–4372 (2024). https://doi.org/10.1007/s13042-024-02150-2
DOI: https://doi.org/10.1007/s13042-024-02150-2