Abstract
Graph embeddings have been tremendously successful at producing node representations that are discriminative for downstream tasks. In this paper, we study the problem of graph transfer learning: given two graphs and labels on the nodes of the first graph, we wish to predict the labels of the nodes of the second graph. We propose a tractable, non-combinatorial method for solving the graph transfer learning problem by combining classification and embedding losses with a continuous, convex penalty motivated by tractable graph distances. We demonstrate that our method successfully predicts labels across graphs with almost perfect accuracy; in the same scenarios, training embeddings through standard methods leads to predictions that are no better than random.
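To make the idea of a continuous, convex penalty concrete, the sketch below uses one form that appears in the tractable graph distance literature: the squared Frobenius norm of AP - PB, where A and B are adjacency matrices and P is relaxed from a permutation matrix to a doubly stochastic matrix. This is an illustrative assumption, not necessarily the exact penalty used in the paper; the function names `alignment_penalty` and `sinkhorn` and the toy graph are hypothetical.

```python
import numpy as np

def alignment_penalty(A, B, P):
    """Squared Frobenius norm ||A P - P B||_F^2 (assumed penalty; convex in P)."""
    R = A @ P - P @ B
    return float(np.sum(R * R))

def sinkhorn(M, n_iters=200):
    """Scale a positive matrix toward a doubly stochastic one by
    alternating row and column normalization."""
    P = M.copy()
    for _ in range(n_iters):
        P = P / P.sum(axis=1, keepdims=True)  # rows sum to 1
        P = P / P.sum(axis=0, keepdims=True)  # columns sum to 1
    return P

rng = np.random.default_rng(0)
n = 6

# Toy source graph: random symmetric 0/1 adjacency matrix without self-loops.
U = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
A = (U + U.T).astype(float)

# Target graph: the same graph with its nodes relabeled by a permutation.
Pi = np.eye(n)[:, rng.permutation(n)]
B = Pi.T @ A @ Pi

# The true relabeling incurs zero penalty, since A Pi = Pi B.
penalty_at_truth = alignment_penalty(A, B, Pi)

# Two relaxed (doubly stochastic) candidates and their midpoint.
P1 = sinkhorn(rng.random((n, n)) + 0.1)
P2 = sinkhorn(rng.random((n, n)) + 0.1)
mid = 0.5 * (P1 + P2)

# Convexity: the penalty at the midpoint is at most the average penalty.
lhs = alignment_penalty(A, B, mid)
rhs = 0.5 * alignment_penalty(A, B, P1) + 0.5 * alignment_penalty(A, B, P2)
print(penalty_at_truth, lhs <= rhs + 1e-12)
```

Because the penalty is a squared norm of a linear function of P, it is convex over the set of doubly stochastic matrices, so it can be added to differentiable classification and embedding losses and minimized with continuous optimization rather than a combinatorial search over permutations.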
Acknowledgements
The authors gratefully acknowledge support by the National Science Foundation (Grants IIS-1741197, CCF-1750539) and Google via GCP credit support.
About this article
Cite this article
Gritsenko, A., Shayestehfard, K., Guo, Y. et al. Graph transfer learning. Knowl Inf Syst 65, 1627–1656 (2023). https://doi.org/10.1007/s10115-022-01782-6
DOI: https://doi.org/10.1007/s10115-022-01782-6