
Meta-path-based heterogeneous graph neural networks in academic network

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Heterogeneous graph representation learning aims to learn meaningful low-dimensional representation vectors from heterogeneous networks that capture both the structure of a network and the attributes of its nodes. These embedding vectors are the basis of complex network analysis and can be used in downstream tasks such as classification, clustering, link prediction, and recommendation. The key issues in heterogeneous graph neural networks are how to define heterogeneous neighbors and how to aggregate them. Although considerable research has been devoted to homogeneous and heterogeneous network representation, methods that effectively combine information on network structure with node attributes, and especially ones that make effective use of meta-paths carrying specific semantic information, remain rare. Here, a meta-path-based heterogeneous graph neural network model is proposed. Meta-paths are used to sample the heterogeneous neighbors of each node in the network, and the features of sampled nodes of the same type are aggregated to form type-related embeddings. A multi-head attention mechanism is then applied to aggregate information across neighbors of different node types, and the model is trained by minimizing a context loss. Experiments on classification, clustering, link prediction, and recommendation tasks verified the validity of the model, which significantly improved on the results of baseline methods.
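The pipeline outlined in the abstract (meta-path-guided neighbor sampling, per-type aggregation, attention across types) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the toy graph, the helper names, the two-dimensional placeholder features, mean pooling for the per-type step, and a single dot-product attention head in place of the multi-head mechanism.

```python
import math
import random
from collections import defaultdict

# Toy academic graph (hypothetical data). The first character of a node id
# encodes its type: "a" = author, "p" = paper, "v" = venue.
EDGES = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2"),
         ("p1", "v1"), ("p2", "v1")]

def build_adjacency(edges):
    """Undirected adjacency list keyed by node id."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def node_type(node):
    return node[0]

def metapath_walk(adj, start, metapath, rng):
    """Follow the type pattern in `metapath` (e.g. "apa") from `start`.

    Stops early if no neighbor of the required type exists at some step.
    """
    assert node_type(start) == metapath[0]
    walk = [start]
    for wanted in metapath[1:]:
        candidates = [n for n in adj[walk[-1]] if node_type(n) == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk

def sample_typed_neighbors(adj, start, metapaths, walks_per_path, rng):
    """Group every node reached by the walks by node type, excluding `start`."""
    grouped = defaultdict(set)
    for mp in metapaths:
        for _ in range(walks_per_path):
            for n in metapath_walk(adj, start, mp, rng)[1:]:
                if n != start:
                    grouped[node_type(n)].add(n)
    return {t: sorted(ns) for t, ns in grouped.items()}

def mean_vector(vectors):
    """Element-wise mean: a stand-in for the per-type aggregator."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def attention_combine(type_embeddings, query):
    """One attention head: softmax over dot-product scores, then a weighted sum."""
    types = sorted(type_embeddings)
    scores = [sum(q * x for q, x in zip(query, type_embeddings[t])) for t in types]
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    combined = [sum(w * type_embeddings[t][i] for w, t in zip(weights, types))
                for i in range(len(query))]
    return combined, dict(zip(types, weights))

rng = random.Random(0)
adj = build_adjacency(EDGES)
neighbors = sample_typed_neighbors(adj, "a2", ["apa", "apvpa"], 20, rng)
features = {n: [float(len(adj[n])), 1.0] for n in adj}   # placeholder features
type_emb = {t: mean_vector([features[n] for n in ns]) for t, ns in neighbors.items()}
combined, weights = attention_combine(type_emb, features["a2"])
print(neighbors)
print(weights)
```

In a trained model the features, the aggregator, and the attention parameters would all be learned; the sketch only shows how sampled neighbors are grouped by type before being combined.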

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 62073333.

Author information

Corresponding authors

Correspondence to Guangquan Cheng or Zhong Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: detailed experimental data

  • MP_1H: meta-path-based sampling, 1 attention head, 128 dimensions, structural and textual embeddings;

  • MP_2H: meta-path-based sampling, 2 attention heads, 128 dimensions, structural and textual embeddings;

  • MP_3H: meta-path-based sampling, 3 attention heads, 128 dimensions, structural and textual embeddings;

  • MP_4H: meta-path-based sampling, 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP_S: meta-path-based sampling, 4 attention heads, 128 dimensions, structural embedding only;

  • MP_T: meta-path-based sampling, 4 attention heads, 128 dimensions, textual embedding only;

  • MP2: meta-path-based sampling (combination 2), 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP3: meta-path-based sampling (combination 3), 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP4: meta-path-based sampling (combination 4), 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP_W: meta-path-based sampling with weights assigned to neighbors, 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP2_W: meta-path-based sampling (combination 2) with weights assigned to neighbors, 4 attention heads, 128 dimensions, structural and textual embeddings;

  • RWR: random-walk-with-restart-based sampling, 4 attention heads, 128 dimensions, structural and textual embeddings;

  • Mix: half of the neighbors sampled by MP and half sampled by RWR, 4 attention heads, 128 dimensions, structural and textual embeddings;

  • MP_E64: meta-path-based sampling, 4 attention heads, 64 dimensions, structural and textual embeddings;

  • MP_E32: meta-path-based sampling, 4 attention heads, 32 dimensions, structural and textual embeddings;

  • MP_E16: meta-path-based sampling, 4 attention heads, 16 dimensions, structural and textual embeddings;

  • MP2V: three meta-paths (APA, APVPA, and APPA), 128 dimensions, walk length 30, 10 walks per node;

  • HAN: textual embedding, 128 dimensions, three meta-paths (APA, APPA, and APVPA);

  • ASNE: concatenated structural and textual embeddings, 128 dimensions;

  • SHNE: 128 dimensions, walk length 30, 10 walks per node, textual embedding;

  • GSAGE: concatenated structural and textual embeddings, 128 dimensions;

  • GAT: concatenated structural and textual embeddings, 128 dimensions;

  • HetGNN: combined structural and textual embeddings, 128 dimensions.
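The variants of the proposed model listed above differ in only a handful of settings, so they can be written down as a small configuration table. The sketch below is one hypothetical encoding: the class name, field names, and the convention that combination 1 denotes the default meta-path set are assumptions for illustration, not identifiers from the article.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VariantConfig:
    """Hypothetical record of one ablation variant from the appendix list."""
    name: str
    sampling: str = "meta-path"        # "meta-path", "rwr", or "mix"
    heads: int = 4                     # number of attention heads
    dim: int = 128                     # embedding dimension
    use_structural: bool = True
    use_textual: bool = True
    weighted_neighbors: bool = False
    metapath_combination: int = 1      # assumption: 1 stands for the default set

VARIANTS = {c.name: c for c in [
    VariantConfig("MP_1H", heads=1),
    VariantConfig("MP_2H", heads=2),
    VariantConfig("MP_3H", heads=3),
    VariantConfig("MP_4H"),
    VariantConfig("MP_S", use_textual=False),
    VariantConfig("MP_T", use_structural=False),
    VariantConfig("MP2", metapath_combination=2),
    VariantConfig("MP3", metapath_combination=3),
    VariantConfig("MP4", metapath_combination=4),
    VariantConfig("MP_W", weighted_neighbors=True),
    VariantConfig("MP2_W", weighted_neighbors=True, metapath_combination=2),
    VariantConfig("RWR", sampling="rwr"),
    VariantConfig("Mix", sampling="mix"),
    VariantConfig("MP_E64", dim=64),
    VariantConfig("MP_E32", dim=32),
    VariantConfig("MP_E16", dim=16),
]}

# Example query: which variants change only the embedding dimension?
dim_ablation = sorted(n for n, c in VARIANTS.items() if c.dim != 128)
print(dim_ablation)
```

Keeping MP_4H as the all-defaults entry makes each ablation readable as a single-field change from that reference configuration.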

| Model | LP/a–a/2013 AUC | LP/a–a/2013 F1 | LP/a–a/2012 AUC | LP/a–a/2012 F1 | LP/a–p/2013 AUC | LP/a–p/2013 F1 | LP/a–p/2012 AUC | LP/a–p/2012 F1 |
|---|---|---|---|---|---|---|---|---|
| MP_1H | 76.09 ± 0.11 | 71.35 ± 0.19 | 76.05 ± 0.09 | 71.26 ± 0.15 | 78.41 ± 0.13 | 76.61 ± 0.22 | 78.13 ± 0.11 | 76.27 ± 0.17 |
| MP_2H | 77.14 ± 0.08 | 73.10 ± 0.14 | 76.87 ± 0.08 | 72.72 ± 0.14 | 78.82 ± 0.07 | 77.11 ± 0.12 | 78.82 ± 0.11 | 77.09 ± 0.18 |
| MP_3H | 77.16 ± 0.10 | 73.16 ± 0.15 | 77.25 ± 0.09 | 72.99 ± 0.15 | 79.16 ± 0.09 | 77.44 ± 0.13 | 78.95 ± 0.11 | 77.25 ± 0.20 |
| MP_4H | 77.35 ± 0.10 | 72.61 ± 0.16 | 77.25 ± 0.12 | 72.51 ± 0.19 | 79.22 ± 0.09 | 77.37 ± 0.16 | 78.85 ± 0.09 | 76.99 ± 0.14 |
| MP_S | 73.83 ± 0.11 | 65.16 ± 0.22 | 73.65 ± 0.07 | 64.96 ± 0.13 | 78.11 ± 0.11 | 74.27 ± 0.19 | 78.52 ± 0.09 | 74.93 ± 0.15 |
| MP_T | 74.57 ± 0.08 | 70.55 ± 0.15 | 74.47 ± 0.10 | 70.41 ± 0.19 | 77.51 ± 0.14 | 75.95 ± 0.24 | 77.23 ± 0.05 | 75.67 ± 0.10 |
| MP2 | 76.36 ± 0.11 | 71.12 ± 0.20 | 76.38 ± 0.09 | 71.18 ± 0.15 | 75.98 ± 0.06 | 73.32 ± 0.14 | 76.00 ± 0.09 | 73.40 ± 0.14 |
| MP3 | 74.28 ± 0.12 | 67.83 ± 0.21 | 74.44 ± 0.07 | 68.05 ± 0.12 | 74.92 ± 0.09 | 71.27 ± 0.18 | 77.30 ± 0.10 | 74.75 ± 0.15 |
| MP4 | 74.11 ± 0.11 | 68.17 ± 0.20 | 76.80 ± 0.09 | 74.21 ± 0.17 | 74.03 ± 0.09 | 68.08 ± 0.17 | 77.75 ± 0.08 | 75.46 ± 0.13 |
| MP_W | 73.14 ± 0.09 | 66.23 ± 0.13 | 73.02 ± 0.12 | 66.15 ± 0.23 | 76.42 ± 0.08 | 73.48 ± 0.12 | 76.89 ± 0.08 | 74.20 ± 0.14 |
| MP2_W | 75.60 ± 0.13 | 70.43 ± 0.24 | 75.65 ± 0.10 | 70.43 ± 0.17 | 75.62 ± 0.10 | 73.33 ± 0.20 | 76.15 ± 0.09 | 74.05 ± 0.14 |
| RWR | 76.26 ± 0.06 | 72.15 ± 0.10 | 76.25 ± 0.07 | 72.10 ± 0.10 | 77.78 ± 0.05 | 76.11 ± 0.07 | 77.78 ± 0.10 | 76.09 ± 0.17 |
| Mix | 75.89 ± 0.09 | 71.21 ± 0.15 | 75.71 ± 0.07 | 71.04 ± 0.12 | 76.41 ± 0.08 | 74.88 ± 0.15 | 76.62 ± 0.08 | 75.11 ± 0.16 |
| MP_E64 | 74.69 ± 0.09 | 70.13 ± 0.15 | 74.79 ± 0.06 | 70.26 ± 0.08 | 77.86 ± 0.07 | 76.37 ± 0.11 | 77.91 ± 0.08 | 76.38 ± 0.14 |
| MP_E32 | 73.17 ± 0.05 | 68.79 ± 0.10 | 73.19 ± 0.05 | 68.82 ± 0.10 | 75.83 ± 0.13 | 73.88 ± 0.25 | 75.49 ± 0.06 | 73.56 ± 0.13 |
| MP_E16 | 72.06 ± 0.09 | 68.98 ± 0.18 | 71.75 ± 0.09 | 68.61 ± 0.18 | 75.09 ± 0.05 | 74.42 ± 0.13 | 74.93 ± 0.06 | 74.17 ± 0.12 |
| MP_2H_C | 75.98 ± 0.08 | 71.43 ± 0.15 | 76.17 ± 0.07 | 71.66 ± 0.13 | 78.64 ± 0.09 | 77.06 ± 0.14 | 78.59 ± 0.08 | 76.95 ± 0.14 |
| MP_4H_C | 75.72 ± 0.07 | 71.29 ± 0.12 | 76.01 ± 0.07 | 72.89 ± 0.13 | 78.17 ± 0.11 | 76.53 ± 0.18 | 78.27 ± 0.14 | 76.50 ± 0.24 |
| HAN | 71.11 ± 0.13 | 69.72 ± 0.25 | 71.24 ± 0.14 | 69.73 ± 0.23 | - | - | - | - |
| MP2V | 59.6 | 34.8 | 58.6 | 31.8 | 71.2 | 64.7 | 72.4 | 66.4 |
| ASNE | 36.9 | 64.3 | 67.1 | 61.5 | 72.1 | 71.3 | 72.6 | 73.7 |
| SHNE | 68.3 | 63.9 | 67.2 | 61.2 | 69.5 | 67.4 | 70.6 | 69.2 |
| GSAGE | 69.5 | 61.5 | 67.6 | 57.3 | 71.4 | 66.4 | 73.9 | 70.6 |
| GAT | 67.8 | 61.3 | 65.5 | 56 | 73.2 | 70.5 | 75 | 71.5 |
| HetGNN | 71.7 | 66.9 | 70.1 | 64.2 | 76.7 | 75.4 | 77.5 | 75.7 |

| Model | Recommendation/a–v Recall | Recommendation/a–v Precision | Recommendation/a–v F1 | Classification/10% Macro-F1 | Classification/10% Micro-F1 | Classification/30% Macro-F1 | Classification/30% Micro-F1 | Clustering NMI | Clustering ARI |
|---|---|---|---|---|---|---|---|---|---|
| MP_1H | 71.15 ± 0.05 | 31.14 ± 0.02 | 43.32 ± 0.03 | 96.41 ± 0.10 | 96.52 ± 0.10 | 96.70 ± 0.07 | 96.80 ± 0.06 | 87.03 | 90.63 |
| MP_2H | 72.08 ± 0.04 | 31.63 ± 0.01 | 43.97 ± 0.02 | 96.96 ± 0.09 | 97.07 ± 0.08 | 97.12 ± 0.14 | 97.24 ± 0.15 | 88.36 | 91.57 |
| MP_3H | 72.34 ± 0.05 | 31.78 ± 0.02 | 44.16 ± 0.03 | 96.93 ± 0.11 | 97.04 ± 0.12 | 96.94 ± 0.08 | 97.05 ± 0.09 | 87.95 | 91.12 |
| MP_4H | 71.61 ± 0.05 | 31.39 ± 0.02 | 43.65 ± 0.03 | 96.95 ± 0.10 | 97.05 ± 0.10 | 97.15 ± 0.15 | 97.26 ± 0.15 | 88.97 | 92.22 |
| MP_S | 70.37 ± 0.05 | 30.86 ± 0.02 | 42.90 ± 0.02 | 96.70 ± 0.10 | 96.82 ± 0.10 | 96.84 ± 0.16 | 96.96 ± 0.16 | 86.95 | 90.61 |
| MP_T | 68.18 ± 0.05 | 29.90 ± 0.02 | 41.57 ± 0.03 | 96.26 ± 0.11 | 96.36 ± 0.11 | 96.43 ± 0.15 | 96.56 ± 0.16 | 84.49 | 87.44 |
| MP2 | 70.35 ± 0.05 | 30.79 ± 0.02 | 42.83 ± 0.03 | 96.43 ± 0.09 | 96.56 ± 0.09 | 96.58 ± 0.13 | 96.71 ± 0.12 | 87.92 | 91.41 |
| MP3 | 71.20 ± 0.03 | 31.25 ± 0.01 | 43.44 ± 0.02 | 96.28 ± 0.08 | 96.41 ± 0.08 | 96.35 ± 0.13 | 96.48 ± 0.12 | 86.66 | 89.99 |
| MP4 | 66.81 ± 0.00 | 29.06 ± 0.00 | 40.50 ± 0.00 | 98.60 ± 0.11 | 98.62 ± 0.10 | 98.69 ± 0.08 | 98.71 ± 0.08 | 92.92 | 95.44 |
| MP_W | 68.11 ± 0.03 | 29.80 ± 0.01 | 41.46 ± 0.02 | 97.15 ± 0.11 | 97.25 ± 0.10 | 97.27 ± 0.15 | 97.38 ± 0.16 | 87.18 | 90.45 |
| MP2_W | 70.77 ± 0.03 | 30.93 ± 0.01 | 43.05 ± 0.01 | 96.01 ± 0.16 | 96.16 ± 0.16 | 96.14 ± 0.15 | 96.30 ± 0.15 | 84.39 | 88.47 |
| RWR | 67.19 ± 0.33 | 29.41 ± 0.12 | 40.91 ± 0.18 | 97.05 ± 0.07 | 97.09 ± 0.07 | 97.14 ± 0.12 | 97.18 ± 0.11 | 88.62 | 92.11 |
| Mix | 70.09 ± 0.04 | 30.77 ± 0.02 | 42.77 ± 0.02 | 96.89 ± 0.14 | 96.92 ± 0.14 | 97.00 ± 0.13 | 97.14 ± 0.13 | 87.36 | 90.95 |
| MP_E64 | 69.35 ± 0.05 | 30.48 ± 0.02 | 42.35 ± 0.03 | 96.66 ± 0.07 | 96.75 ± 0.07 | 96.70 ± 0.07 | 96.78 ± 0.08 | 88.41 | 91.52 |
| MP_E32 | 64.31 ± 0.04 | 27.88 ± 0.02 | 38.89 ± 0.02 | 96.18 ± 0.08 | 96.37 ± 0.10 | 96.30 ± 0.13 | 96.49 ± 0.12 | 86.91 | 90.42 |
| MP_E16 | 57.66 ± 0.05 | 25.26 ± 0.02 | 35.13 ± 0.03 | 93.11 ± 0.15 | 93.37 ± 0.16 | 93.71 ± 0.16 | 93.94 ± 0.16 | 71.69 | 76.22 |
| MP_2H_C | 72.27 ± 0.03 | 31.81 ± 0.01 | 44.18 ± 0.02 | 96.69 ± 0.07 | 96.81 ± 0.06 | 96.85 ± 0.10 | 96.97 ± 0.10 | 87.32 | 90.66 |
| MP_4H_C | 71.55 ± 0.04 | 31.43 ± 0.01 | 46.67 ± 0.02 | 96.51 ± 0.14 | 96.63 ± 0.15 | 96.65 ± 0.11 | 96.76 ± 0.09 | 86.80 | 90.52 |
| HAN | | | | 97.71 ± 0.07 | 97.76 ± 0.07 | 97.84 ± 0.13 | 97.88 ± 0.13 | 89.95 | 93.22 |
| MP2V | 46.8 | 20.4 | 28.4 | 97.2 | 97.3 | 97.5 | 97.5 | 89.4 | 93.3 |
| ASNE | 38.2 | 17.1 | 23.6 | 96.5 | 96.7 | 96.9 | 97 | 85.4 | 89.8 |
| SHNE | 55.2 | 23.3 | 32.7 | 93.9 | 94 | 93.9 | 94.1 | 77.6 | 81.3 |
| GSAGE | 51.2 | 22.4 | 31.2 | 97.8 | 97.8 | 97.9 | 98 | 91.4 | 94.5 |
| GAT | 51.8 | 22.7 | 31.6 | 96.2 | 96.3 | 96.5 | 96.5 | 84.5 | 88.2 |
| HetGNN | 60.6 | 26.4 | 36.8 | 97.1 | 97.1 | 97.1 | 97.2 | 88.6 | 92.1 |
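The recommendation columns report recall, precision, and F1, and F1 is by definition the harmonic mean of precision and recall. As a worked check, the sketch below recomputes F1 from the mean precision and recall of a few rows copied from the data above. The function and variable names are illustrative, and it assumes the reported F1 was derived from the reported means, which the article excerpt does not state.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (recall, precision, reported F1) copied from the recommendation results.
rows = {
    "MP_1H": (71.15, 31.14, 43.32),
    "MP_2H": (72.08, 31.63, 43.97),
    "MP_4H_C": (71.55, 31.43, 46.67),
    "HetGNN": (60.6, 26.4, 36.8),
}

recomputed = {name: round(f1_from_precision_recall(p, r), 2)
              for name, (r, p, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name}: reported {reported}, recomputed {recomputed[name]}")
```

Under that assumption the MP_1H, MP_2H, and HetGNN rows agree with the reported values to the printed precision, while the MP_4H_C row recomputes to about 43.67 rather than the listed 46.67, so that entry may be worth checking against the published table.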

About this article

Cite this article

Liang, X., Ma, Y., Cheng, G. et al. Meta-path-based heterogeneous graph neural networks in academic network. Int. J. Mach. Learn. & Cyber. 13, 1553–1569 (2022). https://doi.org/10.1007/s13042-021-01465-8

  • DOI: https://doi.org/10.1007/s13042-021-01465-8
