A systematic analysis and guidelines of graph neural networks for practical applications

https://doi.org/10.1016/j.eswa.2021.115466

Highlights

  • We analyze the state-of-the-art methods of graph neural networks (GNN) in four phases.

  • More than 1300 combinations of methods are compared on 13 well-known benchmark datasets.

  • This results in five guidelines for using GNNs on practical graph-related problems.

  • Comparative experiments with more than 3600 runs support the analysis and guidelines.

  • Experimental reproducibility and replicability are verified by comparing with the literature.

Abstract

The graph neural network (GNN) has drawn attention for many problems in social networks and bioinformatics, as graph data proliferate in a wide variety of applications. Despite extensive investigation, it is still difficult to choose the most suitable method for a given problem due to the lack of a thorough analysis of the feasible methods. An anatomical comparison of GNNs would help devise prospective methods for better solutions to real-world problems. In order to give guidelines for making full use of GNNs for graph classification, this paper analyzes the state-of-the-art GNN methods and provides practicable guidelines for applications. The representative methods are described in a systematic scheme of four phases: 1) pre-processing, 2) aggregation, 3) readout, and 4) classification with graph embedding, resulting in a large coverage of more than 1300 methods. The 13 well-known benchmark datasets are categorized into three types with respect to the properties of graph data, such as connectivity. In total, more than 3600 runs are executed to systematically analyze and compare the GNN models while changing only one method per phase. Experimental reproducibility and replicability are also verified by comparing the results with the performance reported in the literature. Finally, five guidelines for choosing an appropriate model are deduced according to graph characteristics such as the complexity of connectivity and node features.

Introduction

With the promising performance of deep learning (LeCun, Bengio, & Hinton, 2015), many studies have been conducted on various data such as images (Tan and Le, 2019, Kim and Cho, 2019a, Xie et al., 2020), text (Vaswani et al., 2017, Yang, 2019, Devlin et al., 2018), and speech (Lim et al., 2019, Sun et al., 2020a, Sun et al., 2020b). In particular, the graph neural network (GNN) (Scarselli et al., 2008, Li et al., 2015, Defferrard et al., 2016, Kipf and Welling, 2016) has gained increasing attention in recent years, since graph data exist in a wide diversity of real-world scenarios, e.g., social/diffusion graphs in social networks (Freeman, 2000), citation graphs in research areas, word co-occurrence networks in linguistics (Cancho and Solé, 2001), knowledge graphs, and chemical structures in biology (Theocharidis, van Dongen, Enright, & Freeman, 2009). Analyzing the interactions between entities as graphs provides insights into how to make good use of the information hidden in graphs and enables researchers to understand the various networks in a systematic manner (Leskovec, Kleinberg, & Faloutsos, 2007). Graph analysis tasks can be broadly abstracted into applications such as node classification (Bhagat et al., 2011, Wang et al., 2017, Chen et al., 2018a, Tian et al., 2019), link prediction (Liben-Nowell and Kleinberg, 2007, Bojchevski and Günnemann, 2017, Wei et al., 2017, Sun et al., 2019), clustering (Ding et al., 2001, Schaeffer, 2007, Nie et al., 2017, Yin et al., 2017), and graph classification (Choi et al., 2017, Ying et al., 2018, Zhang et al., 2018, Hu et al., 2020).

Although analyzing graph data is practical and essential, most conventional methods suffer from high computation and space costs due to the unstructured property of graph data (Cai, Zheng, & Chang, 2018). Graph embedding, which aims to encode nodes, edges, or (sub)graphs into low-dimensional vectors, is one way to model large-scale graph data efficiently. Obtaining such an embedding is useful in the tasks defined above (Goyal & Ferrara, 2018). Moreover, graph embedding represents the structural characteristics of a graph in a low-dimensional space and makes it easier to extract useful information when building classifiers or other predictors (Bengio et al., 2013, Cai et al., 2018).

To quantify a graph property while preserving it in the embedded space, the first-order and second-order proximities are usually adopted (Cai et al., 2018). The first one measures the local pairwise similarity between only the nodes connected by edges, as shown in Eq. (1):

L = \sum_{e_{i,j}} e_{i,j} \, d\big(f(v_i), f(v_j)\big)    (1)

where e_{i,j} is the edge weight between nodes v_i and v_j, d is a function measuring the distance, and f is a mapping from a node to the embedding space. The second one compares the similarity of the nodes' neighborhood structures, i.e., it considers the connectivity of two nodes, as shown in Eq. (2):

L = \sum_{e_{i,j}} e_{i,j} \, p(v_i \mid v_j), \quad p(v_i \mid v_j) = \frac{\exp\big(f(v_i) \cdot f(v_j)\big)}{\sum_k \exp\big(f(v_j) \cdot f(v_k)\big)}    (2)

where p(v_i | v_j) represents the probability that v_i appears among v_j's neighbors. Likewise, when k layers are used, a GNN can be regarded as applying the k-th-order proximity.
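The two proximity losses of Eqs. (1) and (2) can be sketched in a few lines of numpy. This is an illustrative sketch, not the paper's implementation: the function names are ours, d is taken to be squared Euclidean distance, and the second-order loss is written in the negative log-likelihood form commonly optimized in practice.

```python
import numpy as np

def first_order_loss(edges, emb):
    """First-order proximity (Eq. (1)): edge-weighted distance between
    the embeddings of directly connected nodes; d is squared Euclidean
    distance here, an illustrative choice."""
    return sum(w * np.sum((emb[i] - emb[j]) ** 2)
               for (i, j), w in edges.items())

def second_order_loss(edges, emb):
    """Second-order proximity (Eq. (2)): softmax probability that v_i
    appears among v_j's neighbors, accumulated as a negative
    log-likelihood over the edges."""
    loss = 0.0
    for (i, j), w in edges.items():
        logits = emb @ emb[j]                  # f(v_j) . f(v_k) for all k
        p = np.exp(emb[i] @ emb[j]) / np.exp(logits).sum()
        loss += w * -np.log(p)
    return loss
```

Minimizing either loss pulls the embeddings of connected (first-order) or structurally similar (second-order) nodes together in the embedding space.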

According to the conferences on artificial intelligence, including the Association for the Advancement of Artificial Intelligence, the International Conference on Learning Representations, the International Conference on Machine Learning, Neural Information Processing Systems, and the International Joint Conference on Artificial Intelligence, the research on graph embedding or graph representation has increased from 39 papers in 2016 to 275 papers in 2020. However, solving tasks on graph data is challenging in three aspects (Chen, Wang, Wang, & Kuo, 2020). First, choosing the optimal embedding dimension is not easy, since there is a trade-off between resource efficiency and performance (Yan et al., 2006, Yan et al., 2005; Shaw & Jebara, 2009). An embedding vector in a higher-dimensional space tends to store more information of the original graph data at the cost of storage and computation. In contrast, an embedding vector of lower dimension is more resource-efficient, but risks losing some information of the graph. The second issue is choosing the adequate property of the graph to embed, as a graph has several characteristics such as node degree, node attributes, and edge attributes. The last challenge is the sheer number of graph embedding methods: hundreds of methods associated with graph data make it difficult to choose an appropriate technique for the target application. In addition, a systematic comparative analysis of them is rare despite the large endeavor on graph-related studies. Even in the studies that do analyze them (Cai et al., 2018, Goyal and Ferrara, 2018, Chen et al., 2020, Errica et al., 2020, Dwivedi et al., 2020), graph classification was performed with image data rather than actual graph data, or the analysis was limited to categorizing the existing methods rather than a detailed process analysis.
However, it is well known that the systematic analysis on detailed components in the modeling process has the advantage of devising new methods and improving model performance (Chen et al., 2017, Sabour et al., 2017, Ramachandran et al., 2017, Zoph et al., 2018, Kim and Cho, 2019b).

To address this problem, in this paper we focus on the third issue and conduct a study to analyze and compare the differences among the graph-based methods that show state-of-the-art performance. For systematic analysis, we divide the graph embedding process into four phases and sort out all the existing methods accordingly. Besides, more than 3600 experiments are conducted on the 13 well-known benchmark datasets to evaluate the performance in a controlled and uniform manner for fair comparison. Finally, five guidelines are deduced: for example, a method based on filtering, such as the graph Fourier transform, is superior on datasets where connectivity is the focus, while a method based on layer operations or structural changes is better for modeling node features. The main contributions of this paper are as follows:

  • We have reviewed the state-of-the-art graph neural networks (GNNs) in a single framework composed of four phases: graph pre-processing, graph aggregation, readout, and graph classification, and compared all the possible combinations of the different components with systematic and thorough experiments on the well-known benchmark datasets.

  • Several models have been evaluated with more than 3600 runs. To the best of our knowledge, this is a unique work that provides a comprehensive review of the GNNs that have profound potential for a variety of applications. Furthermore, we have attempted to suggest guidelines for applying them to practical problems.

  • The experiments come up with five guidelines with respect to the properties of the problem such as connectivity, node feature and applications.

  • The systematic analysis on detailed components in the modeling process might be helpful to devise appropriate methods with better performance.
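The filtering-based aggregation mentioned in the guidelines can be illustrated with a GCN-style layer, which propagates node features with the symmetrically normalized adjacency matrix as a first-order approximation of graph Fourier filtering (Kipf and Welling, 2016). The sketch below is a minimal numpy rendition under that assumption, not the exact layer used in the experiments.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN-style filtering step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).
    Adding the identity (self-loops) keeps each node's own feature;
    the symmetric normalization averages over neighborhoods."""
    a_hat = adj + np.eye(adj.shape[0])              # A + I
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))   # D^{-1/2} as a vector
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)   # ReLU activation
```

Stacking k such layers mixes information from k-hop neighborhoods, which corresponds to the k-th-order proximity discussed above.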

The paper is organized as follows. In Section 2, we introduce the related works on modeling graph data. The conventional graph embedding methods are disassembled and presented in four phases in Section 3. A fair comparison of 1320 combinations of methods in each phase is analyzed in Section 4. Lastly, we discuss conclusions and future works in Section 5.

Section snippets

Definition and notations

In this section, the notation related to graph embedding is described in detail.

Definition 1

A graph, denoted by G = (V, E), consists of vertices V = {v_1, v_2, ..., v_n} and edges E = {e_{i,j}}, where an edge e_{i,j} connects vertex v_i to vertex v_j.

Graphs are defined as homogeneous graphs and heterogeneous graphs according to the number of types of nodes or types of edges (Cai et al., 2018).

Definition 2

A homogeneous graph is a graph with one node and one edge type. That is, all nodes in G belong to a single type and so are all edges.
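Definitions 1 and 2 can be made concrete with a minimal container; this is a hypothetical illustration (the class and method names are ours), not a structure from the paper.

```python
class Graph:
    """Minimal illustration of Definition 1: G = (V, E)."""

    def __init__(self, n_nodes, node_types=None):
        self.nodes = list(range(n_nodes))              # V = {v_1, ..., v_n}
        self.node_types = node_types or ["node"] * n_nodes
        self.edges = {}                                # E: (i, j) -> edge type

    def add_edge(self, i, j, edge_type="edge"):
        self.edges[(i, j)] = edge_type

    def is_homogeneous(self):
        """Definition 2: a single node type and a single edge type."""
        return (len(set(self.node_types)) <= 1
                and len(set(self.edges.values())) <= 1)
```

A citation network with one node type ("paper") and one edge type ("cites") is homogeneous; a user–item purchase graph with two node types is not.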

Definition 3

A heterogeneous graph is a graph whose nodes or edges belong to more than one type.

Graph neural networks

As shown in Fig. 1, the process of a GNN consists of four phases: A) graph pre-processing, B) graph aggregation, C) readout, and D) graph classification. In the first phase, graph properties from various viewpoints are expressed by creating several subgraphs or extended graphs. The node representations of each subgraph (or the whole graph, or an extended graph) are updated through the aggregation function. The finally updated node representations are combined through the readout function to extract a graph-level representation.
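The four phases can be sketched end to end in numpy. This is a simplified sketch with illustrative choices for each phase (self-loop pre-processing, mean-neighbor aggregation, sum readout, linear softmax classifier); the actual models compared in the paper swap different methods into each phase.

```python
import numpy as np

def gnn_forward(adj, feats, w_agg, w_cls):
    """One pass through the four GNN phases for a single graph.
    adj: (n, n) adjacency matrix; feats: (n, f) node features;
    w_agg: (f, h) aggregation weights; w_cls: (h, c) classifier weights."""
    # A) pre-processing: add self-loops so each node keeps its own feature
    a = adj + np.eye(adj.shape[0])
    # B) aggregation: mean of neighbor features, then a linear map + ReLU
    h = a @ feats / a.sum(axis=1, keepdims=True)
    h = np.maximum(h @ w_agg, 0.0)
    # C) readout: sum node representations into one graph embedding
    g = h.sum(axis=0)
    # D) classification: softmax over class logits
    logits = g @ w_cls
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

Because the readout collapses a variable number of nodes into a fixed-size vector, the same classifier can be applied to graphs of different sizes.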

Dataset

We evaluate the performance of the GNN models with the real-world datasets from bioinformatics and social networks. The datasets are categorized into three types according to the properties of graph data: 1) type 1: only connectivity (i.e., homogeneous graph), 2) type 2: connectivity and node attribute, and 3) type 3: connectivity, node attribute, and edge attribute. Table 2 summarizes the specification of each dataset.

The PTC dataset consists of 344 chemical compounds that report the carcinogenicity for male and female rats.

Conclusions

A systematic analysis of the research on graph neural networks with deep learning has been conducted in this paper. We categorize the GNN process into four phases and various methods in each phase are put together, resulting in 1320 models and 13 well-known benchmark datasets. Moreover, the graph datasets are also categorized into three types according to the property of each dataset. For the possible combinations, two methods for graph pre-processing, ten methods for graph aggregation, three

CRediT authorship contribution statement

Jin-Young Kim: Methodology, Visualization, Software, Writing - original draft. Sung-Bae Cho: Conceptualization, Methodology, Investigation, Validation, Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors thank Tae-Yong Kong and Hyung-Joon Moon for their help in preparing the manuscript. This work was partially supported by an IITP grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (21ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System).

References (119)

  • Y. Bengio et al. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence (2013).
  • S. Bhagat et al.
  • Bianchi, F. M., Grattarola, D., Alippi, C., & Livi, L. (2019). Graph neural networks with convolutional arma filters. ...
  • Bojchevski, A., & Günnemann, S. (2017). Deep gaussian embedding of graphs: Unsupervised inductive learning via ranking. ...
  • K.M. Borgwardt et al. Protein function prediction via graph kernels. Bioinformatics (2005).
  • H. Cai et al. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering (2018).
  • R.F.I. Cancho et al. The small world of human language. Royal Society of London Series B: Biological Sciences (2001).
  • S. Cao et al. Grarep: Learning graph representations with global structural information.
  • Y. Chauvin et al. Backpropagation: Theory, architectures, and applications (1995).
  • L.-C. Chen et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017).
  • Chen, Z., Li, X., & Bruna, J. (2017). Supervised community detection with line graph neural networks. arXiv preprint ...
  • Chen, J., Ma, T., & Xiao, C. (2018a). Fastgcn: fast learning with graph convolutional networks via importance sampling. ...
  • H. Chen et al. Harp: Hierarchical representation learning for networks.
  • Chen, T., Bian, S., & Sun, Y. (2019). Are powerful graph neural nets necessary? A dissection on graph classification. ...
  • F. Chen et al. Graph representation learning: A survey.
  • Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., et al. (2014). Learning phrase ...
  • E. Choi et al. GRAM: graph-based attention model for healthcare representation learning.
  • C. Cortes et al. Support-vector networks. Machine Learning (1995).
  • A.K. Debnath et al. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. Journal of Medicinal Chemistry (1991).
  • M. Defferrard et al. Convolutional neural networks on graphs with fast localized spectral filtering.
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for ...
  • C.H. Ding et al. A min-max cut algorithm for graph partitioning and data clustering.
  • K. Do et al. Graph transformation policy network for chemical reaction prediction.
  • Dwivedi, V. P., Joshi, C. K., Laurent, T., Bengio, Y., & Bresson, X. (2020). Benchmarking graph neural networks. arXiv ...
  • F. Errica et al. A fair comparison of graph neural networks for graph classification.
  • F. Fouss et al. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering (2007).
  • L.C. Freeman. Visualizing social networks. Journal of Social Structure (2000).
  • Gao, H., & Ji, S. (2019). Graph u-nets. arXiv preprint ...
  • Gehring, J., Auli, M., Grangier, D., & Dauphin, Y. N. (2016). A convolutional encoder model for neural machine ...
  • J. Gilmer et al. Neural message passing for quantum chemistry.
  • A. Grover et al. Node2vec: Scalable feature learning for networks. ACM SIGKDD.
  • Z. Guo et al. A deep graph neural network-based mechanism for social recommendations. IEEE Transactions on Industrial Informatics (2020).
  • W. Hamilton et al. Inductive representation learning on large graphs.
  • Hamilton, W. L., Ying, R., & Leskovec, J. (2017b). Representation learning on graphs: Methods and applications. arXiv ...
  • K. He et al. Deep residual learning for image recognition.
  • S. Hochreiter et al. Long short-term memory. Neural Computation (1997).
  • R. Hu et al. Going deep: Graph convolutional laddershape networks.
  • G. Huang et al. Densely connected convolutional networks.
  • W. Huang et al. Adaptive sampling towards fast graph representation learning.
  • J. Jiang et al. Gaussian-induced convolution for graphs.