
Knowledge-Based Systems

Volume 194, 22 April 2020, 105578

Graph convolutional networks with multi-level coarsening for graph classification

https://doi.org/10.1016/j.knosys.2020.105578

Abstract

Graph convolutional networks (GCNs) have attracted increasing attention in recent years. Many important tasks in graph analysis involve graph classification, which aims to map a graph to a certain category. However, as the number of convolutional layers increases, most existing GCNs suffer from over-smoothing, which makes it difficult to extract the hierarchical information and global patterns of graphs when learning their representations. In this paper, we propose a multi-level coarsening based GCN (MLC-GCN) for graph classification. Specifically, from the perspective of graph analysis, we develop new insights into the convolutional architectures used for image classification. Inspired by this, we present the two-stage MLC-GCN architecture: we first introduce an adaptive structural coarsening module to produce a series of coarsened graphs and then construct the convolutional network on these graphs. In contrast to existing GCNs, MLC-GCN has the advantage of learning graph representations at multiple levels while preserving both the local and global information of graphs. Experimental results on multiple benchmark datasets demonstrate that the proposed MLC-GCN method is competitive with state-of-the-art graph classification methods.

Introduction

Graphs are a non-Euclidean data structure for characterizing a set of objects (i.e., nodes) and their relations (i.e., edges) [1]. In practice, graphs with irregular structures naturally occur in a wide diversity of scenarios, ranging from social networks [2], [3] and knowledge networks [4], [5] to protein networks [6], [7]. Many real-world applications involve the analysis of graphs, such as graph classification, node classification, node recommendation, link prediction, and node visualization. In particular, graph classification, which aims to predict the categories of graphs, has attracted considerable research interest in diverse fields. For example, chemists can determine the toxicity or solubility of a molecule by classifying molecular networks [8]. Biologists can find out which proteins can be used for treating diseases by classifying protein networks [9]. Computer scientists can locate bugs in code by identifying critical subgraphs in program flow graphs [10]. Web inspectors can identify anomalies in a single snapshot by comparing it with previous snapshots while monitoring the evolution of the web graph [11]. From this perspective, it is necessary to develop effective methods for graph classification tasks. On the other hand, the complexity of various graphs brings tremendous challenges to handling graphs for classification.

Up to now, many methods have been proposed for graph classification. Earlier works mainly focus on kernel methods, such as the graphlet kernel [12], the shortest-path kernel [13] and the Weisfeiler–Lehman subtree kernel [14]. They calculate the similarity among graphs in advance using graph properties (e.g., walks, graphlets), and then apply supervised algorithms such as support vector machines to classify graphs into different categories. However, this construction of graph features is inflexible and computationally expensive. Furthermore, a single graph kernel is usually insufficient for characterizing intricate graphs. Another critical problem of kernel methods is that feature extraction and classification are separated: graph representation learning cannot be guided by the graph label information, which plays an important role in graph classification. Inspired by the success of the convolution operation on images, Kipf et al. [15] proposed the classical Graph Convolutional Network (GCN) architecture, which uses a layer-wise propagation rule based on a localized first-order approximation of spectral graph convolutions to characterize the graph topology around each node. By optimizing the cross-entropy loss for semi-supervised node classification, node representations can be learned. Since then, many GCN architectures have been proposed for graph classification, such as DCNN [16], PATCHY-SAN [17] and DGCNN [18]. They seek effective graph neural networks for graph classification by learning high-quality graph representations that not only encode the different topological structures in each graph but also preserve detailed node features such as position, direction and connectivity. When used for graph classification, GCNs follow a graph-based "message passing" paradigm, in which feature information is aggregated from local neighbor nodes and gradually propagated to neighboring nodes [8]. Suitable graph representations can therefore be learned directly, in an end-to-end manner, for the ultimate goal of classification.
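The layer-wise propagation rule of [15] is compact enough to sketch in a few lines of NumPy. The following is a minimal illustration on a toy graph with fixed (untrained) weights, not the implementation evaluated in this paper:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)        # ReLU activation

# Toy example: a path graph on 3 nodes with 2-dimensional node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
W = np.eye(2)                                     # identity weights, for illustration only
H1 = gcn_layer(A, H, W)
print(H1.shape)  # (3, 2)
```

Each application of such a layer mixes a node's features with those of its one-hop neighbors, so stacking k layers grows the receptive field to k hops; this is the mechanism behind both message passing and, as discussed next, over-smoothing.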

Although they have achieved certain improvements, existing graph convolutional networks still face many challenges on graph classification tasks. (1) Over-smoothing: It has been proved in [19] that the convolution operation of the GCN model is a special form of Laplacian smoothing, which makes the features of nodes in the same class similar. As the number of convolutional layers increases, the features of nodes become over-smoothed and indistinguishable, and may even converge to constant values. What is more, the shallow architecture of the GCN model prevents nodes from expanding their receptive field and capturing more global structural patterns. (2) Hierarchy: Many existing GCNs for graph classification, such as DCNN [16], use global pooling to generate graph representations, disregarding the complex structural characteristics and hierarchies of graphs. (3) Interpretability: Many methods, like PATCHY-SAN [17] and DGCNN [18], impose an ordering on nodes to learn graph representations, which ignores the structural similarities among nodes and lacks interpretability.
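The over-smoothing behavior in challenge (1) is easy to reproduce numerically: repeatedly applying the normalized propagation operator (without weights or nonlinearities) to random features on a connected, non-bipartite graph collapses all node features onto a single direction. This is an illustrative sketch of the general phenomenon, not an analysis taken from the paper:

```python
import numpy as np

# Connected, non-bipartite toy graph: a triangle (0-1-2) plus a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                        # self-loops, as in GCN
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
P = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # propagation operator

H = np.random.RandomState(0).randn(4, 3)     # random node features
for _ in range(100):                         # "100 layers" of pure smoothing
    H = P @ H

# Every node's feature vector now points in almost exactly the same direction,
# so nodes are indistinguishable up to a degree-dependent scale.
rows = H / np.linalg.norm(H, axis=1, keepdims=True)
cos = rows @ rows.T                          # pairwise cosine similarities
print(bool(np.all(cos > 0.999)))  # True
```

A deep stack of real GCN layers adds weights and nonlinearities on top of this operator, but the smoothing component remains, which motivates expanding the receptive field by other means.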

To address these challenges, we present a multi-level coarsening based graph convolutional network, a two-stage architecture for graph classification that extracts the global and hierarchical information of graphs. The architecture first applies the proposed adaptive structural coarsening module to produce a series of coarsened graphs at different levels and then constructs the convolutional network on these graphs. In the coarsening module, second-order adjacent nodes in the graph are merged if they share an identical neighbor set. For adjacent nodes, a mutual transition probability index is introduced to describe the first-order similarities among them and to quantitatively evaluate which neighboring nodes should be merged with the central node first. By merging multiple nodes at the previous level, a supernode at the next level can represent a community or a subgraph of the previous level. Given a graph, second-order collapsing and first-order collapsing are performed gradually to generate the coarsened graphs at different levels. Meanwhile, the node corresponding matrix, which indicates the relationships between nodes in coarsened graphs at neighboring levels, is also obtained. After obtaining the coarsened graphs at different levels, we construct the convolutional network as follows. (1) Graph convolutional layers: By performing the convolution operation on graphs at different levels, the MLC-GCN model gradually extracts both local and global information. (2) Feature aggregation: According to the node corresponding matrix, we aggregate the features of the previous level to generate the input features of the next level. Aggregating features of first-order neighbors at the next level is equivalent to aggregating features of high-order neighbors at the previous level. MLC-GCN therefore expands the receptive field through multiple coarsening levels instead of stacking more graph convolutional layers, avoiding over-smoothing. (3) Fully connected layers: To classify graphs in an end-to-end manner, we use fully connected layers to transform the graph representation vectors and make predictions.
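As an illustration of the second-order collapsing rule described above, the sketch below merges nodes that share an identical neighbor set and records a node corresponding matrix. The paper's full adaptive module additionally performs first-order collapsing guided by the mutual transition probability index, which is omitted here; the function and variable names are our own:

```python
import numpy as np
from collections import defaultdict

def second_order_collapse(A):
    """Merge nodes with identical neighbor sets (such nodes are non-adjacent
    and second-order adjacent). Returns the coarsened adjacency matrix and
    the node corresponding matrix S, where S[v, j] = 1 iff node v is merged
    into supernode j."""
    n = A.shape[0]
    groups = defaultdict(list)
    for v in range(n):
        groups[frozenset(np.flatnonzero(A[v]))].append(v)
    clusters = list(groups.values())
    S = np.zeros((n, len(clusters)))
    for j, members in enumerate(clusters):
        S[members, j] = 1.0
    A_coarse = (S.T @ A @ S > 0).astype(float)  # supernodes linked if any members are
    np.fill_diagonal(A_coarse, 0)
    return A_coarse, S

# A 4-cycle: nodes 0 and 2 share neighbors {1, 3}, and nodes 1 and 3 share
# neighbors {0, 2}, so the cycle collapses to two connected supernodes.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
A_coarse, S = second_order_collapse(A)
print(A_coarse.shape)  # (2, 2)
```

With S in hand, the feature aggregation in step (2) amounts to pooling member features into each supernode, e.g. `H_coarse = S.T @ H` (or a degree-normalized variant), so that one convolution at the coarse level mixes information from high-order neighbors of the original graph.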

To summarize, the main contributions of this paper are highlighted as follows:

  • We propose a novel multi-level coarsening based graph convolutional network (MLC-GCN) architecture for graph classification, a two-stage architecture trained in an end-to-end manner.

  • In order to extract hierarchical and global information of graphs, we introduce an adaptive structural coarsening module that generates coarsened graphs at different levels; it follows two simple node-collapsing rules and is more interpretable than previous works that impose an ordering on nodes.

  • Experiments on eight benchmark graph datasets demonstrate that our proposed model outperforms state-of-the-art methods on graph classification tasks in most cases. We also illustrate the effectiveness of our method through feature visualization.

The remainder of this paper is organized as follows. Section 2 reviews related works on graph classification and graph coarsening. Section 3 expounds on the proposed method in detail. Next, we outline the experimental settings and present the results in Section 4. Finally, Section 5 concludes this paper and mentions future work.


Related works

In this section, we first briefly review graph kernel methods and graph neural networks for graph classification, and then survey existing graph coarsening techniques.

Methodology

In this section, we first list the notations used in this paper and formally define the problem. Then we introduce the proposed MLC-GCN model in detail. Finally, an adaptive structural coarsening module is presented.

Experiments

In this section, we evaluate the effectiveness of the proposed MLC-GCN model on graph classification tasks. MLC-GCN is compared with several state-of-the-art graph kernel methods and graph neural network methods on various benchmark datasets. In addition, we perform feature visualization to illustrate the effectiveness of our method.

Conclusions

In this paper, we propose the MLC-GCN method for graph classification in a straightforward manner. Specifically, an adaptive structural coarsening module in the method is carefully designed to generate coarsened graphs and the obtained graphs at different levels characterize the hierarchical structure of the original graph. Furthermore, the node corresponding matrix can be computed based on the second-order and first-order similarities between nodes. Then the adaptive coarsening operator can be

CRediT authorship contribution statement

Yu Xie: Conceptualization, Formal analysis, Investigation, Methodology, Software, Writing - original draft. Chuanyu Yao: Conceptualization, Formal analysis, Investigation, Methodology, Software, Writing - original draft. Maoguo Gong: Conceptualization, Funding acquisition, Project administration, Resources, Methodology, Supervision. Cheng Chen: Data curation, Validation. A.K. Qin: Writing - review & editing, Supervision.

Acknowledgments

The authors wish to thank the editors and anonymous reviewers for their valuable comments and helpful suggestions which greatly improved the paper’s quality. This work was supported by the National Key Research and Development Program of China with Grant no. 2017YFB0802200, and the Australian Research Council with Grant no. LP170100416 and LP180100114.

References (41)

  • H. Cheng, D. Lo, Y. Zhou, X. Wang, X. Yan, Identifying bug signatures using discriminative graph mining, in:...
  • P. Papadimitriou et al., Web graph similarity for anomaly detection, J. Internet Serv. Appl. (2010)
  • N. Shervashidze, S. Vishwanathan, T. Petri, K. Mehlhorn, K. Borgwardt, Efficient graphlet kernels for large graph...
  • K.M. Borgwardt, H.P. Kriegel, Shortest-path kernels on graphs, in: Proceedings of the 6th IEEE International Conference...
  • N. Shervashidze et al., Weisfeiler–Lehman graph kernels, J. Mach. Learn. Res. (2011)
  • T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: Proceedings of the 5th...
  • J. Atwood et al., Diffusion-convolutional neural networks
  • M. Niepert, M. Ahmed, K. Kutzkov, Learning convolutional neural networks for graphs, in: Proceedings of the 33rd...
  • M. Zhang, Z. Cui, M. Neumann, Y. Chen, An end-to-end deep learning architecture for graph classification, in:...
  • Q. Li, Z. Han, X. Wu, Deeper insights into graph convolutional networks for semi-supervised learning, in: Proceedings...

    No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2020.105578.
