Graph edit distance contest: Results and future challenges
Introduction
Computing a similarity or dissimilarity measure between graphs is a major challenge in pattern recognition. One of the best-known and most widely used approaches to computing a distance between two graphs is the Graph Edit Distance (GED). Computing the GED consists of finding a sequence of graph edit operations (insertions, deletions and substitutions of vertices and edges) that transforms one graph into another at minimal cost. However, computing the GED is NP-hard. Therefore, over the last four decades, several approaches have been proposed to compute approximations in polynomial time [22].
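To make the definition concrete, the following minimal sketch computes an exact GED for very small graphs by enumerating every vertex mapping and keeping the cheapest induced edit path. It is an illustration only, not one of the contest methods. The graph encoding (a list of vertex labels plus a dictionary of undirected edges keyed by sorted index pairs), the unit edit costs, and the names `edit_path_cost` and `exact_ged` are all assumptions made for this example. The enumeration is exponential in the number of vertices, which is consistent with the NP-hardness noted above.

```python
from itertools import permutations


def edit_path_cost(g1, g2, mapping):
    """Cost of the edit path induced by a vertex mapping (unit costs assumed).

    A graph is a pair (labels, edges): labels is a list of vertex labels and
    edges maps a sorted vertex-index pair (i, j) to an edge label.
    mapping[u] is the vertex of g2 assigned to vertex u of g1, or None if
    u is deleted.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    cost = 0
    for u, v in enumerate(mapping):
        if v is None:
            cost += 1  # vertex deletion
        elif labels1[u] != labels2[v]:
            cost += 1  # vertex substitution with a different label
    # Vertices of g2 that received no vertex of g1 must be inserted.
    cost += len(labels2) - sum(v is not None for v in mapping)

    matched = set()
    for (a, b), label in edges1.items():
        ia, ib = mapping[a], mapping[b]
        key = None if ia is None or ib is None else (min(ia, ib), max(ia, ib))
        if key in edges2:
            matched.add(key)
            cost += label != edges2[key]  # edge substitution
        else:
            cost += 1  # edge deletion
    # Edges of g2 that are not the image of an edge of g1 must be inserted.
    cost += len(edges2) - len(matched)
    return cost


def exact_ged(g1, g2):
    """Exhaustive search over all vertex mappings; only viable for tiny graphs."""
    n1, n2 = len(g1[0]), len(g2[0])
    # Each vertex of g1 maps to a distinct vertex of g2 or is deleted (None).
    targets = list(range(n2)) + [None] * n1
    return min(edit_path_cost(g1, g2, m) for m in permutations(targets, n1))


# Hypothetical toy graphs: a labelled path on three vertices and a single edge.
g1 = (["C", "C", "O"], {(0, 1): "s", (1, 2): "s"})
g2 = (["C", "O"], {(0, 1): "d"})
print(exact_ged(g1, g2))  # 3: one vertex deletion, one edge deletion, one edge substitution
```

With non-unit or attribute-dependent costs, only the increments inside `edit_path_cost` would change; the search itself stays the same.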
This paper reports the results of the Graph Distance Contest (GDC), which was organized in the context of ICPR 2016. The aim of the contest was to assess the performance and effectiveness of recent methods that compute an exact or an approximate GED. The quality of the output distances and the execution times of the methods served as the evaluation criteria. Seven datasets were included, each composed of several types of graphs with symbolic or numerical attributes attached to vertices and edges.
GDC was open to any method that computes a sequence of edit operations transforming one graph into another. All participants were required to download the datasets in order to prepare the submission of their programs. All programs were executed by the organizers on the same computer. The contest imposed two constraints. First, each submitted method could not exceed 30 s per graph comparison. Second, for parallel methods, the number of threads was limited to 4.
This paper is organized as follows: Section 2 describes the methods submitted to GDC. Section 3 then specifies the protocol and the datasets used for this contest. The results are presented and discussed in Section 4. Note that a complementary and exhaustive presentation of the results is provided on the GDC website http://gdc2016.greyc.fr. Finally, Section 5 highlights the bottlenecks of both the tested methods and the GED performance evaluation metrics, and proposes some possible directions for overcoming these bottlenecks.
Inspected methods
Eight methods proposed by three different research groups were submitted. The beam search algorithm of Neuhaus et al. [21] was also added to the list of inspected methods. All these methods can be broadly divided into three categories.
Protocol and datasets
Our evaluation was conducted on a machine with four Quad-Core AMD Opteron 8350 processors clocked at 2.0 GHz and 16 GB of memory. The maximum number of threads was limited to 4 (i.e., for F24threads and PDFS). The time constraint used in GDC was fixed to 30 s. That is, methods that needed more than 30 s were stopped, and the best answer found so far was returned. Note that F2, F24threads, DF and PDFS are exact algorithms in the absence of time constraints.
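The "stop after 30 s and return the best answer found so far" rule can be sketched as an anytime search loop. The snippet below is only an illustration of that idea under stated assumptions: `anytime_minimum`, its parameters, and the way candidates are produced are hypothetical, and the actual contest harness is not described in this text. An exact method run this way returns the optimum only if it finishes enumerating before the deadline.

```python
import time


def anytime_minimum(candidates, cost, time_limit_s=30.0):
    """Return the cheapest candidate seen before the time limit expires.

    candidates: iterable of candidate solutions (e.g. vertex mappings).
    cost: function returning the cost of a candidate.
    Returns (best_cost, best_candidate, completed). completed is True only
    if every candidate was examined without reaching the deadline.
    """
    deadline = time.monotonic() + time_limit_s
    best_cost, best = float("inf"), None
    completed = True
    for candidate in candidates:
        c = cost(candidate)
        if c < best_cost:
            best_cost, best = c, candidate
        # The deadline is checked after each evaluation, so at least one
        # answer is always available when the search is cut off.
        if time.monotonic() >= deadline:
            completed = False  # conservatively reported as interrupted
            break
    return best_cost, best, completed


# Toy usage: the candidates are plain numbers and the cost is the identity.
print(anytime_minimum([5, 3, 1], lambda x: x, time_limit_s=30.0))  # (1, 1, True)
print(anytime_minimum([5, 3, 1], lambda x: x, time_limit_s=0.0))   # (5, 5, False)
```

The `completed` flag makes explicit whether a reported distance is provably optimal or only an upper bound obtained under the time limit.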
Results and discussion
In this section, we present and analyze the results obtained by the submitted methods on all datasets. Table 3 summarizes the GED methods that were included in GDC.
For the sake of clarity, we synthesize our different conclusions via figures. For exhaustive and numerical results, we refer the interested reader to the contest website: http://gdc2016.greyc.fr.
Challenges in graph edit distance
In this section, we present and discuss perspectives for the evaluated methods and the GED challenges expected in the near future. Regarding the three inspected GED categories, the branch-and-bound based algorithms are highly dependent on the lower bound. These algorithms could be improved by proposing other promising lower bounds with the help of machine learning techniques. On the other hand, the current implementation of the assignment-based algorithms cannot handle graphs with numeric attributes.
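To illustrate why the lower bound matters in branch-and-bound, here is a minimal sketch of a simple label-counting bound together with the usual pruning test. This is not the bound used by any of the contest methods. It assumes unit edit costs and symbolic labels, and the names `label_lower_bound` and `can_prune` are hypothetical. The bound holds because at most as many vertices (or edges) can be substituted at zero cost as there are labels shared by both graphs, while at least max(n1, n2) vertex operations and max(m1, m2) edge operations are needed in total.

```python
from collections import Counter


def label_lower_bound(vertex_labels1, edge_labels1, vertex_labels2, edge_labels2):
    """Lower bound on the GED under unit costs, based on label multisets only."""

    def part(labels_a, labels_b):
        # Size of the multiset intersection: labels that could match for free.
        shared = sum((Counter(labels_a) & Counter(labels_b)).values())
        return max(len(labels_a), len(labels_b)) - shared

    return part(vertex_labels1, vertex_labels2) + part(edge_labels1, edge_labels2)


def can_prune(cost_so_far, remaining_lower_bound, best_upper_bound):
    """A partial solution is discarded if it cannot beat the best known one."""
    return cost_so_far + remaining_lower_bound >= best_upper_bound


# Hypothetical toy graphs described only by their label multisets.
bound = label_lower_bound(["C", "C", "O"], ["s", "s"], ["C", "O"], ["d"])
print(bound)  # 3
print(can_prune(cost_so_far=1, remaining_lower_bound=bound, best_upper_bound=4))  # True
```

A tighter bound makes `can_prune` succeed earlier and more often, which shrinks the explored search tree; this is the dependency that learned lower bounds would aim to exploit.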
References (29)
- et al. Graph edit distance as a quadratic assignment problem. Pattern Recognit. Lett. (2017)
- et al. New binary linear programming formulation to compute the graph edit distance. Pattern Recognit. (2017)
- et al. Approximate graph edit distance computation by means of bipartite graph matching. Image Vision Comput. (2009)
- Fast computation of bipartite graph matching. Pattern Recognit. Lett. (2014)
- et al. Graph edit distance: moving from global to local structure to solve the graph-matching problem. Pattern Recognit. Lett. (2015)
- et al. A graph database repository and performance evaluation metrics for graph edit distance. Graph-Based Representations in Pattern Recognition (2015)
- et al. An exact graph edit distance algorithm for solving pattern recognition problems. ICPRAM (2015)
- et al. A Parallel Graph Edit Distance Algorithm. Technical Report hal-01476393 (2017)
- et al. A Hungarian algorithm for error-correcting graph matching. IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition (2017)
- et al. Graph edit distance as a quadratic program. International Conference on Pattern Recognition (2016)
- Approximate graph edit distance computation combining bipartite matching and exact neighborhood substructure distance. Graph-Based Representations in Pattern Recognition
- Active-learning query strategies applied to select a graph node given a graph labelling. IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition
- Learning graph matching substitution weights based on the ground truth node correspondence. Int. J. Pattern Recognit. Artif. Intell.
- Approximate graph edit distance guided by bipartite matching of bags of walks. Structural, Syntactic, and Statistical Pattern Recognition