Elsevier

Pattern Recognition

Volume 44, Issue 9, September 2011, Pages 1928-1940
Improving vector space embedding of graphs through feature selection algorithms

https://doi.org/10.1016/j.patcog.2010.05.016

Abstract

Graph-based pattern representation offers a versatile alternative to vectorial data structures, and a growing interest in graphs can therefore be observed in various fields. However, a serious limitation in the use of graphs is the lack of elementary mathematical operations in the graph domain that many pattern recognition algorithms require. To overcome this limitation, the present paper proposes an embedding of a given graph population in a vector space $\mathbb{R}^n$. The key idea of this embedding approach is to interpret the distances of a graph g to a number of prototype graphs as numerical features of g. In previous works, the prototypes were selected beforehand with heuristic selection algorithms. In the present paper we take a more fundamental approach and regard the problem of prototype selection as a feature selection or dimensionality reduction problem, for which many methods are available. With several experiments we show the feasibility of graph embedding based on prototypes obtained from such feature selection algorithms and demonstrate their potential to outperform previous approaches.

Introduction

After decades of focusing on independent and identically distributed representation formalisms in pattern recognition, increasing effort is now being devoted in various research fields to graph-based representation [1]. Object representation by means of graphs is advantageous compared to vectorial approaches for two reasons. First, graphs do not suffer from the constraint of fixed dimensionality. That is, the number of nodes and edges in a graph is not limited a priori and depends on the size and the complexity of the actual object to be modeled. Second, graphs are able to represent not only the values of object properties, i.e. features, but can also be used to explicitly model relations that exist between different parts of an object.

Due to their ability to represent properties of entities and binary relations at the same time, graphs have found widespread applications in science and engineering. They are used, for instance, in bioinformatics and chemoinformatics [2], [3], [4], web content and data mining [5], [6], [7], classifying images from various fields [8], [9], [10], symbol and character recognition [11], [12], [13], and in computer network analysis [14], [15].

However, one drawback of graphs, when compared to feature vectors, is the significantly increased complexity of many algorithms. For example, the comparison of two feature vectors for identity can be accomplished in linear time with respect to the length of the two vectors. For the analogous operation on general graphs, i.e. testing two graphs for isomorphism, no polynomial-time algorithm is known. Another serious limitation in the use of graphs for pattern recognition tasks is the lack of mathematical structure in the domain of graphs. For example, computing the (weighted) sum or the product of a pair of entities (which are elementary operations required in many classification and clustering algorithms) is not possible in the domain of graphs, or is at least not defined in a standardized way. Due to these general problems in the graph domain, we observe a lack of algorithmic tools for graph-based pattern recognition.

The present paper's objective is to benefit from both the universality of graphs for pattern representation and the computational convenience of vectors for pattern recognition. To this end we propose a general procedure for mapping graphs from arbitrary graph domains to a real vector space by means of functions $\varphi: \mathcal{G} \rightarrow \mathbb{R}^n$. Based on the resulting graph maps, the considered pattern recognition task is eventually carried out. Hence, the whole arsenal of algorithmic tools readily available for vectorial data can be applied to graphs (more precisely, to graph maps $\varphi(g) \in \mathbb{R}^n$).

The presented approach for graph embedding is primarily based on the idea proposed in [16], where the dissimilarity representation for pattern recognition in conjunction with feature vectors was first introduced. In the current work we go one step further and generalize the methods described in [16] to the domain of graphs. The key idea of the novel graph embedding approach is to use the distances of an input graph g to n prototype graphs $P = \{p_1, \ldots, p_n\}$ as a vectorial description of g. Clearly, the definition of the prototype set P is a critical issue since the graphs in P determine the resulting vectors. Thus, a good selection of prototypes is crucial for the success of the algorithm applied in the embedding space. Commonly, the prototypes are selected from a training set T of existing graphs before the embedding is carried out. In previous works, this prototype selection used heuristics based on distances between the members of T [17]. In the present paper, however, a new approach is proposed where all available elements from the training set are used as prototypes in a first step, i.e. for our embedding we define $P = T$. Subsequently, feature subset selection algorithms are applied to the vector space embedded graphs. In other words, rather than selecting the prototypes beforehand, the embedding is carried out first and then the problem of prototype selection is reduced to a feature selection problem. Thus, by means of this more fundamental approach, we bypass the difficult problem of selecting adequate prototypes.
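The embedding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `embed` and `embed_all` are hypothetical, and `toy_distance` is an assumed stand-in dissimilarity (graphs are reduced to edge sets and compared by the size of their symmetric difference). Any graph dissimilarity measure could be passed in its place.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Assumption for illustration only: a graph is reduced to a frozenset of edges.
EdgeSet = FrozenSet[Tuple[int, int]]


def embed(g, prototypes: Sequence, dist: Callable) -> List[float]:
    """Map one graph g to the vector (d(g, p_1), ..., d(g, p_n))."""
    return [dist(g, p) for p in prototypes]


def embed_all(graphs: Sequence, prototypes: Sequence, dist: Callable) -> List[List[float]]:
    """Embed every graph; row i is the vector for graphs[i]."""
    return [embed(g, prototypes, dist) for g in graphs]


def toy_distance(g1: EdgeSet, g2: EdgeSet) -> float:
    """Hypothetical dissimilarity: number of edges present in exactly one graph."""
    return float(len(g1 ^ g2))


# Toy training set T of three small graphs.
training = [
    frozenset({(0, 1)}),
    frozenset({(0, 1), (1, 2)}),
    frozenset({(0, 1), (1, 2), (2, 0)}),
]

# First step of the approach: use all training graphs as prototypes, P = T.
prototypes = training
X = embed_all(training, prototypes, toy_distance)
```

With P = T, each column of `X` corresponds to one training graph acting as a prototype, so choosing a subset of columns is the same as choosing a subset of prototypes.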

A preliminary version of the current paper appeared in [18]. The current paper has been significantly extended with respect to the underlying methodology and the experimental evaluation. First, two additional feature selection algorithms are applied to vector space embedded graphs. In addition, and for the sake of completeness, the results of two dimensionality reduction algorithms (originally presented in [19], [20]) are also included in the experimental evaluation. The number of data sets on which our embedding procedure is tested is also considerably increased, and results of an additional reference system, a similarity kernel, are added. Finally, a detailed discussion of the validation of the meta parameters of our approach, together with the corresponding results, is provided.

The remainder of this paper is organized as follows. In the next section we define basic concepts and introduce our notation. Then, in Section 3, the proposed approach for graph embedding in real vector spaces is described. The feature selection algorithms applied to the vector space embedded graphs are described in Section 4. In Section 5 we report a number of experiments and present results achieved with our embedding method. Finally, in Section 6 we summarize our work and draw conclusions.

Basic terminology

Depending on the considered application, various definitions for graphs can be found in the literature. The following well-established definition is sufficiently flexible for a large variety of tasks.

Definition 1 Graph

Let $L_V$ and $L_E$ be finite or infinite label sets for nodes and edges, respectively. A graph g is a four-tuple $g = (V, E, \mu, \nu)$, where V is the finite set of nodes, $E \subseteq V \times V$ is the set of edges, $\mu: V \rightarrow L_V$ is the node labeling function, and $\nu: E \rightarrow L_E$ is the edge labeling function.
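As a rough illustration of Definition 1, the four-tuple can be written as a small Python data structure. The class name `Graph` and its method names are assumptions made for this sketch; labels are left untyped because the definition allows arbitrary label sets.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Hashable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


@dataclass
class Graph:
    """Labeled graph g = (V, E, mu, nu); names are illustrative only."""
    nodes: Set[Node] = field(default_factory=set)              # V, finite node set
    edges: Set[Edge] = field(default_factory=set)              # E, subset of V x V
    node_labels: Dict[Node, Any] = field(default_factory=dict)  # mu: V -> L_V
    edge_labels: Dict[Edge, Any] = field(default_factory=dict)  # nu: E -> L_E

    def add_node(self, v: Node, label: Any) -> None:
        self.nodes.add(v)
        self.node_labels[v] = label

    def add_edge(self, u: Node, v: Node, label: Any) -> None:
        # E must be a subset of V x V, so both endpoints have to exist already.
        if u not in self.nodes or v not in self.nodes:
            raise ValueError("both endpoints must be nodes of the graph")
        self.edges.add((u, v))
        self.edge_labels[(u, v)] = label


# Toy example: two labeled nodes joined by one labeled, directed edge.
g = Graph()
g.add_node(0, "C")
g.add_node(1, "O")
g.add_edge(0, 1, "double")
```

Edges are stored as ordered pairs, which matches the definition of E as a subset of V × V; an undirected graph could be modeled in this sketch by adding both (u, v) and (v, u).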

The number of nodes of a graph g is

General embedding procedure and properties

The idea of our graph embedding framework stems from the seminal work done by Duin and Pekalska [16] where dissimilarities for pattern representation are used for the first time. Later this method was extended so as to map string representations into vector spaces [28]. In the current work we go one step further and generalize and substantially extend the methods described in [16], [28] to the domain of graphs. The key idea of this approach is to use the distances of an input graph to a number

Feature selection algorithms

Feature subset selection aims at selecting a suitable subset of features such that the performance of a given algorithm is improved [32], [33]. By means of forward selection search strategies, the search starts with an empty set and iteratively adds useful features to this set. Conversely, backward elimination refers to the process of iteratively removing useless features starting with the full set of features. Also floating search methods are available, where alternately useful features are
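The forward selection strategy mentioned above can be sketched as a greedy loop. This is a minimal sketch under stated assumptions, not one of the specific algorithms evaluated in the paper: the selection criterion used here (leave-one-out 1-nearest-neighbor accuracy) and the names `forward_selection` and `loo_1nn_accuracy` are chosen purely for illustration.

```python
from typing import Callable, List, Sequence


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                     features: Sequence[int]) -> float:
    """Leave-one-out 1-NN accuracy using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if i == j:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += int(y[best_j] == y[i])
    return correct / len(X)


def forward_selection(n_features: int,
                      score: Callable[[List[int]], float],
                      max_features: int) -> List[int]:
    """Start with an empty set and greedily add the feature that helps most."""
    selected: List[int] = []
    remaining = list(range(n_features))
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Score every candidate extension; ties go to the lower feature index.
        top_score, top_f = max(
            ((score(selected + [f]), f) for f in remaining),
            key=lambda t: (t[0], -t[1]),
        )
        if top_score <= best_score:
            break  # no remaining feature improves the criterion
        selected.append(top_f)
        remaining.remove(top_f)
        best_score = top_score
    return selected


# Toy data: feature 0 separates the two classes, feature 1 is misleading.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.9], [1.1, 1.1]]
y = [0, 0, 1, 1]
selected = forward_selection(2, lambda fs: loo_1nn_accuracy(X, y, fs), max_features=2)
```

In the embedding setting each feature index corresponds to one prototype graph, so the returned indices identify the prototypes that are kept. Backward elimination would follow the same pattern, starting from the full index set and removing one feature per step.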

Experimental evaluation

The purpose of the experimental evaluation described in this section is to empirically verify the power and applicability of the proposed graph embedding framework. To this end, several classification tasks are carried out using vector space embedded graphs.

Conclusions and future work

For objects given in terms of feature vectors a rich repository of algorithmic tools for classification has been developed over the past decades. Graphs are a versatile alternative to feature vectors, and are known to be a powerful and flexible representation formalism. The representational power of graphs is due to their ability to represent not only feature values but also relationships among different parts of an object, and their flexibility comes from the fact that there are no size or

Acknowledgement

This work has been supported by the Swiss National Science Foundation (Project 200021-113198/1).

References (52)

  • A. Schenker et al., Graph-Theoretic Techniques for Web Content Mining (2005)
  • A. Schenker et al., Classification of web documents using graph matching, International Journal of Pattern Recognition and Artificial Intelligence (2004)
  • D. Cook, L. Holder (Eds.), Mining Graph Data, Wiley-Interscience,...
  • Z. Harchaoui et al., Image classification with segmentation graph kernels
  • B. Luo et al., Spectral embedding of graphs, Pattern Recognition (2003)
  • R. Ambauen et al., Graph edit distance with node splitting and merging and its application to diatom identification
  • J. Lladós et al., Graph matching versus graph parsing in graphics recognition, International Journal of Pattern Recognition and Artificial Intelligence (2004)
  • J. Rocha et al., A shape analysis model with applications to a character recognition system, IEEE Transactions on Pattern Analysis and Machine Intelligence (1994)
  • H. Bunke, P. Dickinson, M. Kraetzl, W. Wallis, A graph-theoretic approach to enterprise network dynamics, Progress in...
  • P. Dickinson et al., Matching graphs with unique node labels, Pattern Analysis and Applications (2004)
  • E. Pekalska et al., The Dissimilarity Representation for Pattern Recognition: Foundations and Applications (2005)
  • K. Riesen et al., Graph classification based on vector space embedding, International Journal of Pattern Recognition and Artificial Intelligence (2009)
  • K. Riesen et al., Feature ranking algorithms for improving classification of vector space embedded graphs
  • K. Riesen et al., Reducing the dimensionality of vector space embeddings of graphs
  • K. Riesen et al., Non-linear transformations of vector space embedded graphs
  • A. Sanfeliu et al., A distance measure between attributed relational graphs for pattern recognition, IEEE Transactions on Systems, Man, and Cybernetics (Part B) (1983)

About the Author: HORST BUNKE received his M.S. and Ph.D. degrees in Computer Science from the University of Erlangen, Germany. In 1984, he joined the University of Bern, Switzerland, where he is a professor in the Computer Science Department. He was Department Chairman from 1992 to 1996, Dean of the Faculty of Science from 1997 to 1998, and a member of the Executive Committee of the Faculty of Science from 2001 to 2003. Horst Bunke served as 1st Vice-President of the International Association for Pattern Recognition (IAPR) from 1998 to 2000. In 2000 he was also Acting President of this organization. Horst Bunke is a Fellow of the IAPR, former Editor-in-Charge of the International Journal of Pattern Recognition and Artificial Intelligence, Editor-in-Chief of the journal Electronic Letters of Computer Vision and Image Analysis, Editor-in-Chief of the book series on Machine Perception and Artificial Intelligence by World Scientific Publ. Co., Advisory Editor of Pattern Recognition, Associate Editor of Acta Cybernetica and Frontiers of Computer Science in China, and former Associate Editor of the International Journal of Document Analysis and Recognition, and Pattern Analysis and Applications. Horst Bunke received an honorary doctoral degree from the University of Szeged, Hungary, and held visiting positions at the IBM Los Angeles Scientific Center (1989), the University of Szeged, Hungary (1991), the University of South Florida at Tampa (1991, 1996, 1998–2006), the University of Nevada at Las Vegas (1994), Kagawa University, Takamatsu, Japan (1995), Curtin University, Perth, Australia (1999), and Australian National University, Canberra (2005). He served as a co-chair of the 4th International Conference on Document Analysis and Recognition held in Ulm, Germany, 1997 and as a Track Co-Chair of the 16th and 17th International Conference on Pattern Recognition held in Quebec City, Canada, and Cambridge, UK, in 2002 and 2004, respectively. He was also chairman of the IAPR TC2 Workshop on Syntactic and Structural Pattern Recognition held in Bern 1992, a co-chair of the 7th IAPR Workshop on Document Analysis Systems held in Nelson, NZ, 2006, and a co-chair of the 10th International Workshop on Frontiers in Handwriting Recognition, held in La Baule, France, 2006. Horst Bunke was on the program and organization committee of many other conferences and served as a referee for numerous journals and scientific organizations. He is on the Scientific Advisory Board of the German Research Center for Artificial Intelligence (DFKI). Horst Bunke has more than 550 publications, including 36 authored, co-authored, edited or co-edited books and special editions of journals.

About the Author: KASPAR RIESEN received his M.S. and Ph.D. degrees in Computer Science from the University of Bern, Switzerland, in 2006 and 2009, respectively. Currently he is a researcher and lecture assistant in the research group of Computer Vision and Artificial Intelligence at the University of Bern, Switzerland. His research interests include structural pattern recognition and in particular graph embeddings in real vector spaces. He has more than 30 publications, including six journal papers.