A main stem concept for image matching

https://doi.org/10.1016/j.patrec.2004.10.005

Abstract

An original “main stem” concept for image matching is presented. The main stem is a global image feature defined as a reduced component tree from which redundant and noise components have been removed. It has been shown that this image feature is strongly invariant to different types of topological transformations and contains useful information about “meaningful” image regions and their interrelations. We show how to construct the main stem and devise a method for image matching that is based on the stems of the compared images. A method for mapping the main stem onto a feature vector and an appropriate metric for comparing feature vectors in the selected representation space are presented. Preliminary experiments show the validity of the proposed method for robust image matching.

Introduction

Image matching is the key task in image database retrieval systems. There are two main phases in image matching: definition of feature space for image representation and definition of an appropriate metric in this feature space. Image transformation into features is called image indexing. The indices (feature vectors) are used to compactly represent image content, then the matching of indices is carried out.

An image histogram is one of the many ways to index images and is used in such systems as the Query By Image Content system (Flickner et al., 1995). However, a histogram is a poor image measure, because the histograms of two images may be very similar even though the images have completely unrelated semantics.
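The weakness of histogram indexing can be illustrated with a minimal sketch. The two 4x4 images below are hypothetical and are not taken from the paper; they are chosen so that their spatial layouts differ completely while their grey-level counts coincide.

```python
from collections import Counter

def grey_histogram(image):
    """Count how often each grey level occurs in a 2-D list of pixel values."""
    return Counter(pixel for row in image for pixel in row)

# Two hypothetical binary images with very different spatial layouts.
left_right_split = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
checkerboard = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]

# The histograms are identical although the images are not.
same_histogram = grey_histogram(left_right_split) == grey_histogram(checkerboard)
same_image = left_right_split == checkerboard
```

Both images contain eight dark and eight bright pixels, so a histogram index cannot tell them apart, whereas a region-based description would see one large bright region in the first image and eight isolated ones (under 4-adjacency) in the second.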

Other image indexing approaches are based on the Fourier, discrete cosine or wavelet transforms, which extract suitable characteristics of images; selected coefficients of these transforms can then be used as image indices (e.g. Wang et al., 1997). Yet this approach has limitations in applications where high degrees of rotation and translation invariance are important. Therefore, an image representation of this kind is not in accord with our visual perception.

A large number of works in image representation adopt region-based approaches. Image regions are the basic building blocks in forming the visual content of an image and as a result they have considerable potential in representing the image content and enabling image matching.

Loncaric and Dhawan (1993) have proposed a shape description method based on the Morphological Signature Transform (MST). However, their method only shows how to use a structuring element to compute the MST of objects; it does not solve the problem of selecting the structuring element. Moreover, vectors of shape parameters may be very useful for shape classification, but not as a basis for shape similarity measures. This is because common shapes need hundreds of parameters to be represented explicitly, and most of these parameters must be defined. For example, a large number of shape detection algorithms work effectively only on images with a relatively uniform background. The majority of shape descriptions adopt such a procedure for image segmentation. Ma and Manjunath (1999) perform image retrieval that is based on segmented image regions. The segmentation procedure is not fully automated, as it requires some parametric tuning and hand pruning of regions.

New image features have been recently introduced by Zhou and Huang (2001). They have proposed structural features for content-based image retrieval, especially edge/structure features extracted from edge maps. The feature vector is extracted by means of the “water-filling algorithm” applied to an edge map of the original image. However, the heuristic assumptions used in this algorithm are the main disadvantage of this approach.

A naive Bayes algorithm to learn image categories from the blob representation in a supervised learning scheme was proposed by Carson et al. (1997). The suggested framework entails learning blob rules per category. It should be noted that each blob is represented by a histogram, thus the representation is discrete in the image plane as well as in feature space. Each query image is then compared to the extracted category models and associated with the closest matching category. In essence, the image matching problem is shifted to a one- or two-blob matching problem.

In our approach we propose an image representation concept termed the “main stem” structure that is based on a component tree proposed by Bertrand et al. (1997). We show that this structure can be used for greyscale image indexing and matching. The “main stem” is a global image feature that is defined as a reduced component tree with redundant and “noise” components removed.

The main advantage of the proposed image representation concept is its invariance to image rotation and translation and also its insensitivity to noise. It is shown that the main stems of similar images correlate well and remain unchanged under certain transformations, such as small changes in the lighting conditions or in the viewing angle of the scene. Additionally, it can contain as much semantically meaningful and important information as is needed.

In our concept we first construct the main stem structure of an image and next generate the appropriate indices. These indices are generated by labelling stem components and building path sets of the stem structure. Images are compared and matched via an appropriate measure of similarity between image indices.

The rest of this paper is organised as follows. In Section 2 some preliminaries are given and in the following sections we will elaborate on image representation using the proposed concept and introduce a similarity measure required for image matching. The preliminary experimental results are presented in Section 5 and the conclusions appear in the last section.


Component tree structure

Let V denote the integer plane Z². A grey-scale image I is defined by a function I(x, y), assuming discrete intensity values 0, 1, …, L − 1 and given on (x, y) ∈ V. Let Γ be an adjacency relation in V such as the 4-adjacency or the 8-adjacency.

To introduce the notion of a component tree we first need to define image level sets. In the threshold decomposition of image I, the associated level set C(I, i) is the set obtained by thresholding the function I(x, y): C(I, i) = {(x, y) : I(x, y) ≥ i}, i = 0, 1, …, L − 1 (e.g. Action and
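The level sets can be made concrete with a minimal sketch. It assumes that the threshold relation is I(x, y) ≥ i and that Γ is the 4-adjacency; the function names and the small test image are hypothetical and are not taken from the paper. The connected components of each level set are the raw material from which a component tree is assembled.

```python
from collections import deque

def level_set(image, i):
    """Return C(I, i): the set of coordinates (x, y) with I(x, y) >= i."""
    return {(x, y)
            for y, row in enumerate(image)
            for x, value in enumerate(row)
            if value >= i}

def connected_components(pixels):
    """Split a pixel set into 4-connected components (breadth-first search)."""
    remaining = set(pixels)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            x, y = queue.popleft()
            for neighbour in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

# Hypothetical image with L = 3 grey levels: two bright spots joined by a
# mid-grey bridge on a dark background.
image = [
    [0, 0, 0, 0, 0],
    [0, 2, 1, 2, 0],
    [0, 0, 0, 0, 0],
]

# Number of connected components found at each threshold i = 0, 1, 2.
components_per_level = {
    i: len(connected_components(level_set(image, i))) for i in range(3)
}
```

In this example the single component at level 1 splits into two components at level 2, which is the kind of topological event that a component tree records as a node with two children.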

A main stem concept for image description

Node and leaf components of T(I) represent the main topological discontinuities of a given image I. Branch components are not essential to characterise image topology. Our goal is to define a global feature for finding “meaningful” components c ∈ C(I) of T(I) according to “meaningful” image regions (objects). The component tree obtained in such a way must consist of a considerably smaller number of components, and should not comprise meaningless and noise components. Such components must be
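The idea of reducing a component tree can be sketched as follows. This is only an illustration under stated assumptions and not the construction used in the paper, whose exact criteria are not reproduced here: a component is treated as “noise” when its area falls below a hypothetical threshold min_area, and a “branch” component is taken to be one with exactly one surviving child. The tree encoding and all names are hypothetical.

```python
def reduce_tree(children, area, root, min_area):
    """Return a reduced copy of a rooted tree given as {node: [child, ...]}.

    Assumptions (not from the paper): components with area < min_area are
    noise and are dropped with their subtrees; components with exactly one
    surviving child are branch components and are skipped.
    """
    def surviving(node):
        return [c for c in children.get(node, []) if area[c] >= min_area]

    reduced = {}
    stack = [root]
    while stack:
        node = stack.pop()
        reduced[node] = []
        for child in surviving(node):
            # Skip through a chain of single-child (branch) components.
            while len(surviving(child)) == 1:
                child = surviving(child)[0]
            reduced[node].append(child)
            stack.append(child)
    return reduced

# Hypothetical component tree: "n" is a tiny noise component and "b" is a
# branch component lying between "a" and the leaf "d".
children = {"root": ["a"], "a": ["b", "c", "n"], "b": ["d"],
            "c": [], "d": [], "n": []}
area = {"root": 100, "a": 60, "b": 20, "c": 15, "d": 10, "n": 1}

reduced = reduce_tree(children, area, "root", min_area=5)
```

After the reduction only the root, the node "a" and the leaves "c" and "d" remain, so the tree keeps its branching structure while carrying fewer components.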

Image indexing and matching

For image indexing we introduce a simplified description of MS(T). This description takes into account only the node and leaf components of the main stem.

Proposition

Any main stem can be represented as a set P of shortest paths {p_i}, i = 1, 2, …, L, from the root to each of the leaf components in an MS(T), where L is the number of all leaf components in MS(T). The path p_i is a family of labels of nodes from the root to the leaf component.
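The path-set representation in the proposition can be sketched as below. The tree encoding, the function name and the node labels are hypothetical; the paper's own labelling procedure is not reproduced here. In a rooted tree the path from the root to a leaf is unique, so it is also the shortest one.

```python
def root_to_leaf_paths(children, root):
    """Return the set P of root-to-leaf label paths, each path as a tuple."""
    paths = set()
    stack = [(root, (root,))]
    while stack:
        node, path = stack.pop()
        kids = children.get(node, [])
        if not kids:
            # A node without children is a leaf: its path is complete.
            paths.add(path)
        for kid in kids:
            stack.append((kid, path + (kid,)))
    return paths

# Hypothetical main stem with three leaf components: "a1", "a2" and "b".
stem = {"r": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []}

P = root_to_leaf_paths(stem, "r")
```

The set P contains exactly one path per leaf component, which matches the count stated in the proposition.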

In order to determine labelling procedure we first introduce notions of “

Experimental results

In order to verify the proposed concept, a computer program was written in C++. Details about this program are given in (Kowalski, 2004). The application window is shown in Fig. 7. The developed application allows for:

  • (i) constructing a component tree of a grey-scale image,
  • (ii) defining a main stem,
  • (iii) counting the number of different types of components in a tree,
  • (iv) visualisation of leaf components (i.e. the segmented image).

The test images are 8 bits/pixel images from the USC-SIPI Image

Conclusion

A novel image matching concept, in which image representation is based on a reduced component tree structure termed the main stem, was proposed. In this approach images are matched via a similarity measure between their stems. The main stem structure is universal in the sense that it allows objects of arbitrary shape to be identified or distinguished, i.e., no restrictions on shapes are required.

The proposed similarity measure that is based on image main stem enables two key image
