Shape recognition based on Kernel-edit distance

doi:10.1016/j.cviu.2010.07.002

Computer Vision and Image Understanding

Volume 114, Issue 10, October 2010, Pages 1097-1103

https://doi.org/10.1016/j.cviu.2010.07.002

Abstract

In this paper a kernel method for shape recognition is proposed. The approach is based on the edit distance between pairs of shapes after transforming them into symbol strings. The transformation of shapes into symbol strings is invariant to similarity transforms and can handle partial occlusions. Shape contours are represented with shape contexts, and dynamic programming is used to find the correspondence between points on the contours. Corresponding points are then transformed into a symbolic representation, and the normalized edit distance measures the dissimilarity between pairs of strings in the database. The resulting distances are then transformed into suitable kernels, which are used for classification with support vector machines. Experimental results over a variety of shape databases show that the proposed approach is suitable for shape recognition.

Introduction

Computer vision aims at building artificial systems that can understand scenes and automatically recognize the objects present in them. Objects have several properties, such as shape, texture and color, that can be used for recognition. Among these, shape is probably the most important property that can be perceived and used for recognition and classification, which makes it useful to develop algorithms for shape recognition. One important issue in object recognition is dealing with partial occlusions. The proposed method is mainly intended for recognizing shapes that have been segmented from the background, so these shapes may come from another module that performs the segmentation. Normally, the segmentation module cannot provide perfect shapes because of occlusions, so some parts of an object might be missing or might be merged with part of the background or of another object. A successful shape recognition method has to take this into account in its design. As we will see, the proposed method does so and works well with partially occluded shapes.

In this paper we present a new kernel method for shape recognition based on a previously published method [1] that uses the edit distance as the similarity measure for a pair of shapes. With this kernel method, a support vector machine (SVM) can be used for classification, whereas the earlier approach used a nearest-neighbor classifier. The new kernel method performs better than previous approaches on a variety of shape databases.

Section snippets

Previous approaches

There are two main approaches to shape recognition in the literature: contour-based approaches, which take the outline of a shape as input for recognition, and surface-based methods, which consider the shape as a whole. Both approaches use local or global representations. Contour-based approaches that use a local representation are perhaps the most successful at dealing with the problems mentioned in the introduction. In addition, the approaches can

Symbolic representation for a pair of shapes

Here we describe how we transform a pair of shape contours from a database into a string of symbols. Suppose we have selected N points over the contours of two shapes using a sampling method. To compute the similarity between the two shapes, we need to find the best correspondence between the point sets of the two shapes. We used the shape context as a similarity measure between points from the two shapes. The shape context for each point over the contour is a polar
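As a rough illustration of the shape context idea, the sketch below builds a log-polar histogram of the other contour points as seen from one reference point, and compares two such histograms with a chi-square cost. It is a minimal sketch and not the paper's implementation: the function names, the bin counts (5 radial, 12 angular), the radial range and the normalization by the mean pairwise distance are all assumptions chosen for the example.

```python
import math


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all point pairs (used for scale normalization)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
    return total / (n * (n - 1) / 2)


def shape_context(points, index, n_radial=5, n_angular=12,
                  r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of all other points relative to points[index].

    All default parameter values are illustrative assumptions.
    """
    scale = mean_pairwise_distance(points)
    log_in, log_out = math.log(r_inner), math.log(r_outer)
    # Upper edges of the radial bins, spaced uniformly in log(r).
    edges = [math.exp(log_in + (log_out - log_in) * k / (n_radial - 1))
             for k in range(n_radial)]
    hist = [[0] * n_angular for _ in range(n_radial)]
    px, py = points[index]
    for k, (x, y) in enumerate(points):
        if k == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / scale
        if r > edges[-1]:
            continue  # points beyond the outer radius are ignored
        r_bin = next(b for b, edge in enumerate(edges) if r <= edge)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin][t_bin] += 1
    return hist


def chi2_cost(h1, h2):
    """Chi-square cost between two histograms of identical shape."""
    cost = 0.0
    for row1, row2 in zip(h1, h2):
        for a, b in zip(row1, row2):
            if a + b:
                cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost
```

A matching cost of this kind, evaluated for every pair of points from the two contours, is the sort of input a dynamic programming step could use to search for a point correspondence. The histogram above is invariant to translation and scale only; rotation invariance would need an additional step that is not shown here.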

Distance between strings (edit distance)

As we have two strings of symbols, there are several ways to compute the distance between them. One way is to use the edit distance [15]. The edit distance generalizes the Hamming distance, which applies only when the two strings have identical length. The edit distance is the minimum number of operations that must be applied to one of the strings to make it identical to the other one. The allowed operations are the insertion, deletion or substitution of a single character. We have proposed [1] to
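The definition above corresponds to the classic dynamic programming recurrence for the Levenshtein distance, sketched below with unit costs for all three operations. The normalization shown (dividing by the length of the longer string) is only one simple choice made for this example; the source does not state which normalization it uses, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def hamming_distance(a, b):
    """Number of differing positions; defined only for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))
```

For two strings of equal length the edit distance never exceeds the Hamming distance, since substituting every mismatched position is always one valid edit sequence; unlike the Hamming distance, it remains defined when the lengths differ.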

Kernel-edit distance

Kernel methods have been successfully used for pattern recognition. The key idea of kernel methods is to map the data into a high-dimensional feature space in which each coordinate corresponds to one feature of the data points. In this new space it is possible to define mathematical operations that are not defined in the original space. Kernel methods normally do not operate directly in the feature space; rather, they use the inner products of all pairs of data points. This makes
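To make the idea of working with pairwise quantities concrete, the sketch below builds a matrix of similarities from normalized edit distances between symbol strings. The exponential form exp(-gamma * d), the parameter gamma and the function names are assumptions made for illustration; the source does not specify here how its kernel is derived from the distances.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def kernel_matrix(strings, gamma=1.0):
    """Symmetric matrix with entries exp(-gamma * normalized edit distance).

    The exponential mapping and the normalization by the longer length
    are illustrative assumptions, not the paper's stated construction.
    """
    n = len(strings)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            longest = max(len(strings[i]), len(strings[j]))
            d = edit_distance(strings[i], strings[j]) / longest if longest else 0.0
            K[i][j] = K[j][i] = math.exp(-gamma * d)
    return K
```

A matrix like this can be handed to an SVM implementation that accepts precomputed kernels (scikit-learn's `SVC(kernel="precomputed")` is one such option). Kernels derived from edit distances are not guaranteed to be positive semi-definite, so in practice the matrix may need to be checked or corrected before training.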

Experimental results

In this section we test our method for shape recognition on a variety of shape databases and compare the results with those of the state-of-the-art shape recognition approaches available in the literature for those databases. The databases used here are the Kimia-99 database, the Chicken Piece Dataset, the Natural Silhouette database, the Marine database, the Gesture dataset, the MPEG-7 shape dataset and the ETH-80 object database.

Conclusions

Here a kernel approach to shape recognition was proposed. The Kernel-edit distance proposed here was tested on a variety of shape databases, including Kimia-99, Chicken Pieces, Natural Silhouettes, the Gesture database, the Marine database, the MPEG-7 shape database and the ETH-80 object database, and its recognition rate was superior to that of all previous approaches that have used these databases. The results show that the method is robust to partial occlusions, which is a necessary part for a shape

References (39)

E. Attalla et al., Robust shape similarity retrieval based on contour segmentation polygonal multiresolution and elastic matching, Pattern Recogn. (2005)

T.F. Cootes et al., Active shape models – their training and application, Comput. Vision Image Understand. (1995)

M. Neuhaus et al., Edit distance-based kernel functions for structural pattern classification, Pattern Recogn. (2006)

M.R. Daliri et al., Robust symbolic representation for shape recognition and retrieval, Pattern Recogn. (2008)

H. Freeman, Computer processing of line-drawing images, Comput. Surv. (1974)

C.H. Wu et al., Run-length chain coding and scalable computation of a shape’s moments using reconfigurable optical buses, IEEE Trans. Syst. Man Cybern., Part B (2004)

S. Belongie et al., Shape matching and object recognition using shape contexts, IEEE Trans. PAMI (2002)

H. Blum, A transformation for extracting new descriptors of shape, in: W. Walthen-Dunn (Ed.), Models for the Perception...

T.B. Sebastian et al., Recognition of shapes by editing their shock graphs, IEEE Trans. PAMI (2004)

J.J. Zou et al., Skeletonization of Ribbon-like shapes based on regularity and singularity analyses, IEEE Trans. Syst. Man Cybern., Part B (2001)

A. Bonnassie et al., A new method for analyzing local shape in three-dimensional images based on medial axis transformation, IEEE Trans. Syst. Man Cybern., Part B (2003)

H. Ling et al., Shape classification using the inner-distance, IEEE Trans. PAMI (2007)

X. Yang, X. Bai, L.J. Latecki, Z. Tu, Improving shape retrieval by learning graph transduction, in: Proc. European...

X. Yang, S. Koknar-Tezel, L.J. Latecki, Locally constrained diffusion process on locally densified distance spaces with...

E.G.M. Petrakis et al., Matching and retrieval of distorted and occluded shapes using dynamic programming, IEEE Trans. PAMI (2002)

E.S. Ristad et al., Learning string-edit distance, IEEE Trans. PAMI (1998)

V. Vapnik, The Nature of Statistical Learning Theory (1995)

B. Scholkopf et al., Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. (1998)

S.J. Kim, A. Magnani, S. Boyd, Optimal kernel selection in kernel fisher discriminant analysis, in: Proc. of the 23rd...
