Evaluation of shape classification techniques based on the signature of the blob

doi:10.1016/j.sigpro.2011.06.007

Signal Processing

Volume 92, Issue 1, January 2012, Pages 63-75

https://doi.org/10.1016/j.sigpro.2011.06.007

Abstract

In this paper, we report a study of different preprocessing and classification techniques that can be applied to shape classification using the signature of the blob, or its FFT, as the main feature. Eight well-known classification methods were tested and compared.

The results obtained show that, for shapes with a small to medium amount of distortion, all the methods obtained an almost 100% success probability. However, as distortion increased, those not based on the FFT performed better than the other algorithms, at the expense of a small increase in computational time.

The samples employed for training and testing purposes were not hand-selected, but were generated by an application developed as part of this study. This application simulates the main distortions that can be produced by a real camera, including shifts, scalings, rotations, affine transformations and noise. We demonstrate that classifiers trained on these synthetic images, instead of manually selected ones, perform well with real images.

A study of the false positive problem is also included, showing that, with the use of SVMs and careful selection of the training set, a large number of false positives can be discarded in the detection step.

Highlights

► Design of an application for shape classification using image processing techniques.
► Study and comparison of different Pattern Recognition techniques in shape classification.
► Use of shape classification in computer vision applications, such as Traffic Sign.

Introduction

Shape classification can be seen as the task of identifying the shape of an object according to a set of previously defined reference shapes, including, if necessary, false positives as another class. Many object recognition problems are more easily solved in terms of shape recognition than with other features, such as color or texture. In recent years, the problem has been given considerable attention, with potential applications for image processing problems such as handwriting recognition, industrial inspection or computer aided diagnosis.

The problems that must be solved are, firstly, the choice of a shape descriptor that encompasses all the relevant information concerning the shape as accurately as possible, and secondly, the design of a classifier that, using the previously defined descriptor, compares and classifies each shape into one of the predefined reference shapes. Much effort has been expended on the first step, and many approaches have been described in the literature. Nevertheless, it is worth noting that the choice of the descriptors is driven by the kind of shapes to classify.

Some of the most popular descriptors reported in the literature are the well-known Fourier Descriptors (FD) [1]. The FD are used, for instance, in [2] for the identification of American Sign Language hand shapes, or more recently, in [3], [4], for the recognition of paper and metal defects, or human silhouettes. In [5], FD were extended to become the Generic Fourier Descriptors (GFD), for the retrieval and identification of pictographs.

Another well-known shape descriptor is the Contour Curvature [6], where a one-dimensional curve can be obtained from the curvature of the object contour. Some examples can be found in [7], for instance, for the classification of general objects, or in [8], for the recognition of diatoms, a kind of unicellular algae. In [9], the curvature of the contour was further tested with occluded objects. An extension of the curvature of the contour is the Area Matrix, which is more affine-invariant than the curvature itself, and was tested, for example, in [10] for the recognition of sign language hand shapes.

Another way to describe a shape is through the use of Zernike moments. Zernike moments can be found in a number of papers, such as [11], [12], [13], [14]. In these studies, the Zernike moments were used to classify different kinds of shapes, such as traffic signs, complex pictographs and handwritten texts. [15] reports a comparative study of Zernike moments and Fourier Descriptors.

Other shape descriptors in the literature include the Level Sets. In [16], the shape descriptor of known objects, such as handwritten text, was used for the segmentation of low quality images. In [17], [18], the Level Set Function was used for the identification of different shapes in medical images, such as Magnetic Resonance or Computed Tomography scans.

Although all the descriptors mentioned so far are capable of describing complex shapes, their main disadvantage is reduced performance in the presence of high geometric distortions, especially segmentation noise. In this paper, we focus on the use of the signature as the main descriptor of the shape of the object [1]. Shape signature has not been widely reported in the literature, most probably because it cannot describe complex shapes. However, the signature has proven to be a very robust and efficient descriptor of simple shapes in the presence of noise and geometric distortions, as demonstrated here.

The research reported here continues previous work described in [19], [20]. Although those studies were exclusively oriented to traffic sign systems, part of our effort here has been devoted to the design of a more generic algorithm, that is, one able to process and recognize a wider range of predefined shapes. Thus, we conducted a study comparing the performance of several classifiers, which has enabled us to improve the performance of the algorithm in terms of classification success. All the results were obtained using synthetic images generated automatically with an application that was also developed as part of this research, although the algorithm was also tested with real images. Lastly, we investigated the false positive detection problem, which enabled the algorithm to eliminate those objects that did not correspond to any of the predefined shapes, thus reducing the computational complexity of the entire system.

Section snippets

Signature extraction and preprocessing

Fig. 1 shows a block diagram of the implemented system. In this system, a segmentation block was designed to extract the objects of interest from the background: in the example given here, a red traffic sign. The block also computes the connected components, so that the output provides a list of blobs that must be processed by the next block. It is not the aim of this paper to describe the different algorithms that can be designed at this stage. A wide-ranging study of segmentation techniques
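The signature itself is not defined in this excerpt. A common definition, following [1], is the distance from the centroid of the blob to its boundary, expressed as a function of the angle. The Python sketch below illustrates that definition only; it is not the implementation used in the study. Several points are assumptions made for illustration: the contour is given as a list of (x, y) points, the centroid is approximated by the mean of the contour points, the names `signature`, `normalize` and `n_bins` are hypothetical, and scale is removed by dividing by the mean distance.

```python
import math

def signature(contour, n_bins=64):
    """Centroid-to-boundary distance sampled over n_bins angular sectors.

    contour: list of (x, y) boundary points of one blob.
    Each bin keeps the largest distance seen in its sector; empty bins stay 0.0.
    """
    # Assumption: the centroid is approximated by the mean of the contour points.
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    sig = [0.0] * n_bins
    for x, y in contour:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # The extra modulo guards against angle rounding up to exactly 2*pi.
        k = int(angle / (2 * math.pi) * n_bins) % n_bins
        sig[k] = max(sig[k], math.hypot(x - cx, y - cy))
    return sig

def normalize(sig):
    """Divide by the mean so the signature does not depend on the blob's scale."""
    mean = sum(sig) / len(sig)
    return [s / mean for s in sig]

# A circle of radius 5 sampled at 360 points: its signature is flat.
circle = [(5 * math.cos(math.radians(a)), 5 * math.sin(math.radians(a)))
          for a in range(360)]
circle_sig = normalize(signature(circle, n_bins=36))
```

With this representation, a rotation of the object becomes a circular shift of the signature, which is why the comparison methods in the following sections consider every shift.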

Classification using the Mean Squared Error (MSE)

In this section, we describe the use of the Mean Squared Error to compare the signature of the object being analyzed with those of the reference shapes.
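The exact expression used for the MSE is not included in this excerpt. As a minimal sketch, assuming signatures of equal length and a search over every circular shift (the shift that the next section says must be included explicitly), a classifier of this kind could look as follows. The function names and the dictionary layout of `references` are hypothetical.

```python
def shifted_mse(x, y, j):
    """MSE between y and x circularly shifted by j samples (equal lengths)."""
    n = len(x)
    return sum((x[(i + j) % n] - y[i]) ** 2 for i in range(n)) / n

def min_mse(x, y):
    """Smallest MSE over all circular shifts of x."""
    return min(shifted_mse(x, y, j) for j in range(len(x)))

def classify_mse(sig, references):
    """Return the label of the reference signature with the lowest shifted MSE.

    references: dict mapping a label to a reference signature (assumed layout).
    """
    return min(references, key=lambda label: min_mse(sig, references[label]))

# Toy reference signatures, invented for illustration only.
references = {
    "flat": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "bumpy": [1.0, 1.5, 2.0, 1.5, 1.0, 0.8, 0.6, 0.8],
}

# A rotated object yields a circularly shifted signature.
rotated = references["bumpy"][3:] + references["bumpy"][:3]
label = classify_mse(rotated, references)
```

Searching all N shifts of an N-sample signature costs O(N²) operations per reference shape.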

Classification using the Cross-Correlation (CC)

As with the MSE, the Cross-Correlation can be seen as another tool for comparing and classifying signals. In this case, the CC is defined as ${CC}_{xy} (j) = \sum_{i = 0}^{N - 1} x_{i + j} \cdot y_{i},$ where again $x_{i + j}$ implies the circular shift of signal x. As can be seen, the definition of the CC itself implies the shift of the current signature for every possible position, in contrast to the MSE, where the shift must be explicitly included. Nevertheless, calculating the CC implies the computation of the right hand side of the
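The definition of ${CC}_{xy}(j)$ above can be evaluated directly, as in the following Python sketch. Two choices here are assumptions rather than details from the paper: the best alignment is taken to be the shift j that maximizes the CC, and both signals are scaled to unit energy first so that scores are comparable across reference shapes. The function names are hypothetical.

```python
import math

def cross_correlation(x, y, j):
    """CC_xy(j) = sum_i x[(i + j) mod N] * y[i], with a circular shift of x."""
    n = len(x)
    return sum(x[(i + j) % n] * y[i] for i in range(n))

def unit_energy(x):
    """Scale x so that the sum of its squared samples equals 1 (assumed step)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def best_shift(x, y):
    """Return (j, CC_xy(j)) for the shift j with the largest cross-correlation."""
    scores = [cross_correlation(x, y, j) for j in range(len(x))]
    j = max(range(len(scores)), key=scores.__getitem__)
    return j, scores[j]

# y is a toy signature; x is built so that x[(i + 3) mod N] == y[i].
y = unit_energy([1.0, 2.0, 3.0, 4.0, 1.0, 1.0, 1.0, 1.0])
n = len(y)
x = [y[(k - 3) % n] for k in range(n)]
shift, score = best_shift(x, y)
```

For unit-energy signals the score cannot exceed 1, and it reaches 1 only when the shifted signal coincides with the reference.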

Classification using the Earth Mover's Distance (EMD)

The Earth Mover's Distance is a measure of the distance between probability distributions [23]. Although this metric was originally intended for comparing probability distributions, it can be applied to the comparison of two signals with no extra work. Intuitively, given two distributions, the EMD can be understood as the minimal amount of work needed to transform one distribution into the other. The EMD can be computed according to ${EMD}_{xy} = \frac{\sum_{i = 0}^{N - 1} \sum_{j = 0}^{M - 1} f_{ij} d_{ij}}{\sum_{i = 0}^{N - 1} \sum_{j = 0}^{M - 1} f_{ij}}$
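In general, the optimal flows $f_{ij}$ must be found by solving a transportation problem. In one special case a closed form exists: for two one-dimensional sequences of equal length, normalized to the same total mass, with ground distance $d_{ij} = |i - j|$, the EMD equals the sum of the absolute differences between their cumulative sums. The Python sketch below covers only this special case; whether the study used it is not stated in this excerpt, and the name `emd_1d` is hypothetical. The ground distance here is linear, not circular.

```python
from itertools import accumulate

def emd_1d(x, y):
    """EMD between two non-negative sequences of equal length.

    Assumptions: both sequences are normalized to total mass 1, and the
    ground distance between bins i and j is |i - j| (linear, not circular).
    """
    total_x, total_y = sum(x), sum(y)
    p = [v / total_x for v in x]
    q = [v / total_y for v in y]
    # Running difference between the two cumulative distributions.
    cdf_gap = accumulate(a - b for a, b in zip(p, q))
    return sum(abs(g) for g in cdf_gap)

# All the mass moves three bins to the right.
far = emd_1d([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0])
# Each half of the mass moves two bins.
near = emd_1d([1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0])
```

Because the total mass is normalized to 1, the denominator of the EMD expression equals 1 and only the numerator has to be evaluated.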

Classification using Support Vector Machines (SVMs)

Another way to perform classification between signals is through the use of Pattern Recognition techniques. In this section, we analyze the performance of the Support Vector Machines. Although a complete description can be found in [24], we will outline here the main points of this kind of classifier.

In this case, the classification task consists in finding a decision frontier which separates the data into the considered classes. The simplest decision problem comprises a number of vectors

Classification using k-Nearest Neighbors (k-NN)

The k-Nearest Neighbors algorithm is amongst the simplest of all Pattern Recognition algorithms. Basically, given a set of labeled vectors, the distance from every vector to the test vector is computed, and the class assigned is the one most common among the k nearest vectors, where k is a positive integer, typically small [29]. It is worth noting that, if k=1 and the Euclidean distance is used to compute distances between vectors, then the k-NN algorithm becomes the MSE classifier discussed
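The voting rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation evaluated in the study: the training set is a list of (vector, label) pairs, the Euclidean distance is used, circular shifts are ignored, and the function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_classify(sample, training, k=3):
    """Label of the majority class among the k training vectors nearest to sample.

    training: list of (vector, label) pairs (assumed layout).
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy four-sample signatures, invented for illustration only.
training = [
    ([1.0, 1.0, 1.0, 1.0], "circle"),
    ([1.1, 0.9, 1.0, 1.0], "circle"),
    ([0.9, 1.0, 1.1, 1.0], "circle"),
    ([1.0, 1.4, 1.0, 1.4], "square"),
    ([1.1, 1.3, 1.0, 1.5], "square"),
    ([0.9, 1.4, 1.1, 1.3], "square"),
]

label_k3 = knn_classify([1.0, 1.35, 1.05, 1.4], training, k=3)
label_k1 = knn_classify([1.0, 1.35, 1.05, 1.4], training, k=1)
```

With k=1 the result is the label of the single closest training vector, which is the same decision as picking the smallest squared error without shifts.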

Generation of the image set

Although the entire system, including the algorithms described in this paper, can work with real images, preparing a database with a number of samples large enough to obtain statistically valid results would be unfeasible in practice. Consequently, we developed an application that generates images from a template model simulating all the possible distortions for an image of the object. The block diagram of this application is shown in Fig. 4. It basically consists of a database containing
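The idea of producing many distorted replicas from one template can be illustrated with a short Python sketch. It is a simplification and not the application described in the paper: it distorts a list of contour points instead of rendering images, it applies only rotation, scaling, shift and Gaussian noise (a general affine transformation is omitted), and all names, parameter ranges and the noise level are hypothetical.

```python
import math
import random

def distort(points, angle_deg, scale, shift, noise_std, rng):
    """Rotate, scale and shift (x, y) points, then add Gaussian noise."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    dx, dy = shift
    out = []
    for x, y in points:
        xr = scale * (x * cos_a - y * sin_a) + dx + rng.gauss(0.0, noise_std)
        yr = scale * (x * sin_a + y * cos_a) + dy + rng.gauss(0.0, noise_std)
        out.append((xr, yr))
    return out

def make_samples(template, count, seed=0):
    """Generate `count` randomly distorted replicas of a template contour."""
    rng = random.Random(seed)  # fixed seed so the set can be regenerated
    samples = []
    for _ in range(count):
        angle = rng.uniform(0.0, 360.0)      # assumed range
        scale = rng.uniform(0.5, 2.0)        # assumed range
        shift = (rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0))
        samples.append(distort(template, angle, scale, shift, 0.05, rng))
    return samples

# Template: corners of a unit square centred at the origin.
template = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
samples = make_samples(template, count=5)
```

Seeding the generator makes the synthetic set reproducible, so that different classifiers can be compared on the same samples.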

Results and discussion

In order to test the proposed algorithms, and compare their effectiveness, the application described in Section 8 was used to generate distorted replicas of the reference shapes shown in Fig. 5. The goal of this section is twofold. On the one hand, we present a comparison of the performance of all the algorithms proposed in this paper. On the other hand, we report testing of the algorithms in the presence of different levels of noise. All the algorithms were implemented in C++ using OpenCV 2.1.

Conclusions

In this paper, we have presented a study of different classification techniques for shape classification based on the signature of the blob. Although the signature is not an appropriate tool for working with complex shapes, when working with simple ones it has proven to be very robust against typical geometric distortions, such as shifts, rotations, scalings, affine transformations and segmentation noise. The algorithm was tested with randomly generated images, using an application specifically

Acknowledgments

This work was supported by the Spanish Ministry for Science and Innovation project number TIN2010-20845-C03-03 and the Comunidad de Madrid-Universidad de Alcala project number CCG10-UAH/TIC-5965.

References (31)

W.-Y. Kim et al., A region-based shape descriptor using Zernike moments, Elsevier Signal Processing (2000)
J. Kim et al., Nonparametric shape priors for active contour-based image segmentation, Elsevier Signal Processing (2007)
A. Tsai et al., An EM algorithm for shape classification based on level sets, Medical Image Analysis (2005)
P. Gil-Jiménez et al., Traffic sign shape classification and localization based on the normalized FFT of the signature of blobs and 2D homographies, Elsevier Signal Processing (2008)
S. Maldonado-Bascón et al., An optimization on pictogram identification for the road-sign recognition task using SVMs, Computer Vision and Image Understanding (2010)
D. Meyer et al., The support vector machine under test, Neurocomputing (2003)
J.H. Min et al., Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, Expert Systems with Applications (2005)
R. González et al., Digital Image Processing (1993)
T. McElroy et al., Fourier descriptors and neural networks for shape classification
I. Kunttu et al., Multiscale Fourier descriptor for shape classification
Z. Ling et al., Analyzing human movements from silhouettes via Fourier descriptor
D. Zhang et al., Enhanced generic Fourier descriptors for object-based image retrieval
H. Kauppinen et al., An experimental comparison of autoregressive and Fourier-based descriptors in 2D shape classification, IEEE Transactions on Pattern Analysis and Machine Intelligence (1995)
D. Shen, W. him Wong, H.S. Horace, Affine-invariant image retrieval by correspondence matching of shapes, Image and...
R.E. Loke et al., Diatom recognition by convex and concave contour curvature
