Elsevier

Signal Processing

Volume 92, Issue 1, January 2012, Pages 63-75

Evaluation of shape classification techniques based on the signature of the blob

https://doi.org/10.1016/j.sigpro.2011.06.007

Abstract

In this paper, we report a study of different preprocessing and classification techniques that can be applied to shape classification using the signature of the blob, or its FFT, as the main feature. Eight well-known classification methods were tested and compared.

The results obtained show that, for shapes with a small to medium amount of distortion, all the methods achieved an almost 100% success probability. However, as distortion increased, the methods not based on the FFT outperformed the FFT-based ones, at the expense of a small increase in computational time.

The samples employed for training and testing purposes were not hand-selected, but were generated by an application developed as part of this study. This application simulates the main distortions that can be produced by a real camera, including shifts, scalings, rotations, affine transformations and noise. We demonstrate that training with these synthetic images, instead of manually selected ones, performs well with real images.

A study of the false positive problem is also included, showing that, with the use of SVMs and careful selection of the training set, a large number of false positives can be discarded in the detection step.

Highlights

► Design of an application for shape classification using image processing techniques.
► Study and comparison of different Pattern Recognition techniques in shape classification.
► Use of shape classification in computer vision applications, such as Traffic Sign.

Introduction

Shape classification can be seen as the task of identifying the shape of an object according to a set of previously defined reference shapes, including, if necessary, false positives as another class. Many object recognition problems are more easily solved in terms of shape recognition than with other features, such as color or texture. In recent years, the problem has received considerable attention, with potential applications in image processing problems such as handwriting recognition, industrial inspection or computer-aided diagnosis.

The problems that must be solved are, firstly, the choice of a shape descriptor that encompasses all the relevant information concerning the shape as accurately as possible, and secondly, the design of a classifier that, using the previously defined descriptor, compares and classifies each shape into one of the predefined reference shapes. Much effort has been expended on the first step, and many approaches have been described in the literature. Nevertheless, it is worth noting that the choice of the descriptors is driven by the kind of shapes to classify.

Some of the most popular descriptors reported in the literature are the well-known Fourier Descriptors (FD) [1]. The FD are used, for instance, in [2] for the identification of American Sign Language hand shapes, or more recently, in [3], [4], for the recognition of paper and metal defects, or human silhouettes. In [5], FD were extended to become the Generic Fourier Descriptors (GFD), for the retrieval and identification of pictographs.

Another well-known shape descriptor is the Contour Curvature [6], where a one-dimensional curve can be obtained from the curvature of the object contour. Some examples can be found in [7], for instance, for the classification of general objects, or in [8], for the recognition of diatoms, a kind of unicellular algae. In [9], the curvature of the contour was further tested with occluded objects. An extension of the curvature of the contour is the Area Matrix, which is more affine-invariant than the curvature itself, and was tested, for example, in [10] for the recognition of sign language hand shapes.

Another way to describe a shape is through the use of Zernike moments. Zernike moments can be found in a number of papers, such as [11], [12], [13], [14]. In these studies, the Zernike moments were used to classify different kinds of shapes, such as traffic signs, complex pictographs and handwritten texts. [15] reports a comparative study of Zernike moments and Fourier Descriptors.

Other shape descriptors in the literature include the Level Sets. In [16], the shape descriptor of known objects, such as handwritten text, was used for the segmentation of low quality images. In [17], [18], the Level Set Function was used for the identification of different shapes in medical images, such as Magnetic Resonance or Computed Tomography scans.

Although all the descriptors mentioned so far are capable of describing complex shapes, their main disadvantage is reduced performance in the presence of high geometric distortions, especially segmentation noise. In this paper, we focus on the use of the signature as the main descriptor of the shape of the object [1]. Shape signature has not been widely reported in the literature, most probably because it cannot describe complex shapes. However, the signature has proven to be a very robust and efficient descriptor of simple shapes in the presence of noise and geometric distortions, as demonstrated here.

The research reported here continues previous research described in [19], [20]. Although those studies were exclusively oriented to traffic sign systems, part of our effort has been devoted to the design of a more generic algorithm, that is, one able to process and recognize a wider range of predefined shapes. Thus, we conducted a study comparing the performance of several classifiers, which has enabled us to improve the algorithm in terms of classification success. All the results were obtained using synthetic images generated automatically by an application that was also developed as part of this research, although the algorithm was also tested with real images. Lastly, we investigated the false positive detection problem, which enabled the algorithm to eliminate those objects that did not correspond to any of the predefined shapes, thus reducing the computational complexity of the entire system.

Signature extraction and preprocessing

Fig. 1 shows a block diagram of the implemented system. In this system, a segmentation block was designed to extract the objects of interest from the background: in the example given here, a red traffic sign. The block also computes the connected components, so that the output provides a list of blobs that must be processed by the next block. It is not the aim of this paper to describe the different algorithms that can be designed at this stage. A wide-ranging study of segmentation techniques

Classification using the Mean Squared Error (MSE)

In this section, we describe the use of the Mean Squared Error to compare the signature of the object being analyzed with those of the reference shapes.
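As an illustration only, the comparison could be sketched as follows, assuming that the signatures are equal-length lists of samples and that the circular shift is searched exhaustively. The function names and the toy reference values are hypothetical and are not taken from the paper.

```python
def circular_mse(x, y):
    """Smallest mean squared error between x and y over all circular shifts of x."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("signatures must be non-empty and of equal length")
    best = float("inf")
    for j in range(n):
        # x[(i + j) % n] is signal x circularly shifted by j samples
        err = sum((x[(i + j) % n] - y[i]) ** 2 for i in range(n)) / n
        best = min(best, err)
    return best


def classify_mse(signature, references):
    """Return the label of the reference whose shift-minimised MSE is smallest."""
    return min(references, key=lambda label: circular_mse(signature, references[label]))


# Toy reference signatures (hypothetical values, for illustration only)
references = {
    "flat": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "peaked": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
probe = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.1]  # shifted, slightly noisy "peaked"
print(classify_mse(probe, references))
```

The exhaustive search costs O(N²) operations per reference shape, which is acceptable for short signatures.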

Classification using the Cross-Correlation (CC)

As with the MSE, the Cross-Correlation can be seen as another tool for comparing and classifying signals. In this case, the CC is defined as

$$CC_{xy}(j)=\sum_{i=0}^{N-1} x_{i+j}\cdot y_i,$$

where again $x_{i+j}$ implies the circular shift of signal $x$. As can be seen, the definition of the CC itself implies the shift of the current signature for every possible position, in contrast to the MSE, where the shift must be explicitly included. Nevertheless, calculating the CC implies the computation of the right hand side of the...
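A minimal sketch of a classifier built on the circular cross-correlation defined above is given next. The direct summation is used for clarity. Normalising each signal to zero mean and unit energy before correlating is our assumption, added so that the peaks obtained for different references are comparable; the function names and toy values are hypothetical.

```python
import math


def circular_cc(x, y):
    """CC_xy(j) = sum_i x[(i + j) mod N] * y[i], for j = 0..N-1."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("signals must be non-empty and of equal length")
    return [sum(x[(i + j) % n] * y[i] for i in range(n)) for j in range(n)]


def normalise(signal):
    """Assumption: remove the mean and scale to unit energy before correlating."""
    mean = sum(signal) / len(signal)
    centred = [v - mean for v in signal]
    energy = math.sqrt(sum(v * v for v in centred))
    return [v / energy for v in centred] if energy > 0 else centred


def classify_cc(signature, references):
    """Return the label of the reference with the highest correlation peak."""
    probe = normalise(signature)
    return max(
        references,
        key=lambda label: max(circular_cc(probe, normalise(references[label]))),
    )


# Toy reference signatures (hypothetical values, for illustration only)
references = {
    "two_lobes": [1.0, 2.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0],
    "four_lobes": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
probe = [2.0, 1.0, 2.1, 1.0, 2.0, 1.0, 1.9, 1.0]  # shifted, noisy "four_lobes"
print(classify_cc(probe, references))
```

Because every shift is evaluated inside `circular_cc`, no explicit shift search is needed in the classifier, which mirrors the contrast with the MSE drawn in the text.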

Classification using the Earth Mover's Distance (EMD)

The Earth Mover's Distance is a measure of the distance between probability distributions [23]. Although this metric was originally intended for comparing probability distributions, its application can be extended to the comparison of two signals with no extra work. Intuitively, given two distributions, the EMD can be understood as the minimal amount of work needed to transform one distribution into another. The EMD can be computed according to

$$EMD_{xy}=\frac{\sum_{i=0}^{N-1}\sum_{j=0}^{M-1} f_{ij}\,d_{ij}}{\sum_{i=0}^{N-1}\sum_{j=0}^{M-1} f_{ij}}$$
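In general, finding the optimal flows of the EMD requires solving a transportation problem. The sketch below covers only a special case, chosen by us for illustration and not necessarily the procedure followed in the paper: two non-negative signals of equal length, each normalised to unit total mass, with ground distance |i - j| between sample positions and no circular wrap-around. Under those assumptions, the minimal work equals the sum of the absolute differences between the cumulative sums of the two signals. The function name is hypothetical.

```python
def emd_1d(x, y):
    """EMD between two non-negative 1-D signals of equal length.

    Assumptions: both signals are normalised to unit mass, the ground
    distance between positions i and j is |i - j|, and the signals are
    not treated as circular.
    """
    if len(x) == 0 or len(x) != len(y):
        raise ValueError("signals must be non-empty and of equal length")
    if min(x) < 0 or min(y) < 0:
        raise ValueError("signals must be non-negative")
    total_x, total_y = sum(x), sum(y)
    if total_x == 0 or total_y == 0:
        raise ValueError("signals must have non-zero total mass")
    work = 0.0
    carried = 0.0  # mass that still has to be moved past the current position
    for a, b in zip(x, y):
        carried += a / total_x - b / total_y
        work += abs(carried)
    return work


print(emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))  # all mass moved two positions
print(emd_1d([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical signals
```

Since the total flow is one after normalisation, the returned work is already the ratio of transported work to total flow.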

Classification using Support Vector Machines (SVMs)

Another way to perform classification between signals is through the use of Pattern Recognition techniques. In this section, we analyze the performance of Support Vector Machines. Although a complete description can be found in [24], we outline here the main points of this kind of classifier.

In this case, the classification task consists in finding a decision frontier which separates the data into the considered classes. The simplest decision problem comprises a number of vectors

Classification using k-Nearest Neighbors (k-NN)

The k-Nearest Neighbors algorithm is amongst the simplest of all Pattern Recognition algorithms. Basically, given a set of labeled vectors, the distance from every vector to the test one is computed, and the class assigned is the one most common among the k nearest vectors, where k is a positive integer, typically small [29]. It is worth noting that, if k=1 and the Euclidean distance is used to compute distances between vectors, then the k-NN algorithm becomes the MSE classifier discussed...
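The voting rule described above can be sketched as follows, assuming that the Euclidean distance is used and that the training set is a list of (vector, label) pairs. The names and the toy training data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_classify(sample, training, k=3):
    """Assign the label most common among the k training vectors nearest to sample."""
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the size of the training set")
    nearest = sorted(training, key=lambda item: euclidean(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled signatures (hypothetical values, for illustration only)
training = [
    ([1.0, 1.0, 1.0, 1.0], "circle"),
    ([1.1, 0.9, 1.0, 1.0], "circle"),
    ([1.0, 1.1, 1.0, 0.9], "circle"),
    ([1.0, 1.4, 1.0, 1.4], "square"),
    ([1.1, 1.4, 0.9, 1.4], "square"),
    ([1.0, 1.5, 1.0, 1.3], "square"),
]
print(knn_classify([1.0, 1.35, 1.0, 1.45], training, k=3))
```

With k=1, the function simply returns the label of the single closest training vector.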

Generation of the image set

Although the entire system, including the algorithms described in this paper, can work with real images, preparing a database with a number of samples large enough to obtain statistically valid results would be unfeasible in practice. Consequently, we developed an application that generates images from a template model simulating all the possible distortions for an image of the object. The block diagram of this application is shown in Fig. 4. It basically consists of a database containing

Results and discussion

In order to test the proposed algorithms, and compare their effectiveness, the application described in Section 8 was used to generate distorted replicas of the reference shapes shown in Fig. 5. The goal of this section is twofold. On the one hand, we present a comparison of the performance of all the algorithms proposed in this paper. On the other hand, we report testing of the algorithms in the presence of different levels of noise. All the algorithms were implemented in C++ using OpenCV 2.1.

Conclusions

In this paper, we have presented a study of different classification techniques for shape classification based on the signature of the blob. Although the signature is not an appropriate tool for working with complex shapes, when working with simple ones it has proven to be very robust against typical geometric distortions, such as shifts, rotations, scalings, affine transformations and segmentation noise. The algorithm was tested with randomly generated images, using an application specifically

Acknowledgments

This work was supported by the Spanish Ministry for Science and Innovation project number TIN2010-20845-C03-03 and the Comunidad de Madrid-Universidad de Alcala project number CCG10-UAH/TIC-5965.

References (31)

  • Z. Ling et al., Analyzing human movements from silhouettes via Fourier descriptor
  • D. Zhang et al., Enhanced generic Fourier descriptors for object-based image retrieval
  • H. Kauppinen et al., An experimental comparison of autoregressive and Fourier-based descriptors in 2D shape classification, IEEE Transactions on Pattern Analysis and Machine Intelligence (1995)
  • D. Shen, W. him Wong, H.S. Horace, Affine-invariant image retrieval by correspondence matching of shapes, Image and...
  • R.E. Loke et al., Diatom recognition by convex and concave contour curvature
