1 Introduction

Shape analysis is one of the most important areas in image analysis, since it can describe an object while preserving its most relevant information. Shape classification is a fundamental problem in Computer Vision, with many applications such as object detection, classification, and image retrieval.

When analyzing an image, it is of utmost importance to consider that certain information only makes sense under certain viewing conditions, such as scale. Nevertheless, choosing the appropriate scales of observation is not a trivial task, which motivated the development of scale-space filters used to create multiscale image representations [5, 15]. Some multiscale shape descriptors found in the literature are Multiscale Fractal Dimension [2], Multiscale Hough Transform Statistics [10], Multiscale Fourier Descriptor [4], and Curvature Scale-Space [14].

It is well known that multiscale/multiresolution methods are regarded as consistent with plausible models of the human visual system; therefore, approaches based on this concept can be promising [9]. In this paper, we propose two new shape descriptors: SBAS, a scale-space version of BAS (Beam Angle Statistics), proposed by [1], and SMFD, a scale-space version of MFD (Multiscale Fractal Dimension), proposed by [2]. Experimental results obtained on two public shape image datasets showed that both SBAS and SMFD presented better recognition rates than many traditional shape description methods, such as: Zernike Moments [13], BAS [1], HTS (Hough Transform Statistics) [12], MFD [2], TS (Tensor Scale) [7], FD (Fourier Descriptors) [16] and CS (Contour Salience) [14].

2 Scale-Space BAS and MFD

In scale-space theory, the characteristics of interest describe a continuous path in the representation, allowing a consistent manipulation of structures present at different scales. One of the main reasons to represent information at multiple levels is that the successive simplification removes unwanted details, such as noise and insignificant structures. Also, the scale reduction is directly related to information reduction, which implies a reduction in processing time and an increase in computational efficiency [15]. An important property of scale-space theory is that the transformation to a coarser level should not introduce new structures; thus, structures present at a coarser scale should also be present at all finer scales [15].

In this paper, the image representation at coarser scales was obtained by convolving the image with a low-pass filter, the 2D Gaussian kernel, defined by Eq. 1.

$$\begin{aligned} G(x,y)=\frac{1}{2\pi \sigma ^2}e^{-\frac{x^2+y^2}{2\sigma ^2}}. \end{aligned}$$
(1)

where \(\sigma \) is the standard deviation of the Gaussian distribution. The \(\sigma \) value represents the 2D Gaussian kernel’s width, and is referred to here as the scale parameter. The convolution with the Gaussian kernel blurs the image’s borders, reducing the information and creating a coarser level of the image as the scale (\(\sigma \)) increases. For each scale i, the scale parameter is defined by \(\sigma =2^i\). Figure 1 illustrates the process of Gaussian kernel convolution. One can notice that the higher the value of \(\sigma \), the coarser the image and the blurrier the borders.
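The construction of this Gaussian scale-space can be sketched as follows. This is a minimal illustration, assuming SciPy's `gaussian_filter` is used for the 2D convolution; the number of scales and the rule \(\sigma = 2^i\) with \(i = 1, \ldots, 6\) follow the examples shown in Fig. 1.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_scale_space(image, num_scales=6):
    """Blur a 2-D image with Gaussian kernels of width sigma = 2**i,
    producing one progressively coarser level per scale."""
    levels = []
    for i in range(1, num_scales + 1):
        sigma = 2 ** i                                   # scale parameter for level i
        levels.append(gaussian_filter(image.astype(float), sigma=sigma))
    return levels
```

Note that each level is computed directly from the original image rather than from the previous level; since the convolution of two Gaussians is again a Gaussian, either formulation yields a valid scale-space.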

Fig. 1. Examples of images obtained by the convolution with the 2D Gaussian kernel: (a) \(\sigma =2\), (b) \(\sigma =4\), (c) \(\sigma =8\), (d) \(\sigma =16\), (e) \(\sigma =32\), (f) \(\sigma =64\).

After the convolution, the resulting images are binarized, as shown in Fig. 2, so that their borders can be extracted. Once the borders are blurred and the images are binarized, the BAS and MFD methods are applied to the image at each scale, and the feature vectors of all scales are concatenated into a single feature vector, yielding the new SBAS and SMFD shape descriptors, respectively.
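The full pipeline of blurring, binarization, per-scale description and concatenation can be sketched as below. Here `compute_descriptor` is a hypothetical placeholder standing in for BAS or MFD, and the 0.5 binarization threshold is an assumption, since the text does not state one.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_descriptor(image, compute_descriptor, num_scales=6, threshold=0.5):
    """Blur, binarize, describe the shape at each scale, and concatenate
    the per-scale feature vectors into a single descriptor."""
    features = []
    for i in range(1, num_scales + 1):
        blurred = gaussian_filter(image.astype(float), sigma=2 ** i)
        binary = (blurred >= threshold).astype(np.uint8)   # recover a crisp silhouette
        features.append(np.asarray(compute_descriptor(binary), dtype=float))
    return np.concatenate(features)                        # one vector across all scales
```

With `compute_descriptor` set to BAS this sketch would yield SBAS, and with MFD it would yield SMFD; any other contour- or silhouette-based descriptor could be substituted in the same way.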

Fig. 2. Binarized images after the convolution with the 2D Gaussian kernel: (a) \(\sigma =2\), (b) \(\sigma =4\), (c) \(\sigma =8\), (d) \(\sigma =16\), (e) \(\sigma =32\), (f) \(\sigma =64\).

3 Experimental Results

In order to evaluate the performance of the proposed methods, SBAS and SMFD, they were applied to two public and well-known shape datasets: Kimia-216 [11] and MPEG-7 Part B [3]. Then, their results were compared with those obtained with some well-referenced shape description methods: Zernike Moments [13], BAS [1], HTS (Hough Transform Statistics) [12], MFD [2], TS (Tensor Scale) [7], FD (Fourier Descriptors) [16], and CS (Contour Salience) [14].

The performance comparisons were based on the following metrics:

Precision x Recall: The precision is the fraction of retrieved instances that are relevant, while the recall is the fraction of relevant instances that are retrieved [8].

Multiscale Separability: The Multiscale Separability indicates how the clusters of different classes are distributed in the feature space; the more separated the clusters, the better the descriptor [14].

Bulls-Eye Score: The Bulls-Eye score is calculated as follows: given a dataset \(S_{c,n}\), where c is the number of classes in S and n is the number of images per class, each image in \(S_{c,n}\) is used as a query and the number of correct images among the top 2n matches is counted. A perfect score is achieved when \(c \cdot n^2\) positive cases are found across the whole dataset [6].

 
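As a concrete illustration, the Precision x Recall and Bulls-Eye metrics above could be computed as follows. This is a sketch over hypothetical ranking data; whether a query counts itself among its own matches varies between implementations, and it is assumed here to be included.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieved list against the relevant set."""
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def bulls_eye_score(rankings, labels, n):
    """Bulls-Eye: correct images found in the top 2n matches of every query,
    relative to the perfect count of c * n**2 (c*n queries, n relevant each)."""
    total = 0
    for query, ranked in enumerate(rankings):
        top = ranked[:2 * n]
        total += sum(1 for r in top if labels[r] == labels[query])
    perfect = len(rankings) * n            # len(rankings) = c*n, so perfect = c*n**2
    return total / perfect
```

For example, a ranking that always places a query's own class first reaches a Bulls-Eye score of 1.0, while mixing other classes into the top 2n matches lowers the score proportionally.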

3.1 Experiments on MPEG-7 Part B Shape Dataset

The MPEG-7 Part B dataset is composed of 1400 images divided into 70 classes of 20 images each, each showing a white silhouette on a black background.

The Precision x Recall curves for the MPEG-7 Part B dataset obtained with the proposed shape descriptors (SBAS and SMFD), and with the other shape descriptors (Zernike Moments, BAS, HTS, MFD, TS, FD, and CS), are presented in Fig. 3. One can observe that SBAS presented the best Precision x Recall results (highest curve), followed by its monoscale version BAS. Although SMFD did not present top results, its performance was better than that of its monoscale version (MFD).

Fig. 3. Precision x Recall curves for the MPEG-7 Part B dataset.

For the shape descriptors that presented the best Precision x Recall curves (SBAS, BAS, Zernike Moments, SMFD, HTS, and MFD), we also calculated their Multiscale Separability curves. Figure 4 presents such curves. One can observe that SBAS also presented the best result according to this measure. The SMFD and Zernike Moments presented very similar results, both outperforming BAS, MFD and HTS.

Fig. 4. Multiscale separability curves for the MPEG-7 Part B dataset [3].

Finally, the Bulls-Eye score was calculated for the methods that presented the best performances according to the Precision x Recall and Multiscale Separability results. Table 1 presents the Bulls-Eye score for each method. One can observe that, also for this measure, SBAS presented the best results, followed by its monoscale version BAS [1] and Zernike Moments [13]. The SMFD method showed better results than its monoscale version MFD [2] and HTS [12].

From all these results we conclude that the proposed scale-space approach for shape recognition significantly improved the accuracy of the monoscale shape recognition approach.

Table 1. Bulls-Eye scores for each method using the MPEG-7 Part B dataset [3].

3.2 Experiments on Kimia-216 Shape Dataset

The Kimia-216 shape dataset [11] is composed of 216 images divided into 18 classes of 12 images each. This dataset is simpler than MPEG-7 Part B, since it contains fewer classes and its classes do not exhibit as many transformations as those of MPEG-7 Part B.

The Precision x Recall curves for the Kimia-216 dataset obtained with the proposed shape descriptors (SBAS and SMFD), and with the other shape descriptors (Zernike Moments, BAS, HTS, MFD, TS, FD, and CS), are presented in Fig. 5. One can observe that SBAS presented the best Precision x Recall results for most of the curve, followed closely by its monoscale version BAS. Although SMFD did not present top results, its performance on this dataset was also improved when compared to its monoscale version MFD, as it was on the MPEG-7 Part B dataset.

Fig. 5. Precision x Recall curves for the Kimia-216 dataset [11].

For the shape descriptors that presented the best Precision x Recall curves (SBAS, BAS, Zernike Moments, HTS, SMFD and MFD), we also calculated their Multiscale Separability curves. Figure 6 presents such curves. One can observe that SBAS also presented the best results according to this measure, followed by Zernike Moments. The SMFD did not present the top results, but it was significantly better than BAS and MFD.

Fig. 6. Multiscale separability curves for the Kimia-216 dataset [11].

Finally, the Bulls-Eye score was calculated for the methods that presented the best performances according to the Precision x Recall and Multiscale Separability results. Table 2 presents the Bulls-Eye score for each method. One can observe that, also for this measure, SBAS presented the best results, followed by its monoscale version BAS. The SMFD method showed better results than its monoscale version MFD.

As with the MPEG-7 Part B dataset, from all the results obtained on the Kimia-216 dataset we conclude that the proposed scale-space approach for shape recognition significantly improved the accuracy of the monoscale shape recognition approach.

Table 2. Bulls-Eye scores for each method using the Kimia-216 dataset [11].

4 Discussion and Conclusion

In this paper we presented two new shape description methods, SBAS and SMFD. Experiments carried out on the MPEG-7 Part B dataset showed that SBAS presented the best results among several well-referenced shape description methods in the literature, namely Zernike Moments [13], BAS [1], HTS (Hough Transform Statistics) [12], MFD [2], TS (Tensor Scale) [7], FD (Fourier Descriptors) [16] and CS (Contour Salience) [14], for all three evaluation metrics used in this work (Precision x Recall, Multiscale Separability, and Bulls-Eye). While SMFD did not achieve top results, it performed better than its monoscale version, the MFD shape descriptor.

Regarding the results obtained with the Kimia-216 dataset, SBAS presented the best Multiscale Separability results and Bulls-Eye score. SMFD also presented better results than its monoscale version, but it did not outperform the other methods. It is important to notice that the Kimia-216 dataset is simpler than MPEG-7 Part B, and the results obtained by the methods on Kimia-216 are already very good, making relevant improvements harder to obtain.

From the obtained results, one can observe that the new descriptor SBAS showed better results than all methods compared in this paper, improving the accuracy of its monoscale version BAS [1] by approximately 5.3% according to the Bulls-Eye score on the MPEG-7 Part B dataset, and by 0.65% on the Kimia-216 dataset.

Therefore, the results obtained in this paper suggest that the proposed scale-space approach for shape recognition can significantly improve the accuracy of shape description methods already proposed in the literature that do not explore the scale space. In this work we assessed the BAS and MFD shape descriptors; however, since the proposed multiscale approach is applied in the pre-processing stage, it can be applied to any other shape description method.