Elsevier

Pattern Recognition Letters

Volume 83, Part 3, 1 November 2016, Pages 303-311

TSS & TSB: Tensor scale descriptors within circular sectors for fast shape retrieval

https://doi.org/10.1016/j.patrec.2016.06.005

Highlights

  • Two novel shape descriptors are presented, based on the tensor scale (TS) concept.

  • The orientation of TS ellipses is extended to 360°.

  • Experimental results with MPEG-7 and MNIST datasets validate the methods for CBIR.

  • The descriptors achieve a high retrieval rate comparable to state-of-the-art methods.

  • Both descriptors use a simple distance function, with linear complexity in TSB.

Abstract

We propose two novel region-based descriptors for shape-based image retrieval and analysis, which are built upon an extended tensor scale based on the Euclidean Distance Transform (EDT). First, the tensor scale algorithm is applied to extract local structure thickness, orientation, and anisotropy as represented by the largest ellipse within a homogeneous region centered at each image pixel. In this work, we extend the local orientation to 360°. Then, for the first proposed descriptor, named Tensor Scale Sector descriptor (TSS), the local distributions of relative orientations within circular sectors are used to compose a fixed-length feature vector for a region-based representation. For the second method, named Tensor Scale Band descriptor (TSB), we consider histograms of relative orientations for each circular concentric band to compose a fixed-length feature vector with linear time matching. Experimental results with MPEG-7 and MNIST datasets are presented to illustrate and validate the methods. TSS achieves retrieval rates comparable to state-of-the-art methods, which usually rely on time-consuming correspondence optimization algorithms, while using a simpler and faster distance function. The linear-time matching of TSB is faster still, which makes it better suited to very large shape collections.

Introduction

Content-based image retrieval (CBIR) concerns the problem of searching large databases for digital images that are similar to a query image. A particular type of CBIR system uses shape information as the image descriptor and has important applications in several areas (e.g., fast identification of similar species to support biology research on biodiversity, trademark identification, fingerprint identification, visual search for mobile applications, and medical tumor shape retrieval [17], [26], [47], [50]).

In order to accomplish these goals, a shape descriptor should be simple, compact, insensitive to noise, and affine-invariant, while still containing all the information needed to distinguish different images [18]. Preferably, the matching algorithm used with a shape descriptor should also be fast so that it is suitable for large datasets, since feature extraction can often be performed offline. Many proposed shape descriptors achieve high accuracy at the expense of a high computational time, mainly because their matching algorithms rely on Dynamic Programming (DP) to establish correspondences. For example, the inner-distance shape context (IDSC-DP) [23] and Height Functions [49] use a DP scheme for shape matching.

In this work, we present new shape descriptors designed to achieve high accuracy with fast distance functions. Besides not using DP, our matching algorithms are also simpler than OCS (Optimal Correspondent Subsequence) used by BAS [5], OSB (Optimal Subsequence Bijection) and the Hungarian algorithm used in [6], and more efficient than the one used in TSDIZ [2]. Compared with methods proposed specifically for large datasets, such as Hough Transform Statistics (HTS) and HTS-neighborhood (HTSn) [44], our matching algorithms have a lower computational complexity for non-aligned shapes and achieve higher retrieval rates.

Our methods are based on the tensor scale concept (TS) [31], [32], a morphometric parameter yielding a simultaneous representation of local structure orientation, thickness, and anisotropy. Tensor scale has been demonstrated for image filtering [32], [51], segmentation [35], registration [33], shape descriptors [2], [25], detection of contour saliences [3], [4], medical image interpolation [52], [53], removal of partial volume effects in rendering [43], and quantification of local morphometry in complex quasi-random networks of trabecular bone [37], [38]. The algorithm to compute tensor scale, as originally proposed [31], [32], is computationally expensive. To address this problem, Andaló et al. proposed a simpler yet effective implementation of the original method [2]. More recently, Xu et al. introduced a precise analytic approach for n-dimensional images [51].

One contribution of this paper is the revision of the algorithmic TS computation, as proposed by Andaló et al., extending the ellipse’s orientation to 360°. Based on this richer TS model, we propose two novel shape descriptors, with greater discrimination power in relation to previous TS-based works, for shape-based image retrieval: Tensor Scale Sector (TSS) and Tensor Scale Band (TSB) descriptors.

Work on shape matching can be coarsely divided into four major categories: pairwise matching, context-based similarity, learning-based approaches, and combined classifier approaches. Methods in the pairwise category extract a feature vector from each shape and use a distance function to compute the perceptual dissimilarity (or similarity) between a pair of shapes. Context-based methods, on the other hand, exploit the similarity context of all database instances, usually provided as the result of a pairwise matching, to boost the results, and are important for dealing with large intra-class variations [8], [28], [54]. Unsupervised learning approaches are usually inspired by the Bag-of-Words (BoW) paradigm [11], [42], where a shape vocabulary is constructed by partitioning the feature space. The quality of this partition, which involves the analysis of features extracted from several shapes in the dataset, is a crucial factor in the discriminative power of the resulting descriptor [7]. Methods in the fourth category fuse multiple features obtained with different descriptors, combining their strengths to improve retrieval accuracy [41].

In this work, we focus on shape description and matching, proposing novel descriptors for pairwise matching that can be used with re-ranking techniques to boost the results, and also with combined classifier approaches.

There are three types of shape descriptors: contour-based, region-based, and skeleton-based [20]. Skeleton matching usually performs better in handling objects with articulated parts, but its complex representations, in the form of a tree or graph, usually require graph edit operations [15], [30] in the matching process, leading to a higher computational complexity [39], [40]. Unlike contour-based methods, such as BAS [5], TSDIZ [2] and Height Functions [49], our proposed descriptors are region-based, which brings some advantages: they are not limited to a specific topology with only one closed contour, and they are less sensitive to shape segmentation errors, noise and partial occlusions [27]. Our methods are also more versatile; for instance, they can handle non-binary images (multiple labels) and objects with holes, and can, theoretically, be extended to 3D.

Compared with previous TS descriptors (TSD [25] and TSDIZ [2]), the proposed descriptors TSS and TSB are more accurate and have faster matching algorithms. TSS incorporates spatial information through circular sectors and TSB through concentric bands around a central point, both of which are much more discriminative than the simple normalized orientation histogram used by TSD. While TSDIZ is a contour-based method, TSS and TSB are region-based methods, opening new perspectives for novel applications. The features extracted from the ellipses are also more sophisticated: they cover 360°, use the sectors' relative orientations, and, in the case of TSS, apply the sum of the absolute values of the responses, inspired by SURF [9], to detect different orientation patterns.

This paper is organized as follows: Section 2 presents relevant previous work on tensor scale, including its EDT-based implementation and how to extend it to ellipses with 360° orientation, as used in this work. Our novel tensor scale descriptors, TSS and TSB, are then described in Sections 3 and 4, respectively, and their computational complexity is analyzed in Section 5. The experimental evaluation is conducted in Section 6. Section 7 states our conclusions and discusses future work.

Section snippets

Background

Saha et al. have introduced a local scale method called tensor scale (TS) [31], [32], which is the parametric representation of the largest ellipse (or ellipsoid in 3D), centered at a point p within the same homogeneous region under a predefined criterion (usually intensity). The tensor scale model (Fig. 1) provides three factors: orientation θ of the major semi-axis t1, anisotropy (√(1 − ‖t2(p)‖²/‖t1(p)‖²)) and thickness (‖t2(p)‖). This approach is a natural evolution of their previous work based
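As a rough illustration of the three factors named above, the following minimal sketch derives them from the two semi-axis vectors of a single ellipse. It assumes the usual tensor scale definition of anisotropy, √(1 − ‖t2‖²/‖t1‖²). The function name and the (x, y) tuple input format are hypothetical. The orientation returned here is simply the direction of the supplied t1 vector; how the paper resolves that direction to cover the full 360° is not shown in this excerpt.

```python
import math


def tensor_scale_factors(t1, t2):
    """Return (orientation_deg, anisotropy, thickness) for one ellipse.

    t1, t2: (x, y) vectors of the major and minor semi-axes.
    This input format is an assumption made for illustration only.
    """
    len1 = math.hypot(t1[0], t1[1])
    len2 = math.hypot(t2[0], t2[1])
    if len1 == 0.0 or len2 > len1:
        raise ValueError("t1 must be the non-zero major semi-axis")
    # Direction of the major semi-axis, measured from the x-axis in [0, 360).
    orientation = math.degrees(math.atan2(t1[1], t1[0])) % 360.0
    # 0 for a circle, approaching 1 for a very elongated ellipse.
    anisotropy = math.sqrt(1.0 - (len2 * len2) / (len1 * len1))
    # Thickness is the length of the minor semi-axis.
    thickness = len2
    return orientation, anisotropy, thickness


if __name__ == "__main__":
    print(tensor_scale_factors((5.0, 0.0), (0.0, 3.0)))
```

For an ellipse with semi-axes of length 5 and 3, the anisotropy is √(1 − 9/25) = 0.8 and the thickness is 3.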

TSS: Tensor scale sector

A single circular region around the object is considered, which is divided into sectors within concentric bands (Fig. 6). Tensor scale information is then computed only inside this circular region. However, the orientations of ellipses centered at pixels close to the external circular border follow its round shape, and therefore carry no relevant information about the shape being analyzed. For this reason, these ellipses are disregarded in this work (Fig. 5).

Fig. 6a shows an example grid using four
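The division of the circular region into sectors within concentric bands can be sketched as a mapping from a pixel to a (band, sector) pair. This is only an illustration under stated assumptions: equal-width bands, the same number of sectors in every band, and the function name and parameters are all hypothetical. It does not implement the removal of ellipses near the outer border, whose criterion is not given in this excerpt.

```python
import math


def sector_index(x, y, cx, cy, radius, n_bands, n_sectors):
    """Map pixel (x, y) to a (band, sector) pair in a polar grid.

    (cx, cy) is the grid centre and radius its outer limit. Returns None
    for pixels on or outside the outer circle. Equal-width bands and a
    fixed sector count per band are assumptions of this sketch.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r >= radius:
        return None
    # Radial position selects the concentric band (0 is the innermost).
    band = int(r / radius * n_bands)
    # Polar angle in [0, 2*pi) selects the sector inside that band.
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    # The final modulo guards against angle rounding up to exactly 2*pi.
    sector = int(angle / (2.0 * math.pi) * n_sectors) % n_sectors
    return band, sector


if __name__ == "__main__":
    print(sector_index(-1, 3, 0, 0, 20, 4, 8))
```

A descriptor in the spirit of TSS would then accumulate orientation statistics separately for each (band, sector) cell.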

TSB: Tensor scale band descriptor

The idea of TSB is to sacrifice the angular displacement of the sectors in order to obtain a faster distance function. In contrast to TSS, the spatial information in TSB is incorporated via the radial displacement only, by taking one normalized histogram with 60 bins per radial band, capturing the relative angular distribution of γ. Due to the high coverage area of the radial bands, the angular distribution in the form of a histogram is more appropriate than considering only the
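The per-band histograms described above can be sketched as follows. The 60 bins per band come from the text; everything else is an assumption made for illustration: γ is taken in degrees over [0, 360), bands have equal width, and the L1 distance stands in for the paper's distance function, which is not given in this excerpt. It is used here only because it runs in time linear in the descriptor length.

```python
N_BINS = 60  # bins per radial band, as stated in the text


def tsb_histograms(samples, radius, n_bands):
    """Build one normalized 60-bin histogram of gamma per radial band.

    samples: iterable of (r, gamma_deg) pairs, where r is the distance of a
    pixel to the centre and gamma_deg its relative orientation in degrees.
    Both the input format and the equal-width bands are assumptions.
    """
    hists = [[0.0] * N_BINS for _ in range(n_bands)]
    for r, gamma_deg in samples:
        if r >= radius:
            continue  # outside the circular region
        band = int(r / radius * n_bands)
        bin_idx = int((gamma_deg % 360.0) / 360.0 * N_BINS) % N_BINS
        hists[band][bin_idx] += 1.0
    # Normalize each band so that its bins sum to 1 (empty bands stay zero).
    for hist in hists:
        total = sum(hist)
        if total > 0.0:
            for i in range(N_BINS):
                hist[i] /= total
    return hists


def l1_distance(desc_a, desc_b):
    """Sum of absolute bin differences: one pass over the descriptor."""
    return sum(
        abs(a - b)
        for hist_a, hist_b in zip(desc_a, desc_b)
        for a, b in zip(hist_a, hist_b)
    )


if __name__ == "__main__":
    d = tsb_histograms([(1.0, 3.0), (1.0, 9.0), (6.0, 355.0)], 10.0, 2)
    print(l1_distance(d, d))
```

With a fixed number of bands, the descriptor length is constant, so comparing two shapes costs a fixed number of operations regardless of image size.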

Computational complexity

In the context of CBIR, the computational cost consists of two parts: (i) computing the feature vector; (ii) computing the shape dissimilarity with a distance function. On large collections, the latter is more important, as the distance must be determined between the query shape and every shape in the collection, while the descriptors of all shapes are computed and stored beforehand. To perform a new search, only the descriptor of the query shape must be calculated, so for a fast
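The two-part cost described above can be made concrete with a small sketch of the search step. It assumes that the collection's descriptors are already stored in a dictionary and that the distance function is passed in as a callable. All names are hypothetical and the toy distance is not the paper's.

```python
def rank_collection(query_desc, collection, distance):
    """Return shape ids sorted by increasing distance to the query.

    collection: dict mapping shape id -> precomputed descriptor.
    The distance function is evaluated once per stored shape, so its cost
    is multiplied by the collection size.
    """
    scored = [
        (distance(query_desc, desc), shape_id)
        for shape_id, desc in collection.items()
    ]
    scored.sort()  # ties are broken by shape id
    return [shape_id for _, shape_id in scored]


def toy_l1(a, b):
    """Illustrative linear-time distance between fixed-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    stored = {"a": [0.0, 1.0], "b": [1.0, 0.0], "c": [0.1, 0.9]}
    print(rank_collection([0.0, 1.0], stored, toy_l1))
```

For N stored shapes and a distance costing O(d) on descriptors of length d, one query costs O(N·d) for the distances plus O(N log N) for the sort. This is why a distance that is linear in d matters more than extraction time on large collections.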

Experimental results

The TSS and TSB descriptors were compared against commonly used shape descriptors. We evaluated the proposed descriptors using the MPEG-7 CE-Shape-1 (part B) dataset available from [19] and the MNIST dataset from [22]. The MPEG-7 dataset contains 1400 shape images distributed across 70 classes, where each class contains 20 shapes with various rigid and non-rigid transformations, noise and change of viewpoint. The MNIST database consists of 10,000 images of handwritten digits and is commonly used
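This excerpt does not state which retrieval measure is reported. A measure commonly used with MPEG-7 CE-Shape-1 (part B) is the bull's-eye score: each shape is used as a query, the 2 × 20 = 40 nearest shapes are retrieved, and the shapes from the query's class among them are counted. The sketch below assumes that measure and a precomputed distance matrix. The function name and inputs are hypothetical.

```python
def bulls_eye_score(labels, distance_matrix, class_size=20):
    """Fraction of same-class shapes found in the top 2*class_size results.

    labels[i] is the class of shape i and distance_matrix[i][j] the
    dissimilarity between shapes i and j. The query itself is included in
    the ranking, as is usual for this measure.
    """
    n = len(labels)
    top = 2 * class_size
    hits = 0
    for q in range(n):
        # Rank every shape by its distance to the query (stable sort).
        order = sorted(range(n), key=lambda j: distance_matrix[q][j])
        hits += sum(1 for j in order[:top] if labels[j] == labels[q])
    # The best possible total is class_size hits per query.
    return hits / (n * class_size)


if __name__ == "__main__":
    positions = [0, 1, 10, 11, 20, 21]
    labels = [0, 0, 1, 1, 2, 2]
    dist = [[abs(p - q) for q in positions] for p in positions]
    print(bulls_eye_score(labels, dist, class_size=2))
```

On the full MPEG-7 set, this would mean 1400 queries with at most 20 hits each.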

Conclusions

We presented two novel shape descriptors for CBIR, which are not limited to a particular topology and may be easily tailored for different applications. Their features extracted from circular sectors could be used to build a shape vocabulary according to the BoW paradigm to obtain a learning-based method, and could also be combined with other features.

The evaluated TSS-41 is a compact, fast and effective descriptor for CBIR applications, and outperformed other relevant shape descriptors according

Acknowledgments

The authors thank CNPq (305381/2012-1, 486083/2013-6, FINEP 1266/13), FAPESP grant #2011/50761-2, CAPES, and NAP eScience - PRP - USP.

References (54)

  • T.B. Sebastian et al.

    Curves vs. skeletons in object recognition

    Signal Process.

    (2005)
  • R.S. Torres et al.

    A digital library framework for biodiversity information systems

    Int. J. Digit. Libr.

    (2006)
  • C. Wei et al.

    Trademark image retrieval using synthetic features for describing global shape and interior structure

    Pattern Recognit.

    (2009)
  • Z. Xu et al.

    Tensor scale: an analytic approach with efficient computation and applications

    Comput. Vis. Image Underst.

    (2012)
  • Z. Xu et al.

    Recent improvements in tensor scale computation and its applications to medical imaging

    (2009)
  • X. Yang et al.

    Locally constrained diffusion process on locally densified distance spaces with applications to shape retrieval

    2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009

    (2009)
  • T. Adamek et al.

    A multiscale representation method for nonrigid shapes with a single closed contour

    Circuits Syst. Video Technol. IEEE Trans.

    (2004)
  • F.A. Andaló et al.

    Detecting contour saliences using tensor scale

    IEEE International Conference on Image Processing

    (2007)
  • F.A. Andaló et al.

    A new shape descriptor based on Tensor Scale

    Mathematical Morphology and its Applications to Signal and Image Processing (International Symposium on Mathematical Morphology)

    (2007)
  • X. Bai et al.

    Path similarity skeleton graph matching

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2008)
  • X. Bai et al.

    Shape vocabulary: a robust and efficient shape representation for shape matching

    IEEE Trans. Image Process.

    (2014)
  • X. Bai et al.

    Learning context-sensitive shape similarity by graph transduction

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2010)
  • S. Belongie et al.

    Shape matching and object recognition using shape contexts

    Pattern Anal. Mach. Intell. IEEE Trans.

    (2002)
  • G. Csurka et al.

    Visual categorization with bags of keypoints

    Workshop on Statistical Learning in Computer Vision, ECCV

    (2004)
  • I. Dryden et al.

    Statistical Shape Analysis

    (1998)
  • A.X. Falcão et al.

    The image foresting transform: theory, algorithms, and applications

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2004)
  • F. Fotopoulou et al.

    Multivariate angle scale descriptor of shape retrieval

    Proceeding on Signal Processing and Applied Mathematics for Electronics and Communication

    (2011)

    This paper has been recommended for acceptance by Xiang Bai.