A comparison of interest point and region detectors on structured, range and texture images☆
Introduction
Local features in computer vision outperform global features in the presence of large viewpoint changes, clutter and occlusion [1]. Several local detectors and descriptors have been introduced in the literature, and problems such as object recognition have been addressed to a fairly good extent. However, performance depends on the application at hand. The algorithms, by and large, address general computer vision data, i.e. 2D color or grayscale images of scenes containing either structured objects (man-made objects composed of simpler geometrical shapes) or a combination of structured and natural objects, usually undergoing rigid transformations.
Images containing slightly different forms of data, such as 3D or depth images in which color or texture is absent and the object’s shape is what defines all the shades, may pose a different challenge [2], [3], [4]. Another case is that of objects with smooth boundaries, for which conventional interest point detection does not work very well [5]. In a textured scene, the concepts of an interest point, a rigid object or a transformation may be less applicable. Even more challenging are images of deformable objects, for example outdoor images of plants with several leaves under the effects of biological growth, wind and light. In such cases, distinctive shape features can be obtained from edges and the shapes of their contours.
In a typical feature extraction process, interest points that lie along an edge are usually ignored because they offer low repeatability [8]. On the other hand, edges are high-gradient neighborhoods and carry important information regarding the shape of objects [9]. In the absence of color and texture, as is the case with 3D or depth data, edges become increasingly important. For this reason, edge contours have been successfully used in object classification. Mikolajczyk et al. [9] used edge points with a Laplacian-based measure for edge support selection to incorporate scale invariance. Shotton et al. [10] randomly selected a few edge fragments and pruned them by intensity-based clustering, forming a codebook of contours. Ferrari et al. [11] used chains of connected contour segments for object class detection. A more common approach is to use sampled edge points [12]. The concept of Bag-of-Boundaries (BoB) introduced by Arandjelovic and Zisserman [5] matched post-segmentation boundaries for the retrieval of objects with low surface texture and smooth boundaries. For a machine vision system, the worst deformations are projective transformations, and developing robustness to them cannot be error free. Edge sampling, unfortunately, does not provide invariance to such deformations. The best compromise can be reached with affine invariant regions.
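The edge sampling baseline mentioned above can be sketched in a few lines. The following toy NumPy example (the function name, threshold and sampling step are our own illustrative choices, not taken from [12]) thresholds the gradient magnitude and keeps every third edge pixel as a feature point:

```python
import numpy as np

def sampled_edge_points(img, thresh=20.0, step=3):
    """Toy edge sampling: threshold the gradient magnitude to get
    edge pixels, then keep every step-th one as a feature point.
    Note there is no invariance to projective deformation here."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ys, xs = np.nonzero(mag >= thresh)
    pts = np.stack([xs, ys], axis=1)
    return pts[::step]

# a bright square on a dark background: edge pixels lie on its border
img = np.zeros((32, 32))
img[8:24, 8:24] = 100.0
pts = sampled_edge_points(img)
```

All sampled points land on the square's boundary band; under a perspective change of the same scene, the sampled set would shift, which is exactly the weakness noted above.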
Affine invariant regions provide partial invariance to projective transformations [13], [14]. Different affine region detectors target different shape features of an object: Maximally Stable Extremal Regions (MSER) [15] seek the bounds of homogeneous stable regions following a watershed approach, while Edge Based Regions (EBR) [14] take seed points from Harris corners and follow the edges along the two directions of the corner to trace the boundary.
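The stability criterion that MSER builds on can be illustrated with a minimal sketch. This is a simplification for intuition only: the published algorithm tracks all extremal components efficiently during a single watershed-like sweep, whereas the hypothetical helper below brute-forces one component per threshold.

```python
import numpy as np
from scipy import ndimage

def mser_stability(gray, delta=2):
    """Toy illustration of the MSER stability criterion: for each
    intensity threshold t, measure the area of the largest dark
    connected component and compute its relative area change over a
    span of delta thresholds. Slowly changing (stable) thresholds
    mark MSER candidates."""
    areas = {}
    for t in range(256):
        labels, n = ndimage.label(gray <= t)
        areas[t] = np.bincount(labels.ravel())[1:].max() if n else 0
    stability = {}
    for t in range(delta, 256 - delta):
        if areas[t] > 0:
            stability[t] = (areas[t + delta] - areas[t - delta]) / areas[t]
    return areas, stability

# synthetic image: a dark square (intensity 40) on a bright background (200)
img = np.full((32, 32), 200, dtype=np.uint8)
img[8:24, 8:24] = 40
areas, stability = mser_stability(img)
# between t = 40 and t = 199 the square's area stays fixed at 256 pixels,
# so its stability score is 0 over that whole range
```

The region that keeps the same area across many thresholds (the square) is exactly what MSER would report as maximally stable.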
The performance of affine regions has been remarkably good [16], and many recent algorithms perform slightly better, such as the Wαsh detector [17], the Medial Feature Detector (MFD) [18] and Boundary Preserving dense Local Regions (BPLR) [19]. Mikolajczyk and Schmid [13] proposed Harris Affine and Hessian Affine regions, which originate from Harris and Hessian corners, respectively, and occupy the affine covariant neighborhood in a multi-scale manner. They therefore tend to partially capture edges, as shown in Fig. 1(g) and (i), which can be seen as another way of including edge fragments with robustness against affine deformations. But they do not take the edge contours into account, which limits the information content related to their shape (Fig. 1 shows some of the commonly used affine and scale invariant detectors on a plant image).
The reason is that, in the process of finding a neighborhood covariant to affine transformations, the region (the distinguished region [15], [16]) is iteratively mapped onto an ellipse (the measurement region). In this process, the shape of the boundary contributing to the distinguished region can be lost, at least partially (Fig. 1(g) and (i)). Therefore, as argued in [9], invariance to affine geometric or photometric deformations comes at the cost of the information content of the local features. If a fragment of the edge or boundary which holds a clue to its geometry is enclosed in the ellipse representing the measurement region, the extracted descriptor may also include features for the edge shape, but this is more a matter of chance than of design.
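The mapping onto a measurement ellipse can be made concrete: an ellipse fitted to a region's second moments keeps only the centroid and covariance, so concave boundary detail is necessarily discarded. The sketch below is our own illustrative code, not the detectors' implementation, and the 2-sigma axis scaling is an arbitrary choice:

```python
import numpy as np

def region_to_ellipse(mask):
    """Fit a 'measurement ellipse' to a binary region via its second
    moments: the ellipse shares the region's centroid and covariance.
    This is all the shape information an affine-covariant mapping
    retains -- finer contour detail is lost."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)
    cov = np.cov(pts.T)                   # 2x2 second-moment matrix
    evals, evecs = np.linalg.eigh(cov)    # ascending eigenvalues
    axes = 2.0 * np.sqrt(evals)           # semi-axes at 2 sigma
    angle = np.degrees(np.arctan2(evecs[1, 1], evecs[0, 1]))
    return centroid, axes, angle

# an L-shaped region: its concave corner cannot survive the ellipse fit
mask = np.zeros((20, 20), dtype=bool)
mask[2:18, 2:6] = True     # vertical bar
mask[14:18, 2:18] = True   # horizontal bar
centroid, axes, angle = region_to_ellipse(mask)
```

Whatever descriptor is computed inside this ellipse sees the L-shape's corner only if it happens to fall within the fitted support, echoing the "chance rather than design" point above.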
The non-affine counterparts, Harris Laplace and Hessian Laplace (Fig. 1(f) and (h)), avoid affine adaptation but are limited to corners. Both start from corners and search for an extremum of the Laplacian, which signifies the optimal scale, indirectly including some part of the edges; hence the subsequent descriptor partly captures edge shapes. Mikolajczyk et al. [16], in their comparison of affine regions, point towards the need for developing affine regions for object boundaries.
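The Laplacian-based scale selection used by these detectors can be sketched as follows. This is an illustrative reimplementation of the scale-normalized Laplacian-of-Gaussian extremum idea, not the authors' code; the blob image and scale range are our own choices:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def characteristic_scale(img, point, sigmas):
    """At a fixed point, evaluate the scale-normalized Laplacian
    sigma^2 * LoG over a range of scales and return the sigma whose
    response magnitude is largest -- the 'characteristic scale' that
    Harris-Laplace / Hessian-Laplace use to size the region."""
    y, x = point
    responses = [abs(s**2 * gaussian_laplace(img, s)[y, x]) for s in sigmas]
    return sigmas[int(np.argmax(responses))]

# a bright Gaussian blob of width 6 centered in a 64x64 image
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((xx - 32)**2 + (yy - 32)**2) / (2 * 6.0**2))
sigmas = np.arange(2.0, 12.0, 0.5)
best = characteristic_scale(img, (32, 32), sigmas)
# for a Gaussian blob of width 6, the extremum lies near sigma = 6
```

For a Gaussian blob of width σb, the normalized response at the center is proportional to σ²/(σb² + σ²)², which peaks at σ = σb, so the selected scale matches the blob size.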
The local feature evaluations in the literature usually focus on either a certain type of detector [16], descriptors [20], or data such as 3D [4]. In this article, our objective is to assess the complexity of the features with respect to that of the data. We evaluate multi-scale, scale invariant and affine invariant detectors on three publicly available datasets comprising building structures, depth images and textures, with a number of descriptors for both shape and texture. These data and feature types are chosen so that a comprehensive comparison can be established, helping an application oriented reader understand the pros and cons of each choice.
The end goal of our research is to prepare tools for plant leaf recognition, for which edge shapes are an important discriminant. Since affine or scale invariant detectors do not exclusively focus on edge shapes, we also elaborate the details of a graph based method for multi-scale interest region detection introduced in [21], followed by its thorough evaluation. The idea is to avoid affine adaptation and focus on pure edge shapes, similar to edge sampling but in a multi-scale manner. The underlying hypothesis is that this is a better alternative to edge sampling, since a multi-scale support approximates affine transformations locally [9].
The organization of this article is as follows. The scheme of the novel edge shape detector is described in Section 2. Evaluation of the proposed detector as well as the selected interest points and affine invariant regions is presented in Section 3, with repeatability tests in Section 3.1 and image retrieval tests on the Zurich building database (Section 3.2.1), the Stuttgart range image database (Section 3.2.2) and the Normalized Brodatz texture database (Section 3.2.3). Section 4 concludes the paper.
Multi-scale edge shape detector
To develop an edge shape detector, we do not use edge maps. Instead, a novel approach using graph based image decomposition is introduced. It builds on the fact that edges produce high gradients in their immediate neighborhood and will therefore leave distinguishable patterns in the process of image decomposition. The idea is loosely based on the Kadir–Brady saliency detector (KBS): the salient region detector introduced in [22] finds regions exhibiting distinguishable variations of entropy over scales.
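The entropy-over-scales idea behind KBS can be sketched with a toy version using circular windows and histogram entropy (the actual detector additionally weights entropy by the inter-scale change of the local histogram, which is omitted here; all names below are our own):

```python
import numpy as np

def entropy(values, bins=256):
    """Shannon entropy (bits) of the intensity histogram of a patch."""
    hist, _ = np.histogram(values, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def salient_scale(img, center, radii):
    """Toy Kadir-Brady-style scale selection: the salient scale at a
    point is the radius whose circular neighborhood maximizes the
    entropy of the local intensity histogram."""
    y, x = center
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    d2 = (yy - y) ** 2 + (xx - x) ** 2
    scores = [entropy(img[d2 <= r * r]) for r in radii]
    return radii[int(np.argmax(scores))]

# flat background with a textured disc of radius 5: entropy grows while
# the window is inside the disc, then the flat background dilutes it
img = np.full((41, 41), 128.0)
yy, xx = np.mgrid[0:41, 0:41]
disc = (yy - 20) ** 2 + (xx - 20) ** 2 <= 25
img[disc] = np.linspace(0.0, 255.0, disc.sum())
best = salient_scale(img, (20, 20), list(range(2, 11)))
# the entropy peak coincides with the disc radius: best == 5
```

The selected radius locks onto the structure's natural size, which is the property our detector borrows, applied to the patterns that edges leave in the graph based decomposition rather than to raw intensities.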
Evaluations and discussions
TLR’s repeatability and matching potential were evaluated and compared with the affine region detectors reported in [16]. Afterwards, image retrieval and classification tests were performed on three publicly available datasets.
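The repeatability protocol of [16] compares detections across views related by a known homography. A simplified point-based sketch could look like the following (the original protocol uses the overlap error of region ellipses rather than a distance tolerance, and the function name and tolerance are our own):

```python
import numpy as np

def repeatability(pts_a, pts_b, H, tol=2.0):
    """Toy repeatability in the spirit of the Mikolajczyk et al.
    protocol: project detections from image A into image B with the
    ground-truth homography H, count those with a detection within
    tol pixels, and divide by the smaller detection count."""
    ones = np.ones((len(pts_a), 1))
    proj = np.hstack([pts_a, ones]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]          # back from homogeneous
    matched = sum(
        1 for p in proj
        if np.linalg.norm(pts_b - p, axis=1).min() <= tol
    )
    return matched / min(len(pts_a), len(pts_b))

# a pure-translation homography: shift by (10, 5)
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 1.0]])
pts_a = np.array([[0.0, 0.0], [4.0, 4.0], [8.0, 2.0], [1.0, 7.0]])
pts_b = np.array([[10.0, 5.0], [14.0, 9.0], [18.0, 7.0], [30.0, 30.0]])
score = repeatability(pts_a, pts_b, H)   # 3 of 4 re-detected -> 0.75
```

A detector scores high when the same physical structures are found in both views, regardless of whether its descriptors would match them, which is why matching potential is evaluated separately.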
Conclusions
In this article a comprehensive evaluation of several affine and scale invariant detectors was conducted. Furthermore, a novel multi-scale edge shape detector, the Twin Leaf Regions (TLR), was elaborated. It detects multi-scale regions around boundaries/edges using the outliers of a graph based compression algorithm. TLR was evaluated for repeatability, matching and image retrieval potential. In order to reduce the offset from the edges, affine adaptation was avoided. On images containing
Acknowledgments
This research was supported by the Danish Council for Strategic Research under ASETA project (www.aseta.dk), Grant No. 09-067027. Special thanks to late Dr. Nikolai Chernov, Department of Mathematics, University of Alabama at Birmingham, USA and Dr. Richard Brown, Massey University, New Zealand for their help and cooperation.
References (43)
- et al., Verification of color vegetation indices for automated crop imaging applications, Comput. Electron. Agric. (2008)
- et al., Speeded-up robust features (SURF), Comput. Vision Image Underst. (2008)
- et al., Local invariant feature detectors: a survey, Found. Trends Comput. Graph. Vision (2007)
- B. Steder, R. Rusu, K. Konolige, W. Burgard, NARF: 3D range image features for object recognition, in: Workshop on...
- et al., On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes, Int. J. Comput. Vision (2010)
- et al., Performance evaluation of 3D keypoint detectors, Int. J. Comput. Vision (2013)
- R. Arandjelovic, A. Zisserman, Smooth object retrieval using a bag of boundaries, in: International Conference on...
- A threshold selection method from gray-level histograms, IEEE Trans. Syst., Man Cybernet. (1979)
- D. Lowe, Object recognition from local scale-invariant features, in: The Proceedings of the Computer Vision, vol. 2, ...
- K. Mikolajczyk, A. Zisserman, C. Schmid, Shape recognition with edge-based features, in: Proceedings of the British...
- Multiscale categorical object recognition using contour fragments, IEEE Trans. Pattern Anal. Mach. Intell.
- Groups of adjacent contour segments for object detection, IEEE Trans. Pattern Anal. Mach. Intell.
- Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Anal. Mach. Intell.
- Scale and affine invariant interest point detectors, Int. J. Comput. Vision
- Matching widely separated views based on affine invariant regions, Int. J. Comput. Vision
- Robust wide-baseline stereo from maximally stable extremal regions
- A comparison of affine region detectors, Int. J. Comput. Vision
- Boundary preserving dense local regions
- A performance evaluation of local descriptors, IEEE Trans. Pattern Anal. Mach. Intell.
☆ This paper has been recommended for acceptance by Prof. M.T. Sun.