Using image signatures for place recognition
Introduction
A central problem for a mobile robot is localization: determining its location from sensed data. When a complete model of the environment is not available a priori, this problem can be quite difficult. In the topological mapping paradigm (Kuipers and Byun, 1988; Levitt et al., 1987), the environment is discretized into a number of distinct `places', so that the world can be represented as a graph whose nodes are places and whose arcs represent possible movements between them. This simplifies localization, as the problem reduces to recognizing which place, of a given set, the robot currently occupies. A number of methods have been proposed for place recognition, including certainty grids (Elfes, 1987; Yamauchi and Langley, 1997) and local geometry from vision (Braunegg, 1990; Taylor and Kriegman, 1992). Both types of methods are computationally intensive and can be sensitive to even small changes in the environment. In this paper we empirically evaluate another approach, image matching, in which recognition is performed by matching 2D representations of input images. The method we use is based on image signatures: coarse-scale representations of image properties such as edge direction or image homogeneity. An image signature is an array of values, each derived by measuring an image property over a portion of the input image (as depicted in Fig. 1). Computing and matching image signatures is efficient and insensitive to small changes in the environment.
Similar image matching approaches have previously been applied to robotic and other recognition tasks. For example, Horswill (1993) used coarse grey-scale images for place recognition for a mobile robot, using strong geometric constraints. 2D image matching has also been used extensively for non-robotic recognition tasks. In particular, grey-scale correlation of image templates has been used for face recognition (Brunelli and Poggio, 1993), iris recognition (Wildes et al., 1996), and finding volcanoes in images of Venus (Wiles and Forshaw, 1993). Grey-scale correlation works well in such cases, where well-defined templates (models of image regions) exist.
A different set of techniques comprises the feature-based approaches, in which a set of feature values is extracted from an input image and the resultant feature vector is matched against a set of model feature vectors. Feature-based recognition has been applied to 2D object recognition (Crowley and Sanderson, 1987; Sethi and Ramesh, 1992; Beveridge and Riseman, 1997). Similar approaches have also been applied to the problem of image retrieval from multimedia databases (Jin et al., 1996; Manjunath and Ma, 1996; Schmid and Mohr, 1997). Color histograms have also been used successfully for image retrieval (Swain, 1990; Mehtre et al., 1995).
Our method of using image signatures incorporates some of the benefits of both image-based and feature-based approaches. The approach is based on Nelson's (1989) use of directional image signatures for stimulus-response robotic navigation. We extend this to include signatures for several different types of image measurements, including edge direction and strength, as well as texture-based measures. By using feature-based measures, rather than raw grey-scale values, we expect to attain more robust recognition in the face of illumination or small environmental changes. On the other hand, by computing measures over large subimages, rather than individual features, we expect recognition to be insensitive to larger environmental changes, such as moved objects and the like.
An approach similar to that described in this paper was used for fingerprint recognition by Coetzee and Botha (1993). There, the authors used a type of image signature computed from a skeletonized image of a fingerprint, and applied a variety of matching techniques for recognizing fingerprints. The task of place recognition differs from fingerprint recognition, however, in that images of the world are more visually complex than fingerprints, while on the other hand the differences between fingerprints may be very subtle. In addition, the place recognition method described in this paper also explicitly considers the possibility of changes in viewer position at the same location.
This paper is organized as follows. In the following section, we describe how image signatures are computed, including a description of several measurement functions that we use, and how signatures are compared for similarity. We also describe a conjunctive matching scheme, in which several measurement functions are used in tandem for recognition. We develop measures for evaluating the quality of different measurement functions, and combinations thereof, when developing a recognition system based on image signatures. Section 2 describes a series of experiments that demonstrate how the method of image signatures is useful for place recognition. We conclude that the method of image signatures is useful for matching complex images, particularly when several measurement functions are used together.
Image signatures
We define an image signature as an array of measurement values, each value derived from a portion of the original image. The values are derived by applying measurement functions to subimages. Each measurement function computes an estimate of a coarse-scale image property, such as dominant edge direction, or degree of texturedness. The process of signature computation and matching is as follows (see Fig. 2).
First, the input image I is tessellated on a regular n×n grid, giving subimages Ii,j (0⩽i,j&lt;n).
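The tessellation-and-measurement process above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the measurement function shown (mean absolute horizontal gradient, a crude edge-strength estimate) and the function names are assumptions standing in for the paper's actual measurement functions (e.g. DEO, SGD, ESTR).

```python
import numpy as np

def image_signature(image, n=8, measure=None):
    """Compute an n-by-n signature by applying a measurement
    function to each cell of a regular grid over the image."""
    if measure is None:
        # Illustrative measurement only: mean absolute horizontal
        # gradient, a crude edge-strength estimate.
        measure = lambda sub: np.abs(np.diff(sub, axis=1)).mean()
    h, w = image.shape
    sig = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Subimage I_{i,j} of the regular n-by-n tessellation.
            sub = image[i * h // n:(i + 1) * h // n,
                        j * w // n:(j + 1) * w // n]
            sig[i, j] = measure(sub)
    return sig

def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two signatures; small
    distances indicate similar views."""
    return np.abs(sig_a - sig_b).mean()
```

A stored place model would then simply be a signature (or several, one per measurement function), compared against the signature of the current view.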
Experimental results
We conducted experiments on signature matching using the measurement functions described above. Images with 256 grey levels were taken by a CCD camera with a 25 mm lens, giving a field-of-view of about 12°. We reduced the source images using a Gaussian filter to images of 64×48 pixels, on which signatures were computed. Except as otherwise noted below, system parameters were as follows. Signatures were 8×8 grids. In conjunctive matching, no mismatches were allowed.
Matching thresholds were set
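The conjunctive matching used in these experiments (no mismatches allowed) can be sketched as follows. This is an assumed formulation, not the authors' code: a probe view matches a stored place model only if, for every measurement function, the corresponding signatures agree to within that function's threshold. The dictionary keys and threshold values are hypothetical.

```python
import numpy as np

def conjunctive_match(probe_sigs, model_sigs, thresholds):
    """Return True iff the probe matches the model under every
    measurement function simultaneously (zero mismatches allowed)."""
    for name, probe in probe_sigs.items():
        # Mean absolute difference between corresponding signatures.
        dist = np.abs(probe - model_sigs[name]).mean()
        if dist > thresholds[name]:
            return False  # a single mismatching function rejects the match
    return True
```

Allowing k mismatches, rather than none, would amount to counting failed functions and rejecting only when the count exceeds k.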
Conclusions
This paper describes an image-based method for robot place recognition based on image signatures. The advantages of the method are its simplicity, efficiency and modularity. We experimented with several different functions for deriving signatures from images, finding that functions based on gradient direction (DEO, SGD) or strength (ESTR) tended to work best. Combining functions using conjunctive filtering, however, was found to improve recognition significantly. The results of this study
Acknowledgements
Thanks to Drew McDermott and Michael Beetz for many helpful discussions during the course of this research. Also thanks to Ido Dagan, Gerhardt Kraetzschmar, and the anonymous reviewers for their helpful comments on earlier drafts of this paper. This work was partially performed while the author was supported by a fellowship from the Fannie and John Hertz Foundation.
References
- Argamon-Engelson, S. Finding recognizable views using image signatures. Under...
- Beveridge, J.R., Riseman, E.M., 1997. How easy is matching 2D line models using local search? IEEE Trans. Pattern Anal. Mach. Intell.
- Braunegg, D.J., 1990. MARVEL: A System for Recognizing World Locations with Stereo Vision. Ph.D. Thesis,...
- Brunelli, R., Poggio, T., 1993. Face recognition: Features versus templates. IEEE Trans. Pattern Anal. Mach. Intell.
- Coetzee, L., Botha, E.C., 1993. Fingerprint recognition in low quality images. Pattern Recognition 26...
- Crowley, J.L., Sanderson, A.C., 1987. Multiple resolution representation and probabilistic matching of 2-D gray-scale shape. IEEE Trans. Pattern Anal. Mach. Intell.
- Elfes, A., 1987. Sonar-based real-world mapping and navigation. IEEE J. Robot. Automat.
- Horswill, I., 1993. Specialization of perceptual processes. Ph.D. Thesis, MIT Artificial Intelligence...
- Jin, J.S., et al., 1996. A scheme for intelligent image retrieval in multimedia databases. J. Visual Commun. Image Representation.
- Kuipers, B., Byun, Y.-T., 1988. A robust qualitative method for robot spatial reasoning. In: Proceedings of National...
1. Tel.: 972 3 531 8407; fax: 972 3 535 3325; e-mail: [email protected].