Canonical subsets of image features
Introduction
Many computer vision tasks include feature detection as an integral step. The features detected in an image of an object form an abstract representation of the object which can be used by higher level vision processes such as localization and recognition. Recent work in feature detection has focused on methods that can extract large numbers of rich features from a given image. One consequence of this trend has been an increase in the size of an object’s representation. Nevertheless, there are applications where reducing the size of an object’s representation is desirable, as shown by the work of Heisele et al. [1], Sivic and Zisserman [2], and Sun et al. [3].
Consider the case where hundreds of thousands of object model representations reside in a database. The problem is to determine which of the objects are present in input images and to localize them. Such applications are ripe for parallel processing or even custom hardware solutions. In these cases, the size of an object’s representation will become critical to the overall system performance or even its feasibility.
The size of an object representation is a combination of the number of features and the feature descriptor size. The size of the representation can be made smaller by reducing the size of the feature descriptor, by reducing the number of features, or both. Recent work by Ke and Sukthankar [4] has shown that feature descriptors can be reduced in size. Johnson and Hebert [5] also addressed the problem of descriptor compression and demonstrated that spin images could be compressed with principal component analysis. Our research focus, which is complementary to theirs, is on reducing the number of features.
We hypothesize that some of the detected features are more representative than others, and that a subset of detected features is sufficient for vision tasks like object localization. In theory, for determining 3D positions and orientations of objects in 2D images, three disjoint co-visible features per object are sufficient [6], [7]. The number of necessary features is difficult to determine and depends on the specific problem instance and features. For example, some features, such as those described by Lowe [8], encode orientation and scale, which may allow localization with the matching of a single feature. In practice, however, more features are required since portions of the object in the target scene may be occluded or the data might be noisy. More importantly, more features are required because it is desirable to over-constrain the pose estimation problem so that accurate localization results can be obtained.
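To make the single-feature case concrete, the following is a minimal sketch of how one matched keypoint that carries scale and orientation can fix a 2D similarity transform (scale, rotation, translation). The keypoint layout `(x, y, scale, angle)` and the function names are assumptions chosen for illustration, not taken from the paper or from any particular detector.

```python
import math

def similarity_from_match(model_kp, scene_kp):
    """Estimate (s, theta, tx, ty) mapping model coordinates to scene coordinates.

    Each keypoint is a hypothetical tuple (x, y, scale, angle_in_radians).
    """
    mx, my, m_scale, m_angle = model_kp
    sx, sy, s_scale, s_angle = scene_kp
    s = s_scale / m_scale          # relative scale between the two keypoints
    theta = s_angle - m_angle      # relative rotation
    c, n = math.cos(theta), math.sin(theta)
    # Translation is chosen so the model keypoint lands on the scene keypoint.
    tx = sx - s * (c * mx - n * my)
    ty = sy - s * (n * mx + c * my)
    return s, theta, tx, ty

def apply_similarity(params, point):
    """Map a model point into the scene with the estimated transform."""
    s, theta, tx, ty = params
    x, y = point
    c, n = math.cos(theta), math.sin(theta)
    return (s * (c * x - n * y) + tx, s * (n * x + c * y) + ty)

# Usage: one match is enough to place every other model point in the scene.
params = similarity_from_match((10.0, 20.0, 2.0, 0.0), (100.0, 50.0, 4.0, math.pi / 2))
corner_in_scene = apply_similarity(params, (11.0, 20.0))
```

A single match leaves no redundancy, so any error in that keypoint's scale or angle propagates directly into the pose, which is one reason more matches are preferred in practice.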
Rather than determining how many features are necessary or sufficient, we focus on how to best select a subset of image features given a redundant set. Our goal is to select a subset that is highly representative of the original feature set. In order to be robust in the presence of occlusion, a representative subset should be spatially distributed across all regions of the input image. Features from all regions of an object are desirable since it is not possible to predict which regions will be occluded. At the same time, the subset should consist of features that are similar to those not in the subset, i.e., the subset should represent the original set well. The features within the selected subset should also be as dissimilar from one another as possible, as there is no point in including several similar features from a homogeneous region. To be robust in the presence of noise, the subset selection algorithm should be able to take into account the relative stability of a feature; that is, including more stable (repeatable) features in the representative subset will increase the robustness of the subset to noise [9].
One way of selecting a subset of features is to cluster the features and then select those features nearest the cluster centroids as representatives. Our view is that subset selection is a problem that is distinctly different from the clustering problem. Clustering seeks to partition the input set into groups of similar elements. Clustering methods such as K-Means [10] use distance measures to group similar elements. Conflicting goals must be encoded into the distance measure and global constraints are difficult to enforce because of the localized nature of the algorithms. That these concerns are difficult to remedy is not surprising, since the algorithms were not designed for subset selection.
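The clustering-based baseline described above can be sketched as follows. This is a plain-Python version of Lloyd's K-Means followed by a nearest-to-centroid pick; the function names, the iteration count, and the seeding are illustrative assumptions, not the configuration used in the paper's experiments.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_representatives(features, k, iters=20, seed=0):
    """Cluster features with Lloyd's algorithm and return the indices of the
    features nearest the final centroids (sorted, duplicates removed)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        # Assignment step: each feature joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for f in features:
            j = min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            clusters[j].append(f)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[j] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    # Selection step: the representative is the actual feature closest to
    # each centroid, since a centroid is generally not itself a feature.
    reps = {min(range(len(features)), key=lambda i: sq_dist(features[i], c))
            for c in centroids}
    return sorted(reps)

# Usage: two well-separated groups yield one representative from each.
feats = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
chosen = kmeans_representatives(feats, k=2)
```

Note that nothing in this procedure accounts for feature stability or enforces spread across the image; any such preference would have to be folded into the distance function, which is the limitation the paragraph above points out.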
Our approach to the feature subset selection problem is to formulate it within an optimization framework that combines the multiple objectives of spatial distribution, similarity, and stability. We refer to the resulting subset as the stable bounded canonical set (SBCS). The SBCS consists of the most representative (canonical) features with maximum stability. Here, the stability of a feature is a quantitative measure of its response to the feature detector.
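As a rough illustration of what combining these objectives in a single score might look like, the toy sketch below scores a candidate subset by how well it represents the unselected features, penalizes similarity among the selected features, and rewards stability, then searches all subsets of a fixed size exhaustively. The similarity function, the weights `w_div` and `w_stab`, and the additive form of the score are all assumptions made for this example; the excerpt does not give the paper's actual objective, and exhaustive search is only feasible for very small inputs.

```python
from itertools import combinations
import math

def similarity(f, g):
    """Hypothetical similarity in (0, 1] that decays with squared distance.
    Including image coordinates in the vectors is one assumed way to make
    spatial spread matter."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(f, g)))

def subset_score(features, stability, subset, w_div=1.0, w_stab=1.0):
    """Toy multi-objective score for a candidate subset of feature indices."""
    chosen = sorted(set(subset))
    # Representation: each unselected feature should resemble a selected one.
    represent = sum(
        max(similarity(features[i], features[j]) for j in chosen)
        for i in range(len(features)) if i not in chosen
    )
    # Redundancy: selected features should not resemble one another.
    redundancy = sum(similarity(features[i], features[j])
                     for i, j in combinations(chosen, 2))
    # Stability: prefer features with a strong detector response.
    stable = sum(stability[j] for j in chosen)
    return represent - w_div * redundancy + w_stab * stable

def best_subset(features, stability, k, **weights):
    """Exhaustively pick the size-k subset with the highest toy score."""
    return max(combinations(range(len(features)), k),
               key=lambda s: subset_score(features, stability, s, **weights))

# Usage: two tight pairs of features; the more stable member of each pair wins.
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
stab = [0.9, 0.2, 0.3, 0.8]
picked = best_subset(feats, stab, k=2)
```

The exhaustive search grows combinatorially with the number of features, which is why approximation schemes are needed for realistic feature counts.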
We formulate the problem of determining the SBCS as a nonlinear integer programming optimization and present algorithms to approximate it using quadratic and semidefinite programming. To evaluate the quality of the subsets generated by our algorithm, we study subsets of image features in the context of object localization under occlusion. To measure the performance of feature subset algorithms we present a dataset of synthetic images, along with ground-truth information, which enables the precise measurement of localization accuracy under occlusion. Our experiments show that subsets of image features produced by our method, stable bounded canonical sets (SBCS), outperform subsets produced by K-Means clustering, threshold, and GA-based methods for the task of object localization under occlusion.
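For readers unfamiliar with how semidefinite programming can approximate a quadratic integer program, the classic MAX-CUT relaxation is a useful reference point. It is shown here only as a generic, well-known instance of the technique; the symbols $w_{ij}$, $x_i$, and $X$ are illustrative, and this excerpt does not state that the SBCS formulation takes exactly this form.

```latex
\begin{aligned}
\text{(integer program)}\quad
  & \max_{x \in \{-1,+1\}^n} \ \tfrac{1}{4}\sum_{i,j} w_{ij}\,(1 - x_i x_j) \\[4pt]
\text{(SDP relaxation)}\quad
  & \max_{X \succeq 0} \ \tfrac{1}{4}\sum_{i,j} w_{ij}\,(1 - X_{ij})
    \quad \text{s.t. } X_{ii} = 1,\ i = 1,\dots,n
\end{aligned}
```

Here each product $x_i x_j$ is replaced by an entry $X_{ij}$ of a positive semidefinite matrix with unit diagonal, and the rank-one requirement $X = x x^{\top}$ is dropped. The relaxed problem is convex and can be solved in polynomial time, after which a discrete solution is recovered by rounding.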
The rest of the paper is organized as follows: In Section 2, we review related work. In Section 3, we present our algorithm, including problem formulation and quadratic integer formulation along with descriptions of how approximate solutions may be computed. Section 4 describes experiments used to evaluate the utility of our algorithm. Lastly, Section 5 presents our conclusions and future work.
Section snippets
Related work
In our work, we seek to find a representative subset of image features. This problem is closely related to the problem of representing a data set with a smaller, more compact, representation. Rate distortion theory, described by Cover and Thomas [11], represents an input sequence, X, as a set of reproduction points defined by a mapping from the input sequence. The distortion between X and its reproduction is the average distance between the sequences using a given measure such as the Hamming distance. The reproduction
Stable bounded canonical sets
In this section, we present our algorithm for the stable bounded canonical set (SBCS) of features. Our algorithm for feature subset selection is based on a formulation of the SBCS problem in terms of a quadratic integer programming optimization. Many problems of this type are known to be intractable [36], but good approximations exist [24]. In what follows, we formulate the quadratic integer program and show how approximate solutions may be obtained using quadratic programming and semidefinite
Experiments
In this section, we present a number of experiments that test the performance of our SDP subset selection technique. The experiments presented in this section are in the context of object localization, the process of determining the position, orientation, and scale of a query object in a target scene. First, in Section 4.1, we describe the feature detectors that are used in the experiments. In Section 4.2, we describe the algorithms that are used for object localization. Next, in Sections 4.3
Conclusions
We have presented a method to reduce a large set of image features to a small subset, the SBCS, while preserving its descriptive power over the original image features. We derived an efficient approximation algorithm, based on semidefinite programming, that computes an SBCS whose objective value is at least 0.878 of the optimum [37]. In addition, we presented a quadratic programming (QP) approximation for SBCS as well as preliminary results which show that it performs well in practice. To demonstrate the utility of the
Acknowledgments
This work was funded in part by a grant from the Office of Naval Research (ONR-N00014-08-1-0925). The authors thank Tom Plick for his implementation of the GA algorithm of Morita et al. [60] and his helpful comments. The authors also thank Jianbo Shi and Sven Dickinson for their insightful comments.
References (61)
- et al., Wrappers for feature subset selection, Artificial Intelligence (1997)
- et al., A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences (1997)
- et al., Feature reduction and hierarchy of classifiers for fast object detection in video images
- J. Sivic, A. Zisserman, Video google: a text retrieval approach to object matching in videos, in: Proceedings of the...
- Z. Sun, G. Bebis, R. Miller, Object detection using feature subset selection, in: Pattern Recognition, vol. 37,...
- et al., PCA-SIFT: a more distinctive representation for local image descriptors
- et al., Using spin images for efficient object recognition in cluttered 3D scenes, IEEE Transactions on Pattern Analysis and Machine Intelligence (1999)
- D. Huttenlocher, S. Ullman, Object recognition using alignment, in: Proceedings of the First International Conference...
- et al., Recognizing solid objects by alignment with an image, International Journal of Computer Vision (1990)
- D.G. Lowe, Object recognition from local scale-invariant features, in: Proceedings of the International Conference on...
- Evaluation of interest point detectors, International Journal of Computer Vision
- Information Theory, Inference, and Learning Algorithms
- Elements of Information Theory: Rate Distortion Theory
- The statistical utilization of multiple measurements, Annals of Eugenics
- Simplified calculation of principal components, Psychometrika
- Feature transformation and subset selection, IEEE Intelligent Systems
- Interior point methods in semidefinite programming with applications to combinatorial optimization, SIAM Journal on Optimization
- Semidefinite programming in combinatorial optimization, Mathematical Programming
- Derandomizing approximation algorithms based on semidefinite programming, SIAM Journal on Computing
- Subgraph matching with semidefinite programming
- Hierarchical image segmentation based on semidefinite programming
- 1 Work performed while at Department of Computer Science, Drexel University.