Anytime similarity measures for faster alignment

doi:10.1016/j.cviu.2007.09.011

Computer Vision and Image Understanding

Volume 110, Issue 3, June 2008, Pages 378-389

https://doi.org/10.1016/j.cviu.2007.09.011

Abstract

Image alignment refers to finding the best transformation from a fixed reference image to a new image of a scene. This process is often posed as the optimization of a similarity measure between the images, computed from the image data. However, in time-critical applications, state-of-the-art methods for computing similarity are too slow. Instead of using all the image data to compute similarity, one could use only a subset of pixels to improve the speed, but this often comes at the cost of reduced accuracy. Tradeoffs of this kind, between the amount of computation and the accuracy of the result, have been addressed in the field of real-time artificial intelligence as deliberation control problems. We propose that the optimization of a similarity measure is a natural application domain for deliberation control using the anytime algorithm framework. In this paper, we present anytime versions for the computation of two common image similarity measures: mean squared difference and mutual information. Off-line, we learn a performance profile specific to each measure, which is then used on-line to select the appropriate number of pixels to process at each optimization step. When tested against existing techniques, our method achieves comparable quality and robustness with significantly less computation.

Introduction

The need to align, or register, two images is one of the basic problems of computer vision. It can be defined as the task of finding the spatial mapping that places elements in one image into meaningful correspondence with elements in a second image. It is essential for data fusion tasks in medical imaging [1] and remote sensing (e.g. [2]). It is also widely applied in tracking and automatically mosaicking photographs [3].

One of the most straightforward and widely used approaches is referred to as direct image alignment. It works by defining a similarity measure, $D$, as a function of a reference image and a template image warped by a transformation with some parameters, $ϕ$. The computation of $D$ typically requires examining all pixels in each image. The alignment problem becomes that of finding the values of $ϕ$ that maximize the chosen similarity measure. A number of optimization techniques for smooth functions, such as gradient descent, have been used for this problem, and provide good solutions on a wide range of image types. However, these approaches can be slow, which reduces their usefulness in time-sensitive applications such as real-time video registration (e.g. [4]) and medical image registration during surgery (e.g. [5]). It is possible to increase the speed of processing by using only a subset of the pixels to compute $D$, but this can easily lead to a reduction in accuracy and reliability. Determining the size of the subset to obtain good performance is typically done in an ad-hoc fashion, or using heuristics which are applicable only to certain domains. Furthermore, since a different number of pixels may be needed at different stages in the optimization, a fixed subset is necessarily a compromise.
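To make the speed versus accuracy tradeoff concrete, the following is a minimal sketch of a mean squared difference computed either over all pixels or over a random subset. It is an illustration under stated assumptions, not the implementation used in the paper: images are assumed to be flattened lists of grey values of equal length, the template is assumed to be already warped, and the names `mean_squared_difference` and `subset_msd` are hypothetical. Since the mean squared difference is a dissimilarity, an optimizer that maximizes $D$ would presumably work with its negative.

```python
import random


def mean_squared_difference(reference, template, indices=None):
    """Mean squared difference over the chosen pixel indices (all pixels if None)."""
    if indices is None:
        indices = range(len(reference))
    total = 0.0
    count = 0
    for i in indices:
        diff = reference[i] - template[i]
        total += diff * diff
        count += 1
    return total / count


def subset_msd(reference, template, fraction, rng):
    """Estimate the mean squared difference from a random fraction of the pixels."""
    n = len(reference)
    # Use at least one pixel and never more than the image contains.
    k = max(1, min(n, round(fraction * n)))
    indices = rng.sample(range(n), k)
    return mean_squared_difference(reference, template, indices)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical flattened images: the template is the reference plus noise.
    reference = [rng.randint(0, 255) for _ in range(10_000)]
    template = [p + rng.randint(-5, 5) for p in reference]
    print("all pixels:", mean_squared_difference(reference, template))
    print("10% subset:", subset_msd(reference, template, 0.10, rng))
```

The subset estimate touches a tenth of the pixels but varies from one random draw to the next, which is the loss of reliability described above.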

The process of intelligently dealing with tradeoffs such as this one between computational speed and accuracy is referred to as deliberation control. Deliberation control methods rely on two key components: algorithms that support partial evaluation, and knowledge about how those algorithms perform after different amounts of computation. Anytime algorithms [6], [7] are a class of algorithms supporting partial evaluation, which provide a solution when run for any length of time. The solution quality is guaranteed to improve with the amount of computation performed. In this paper, we propose to use a deliberation control framework based on anytime algorithms to arrive at a principled solution to the speed vs. accuracy trade-off in this problem. The first step is an off-line training process to learn the properties of the similarity measure under consideration, in terms of accuracy vs. computation time, by analysing image pairs for which the transformation parameters are known. Given a new pair of images to align, we then use this knowledge to determine the number of pixels that need to be considered at each step of the optimization. In this paper, we explore the effectiveness of this approach using two common similarity measures, mean squared difference and mutual information, and a gradient descent optimizer. We tested the algorithm on several types of images: images of everyday scenes, multimodal medical images and earth observation data (i.e. Landsat and Radarsat images). In all cases, using a deliberation control approach is faster than computing the transformation using all the image data and gives more reliable results than simply performing the optimization using an arbitrary, fixed percentage of the pixels.
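The two-stage idea, learning off-line how accuracy depends on the amount of computation and consulting that knowledge on-line, can be sketched as follows. This is a deliberately simplified illustration and not the performance profile defined in the paper: here the profile is assumed to be a table from pixel fraction to the mean absolute error of a subset estimate of the mean squared difference, the on-line rule is assumed to be a fixed error tolerance, and the names `msd_estimate`, `learn_profile` and `select_fraction` are hypothetical.

```python
import random


def msd_estimate(reference, template, fraction, rng):
    """Mean squared difference estimated from a random fraction of the pixels."""
    n = len(reference)
    k = max(1, min(n, round(fraction * n)))
    picked = rng.sample(range(n), k)
    return sum((reference[i] - template[i]) ** 2 for i in picked) / k


def learn_profile(training_pairs, fractions, trials, rng):
    """Off-line: map each pixel fraction to its mean absolute estimation error."""
    profile = {}
    for fraction in fractions:
        errors = []
        for reference, template in training_pairs:
            # Using every pixel gives the value the subsets try to approximate.
            exact = msd_estimate(reference, template, 1.0, rng)
            for _ in range(trials):
                estimate = msd_estimate(reference, template, fraction, rng)
                errors.append(abs(estimate - exact))
        profile[fraction] = sum(errors) / len(errors)
    return profile


def select_fraction(profile, tolerance):
    """On-line: smallest fraction whose expected error is within the tolerance."""
    for fraction in sorted(profile):
        if profile[fraction] <= tolerance:
            return fraction
    return max(profile)  # fall back to the largest profiled fraction


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical training pairs: each template is its reference plus noise.
    pairs = []
    for _ in range(3):
        reference = [rng.randint(0, 255) for _ in range(2000)]
        template = [p + rng.randint(-10, 10) for p in reference]
        pairs.append((reference, template))
    profile = learn_profile(pairs, [0.05, 0.25, 1.0], trials=20, rng=rng)
    print(profile)
    print(select_fraction(profile, tolerance=2.0))
```

In a setting like the one described above, the tolerance would presumably change from one optimization step to the next, so that more pixels are used only where the step demands it.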

The remainder of this paper is organized as follows. In Section 2 we review the image alignment problem and in Section 3 we review methods of deliberation control using anytime algorithms. The details of how deliberation control has been implemented in the context of image alignment are given in Section 4. Finally, Sections 5 and 6 describe our experimental setup, results and conclusions.

Section snippets

Image alignment

Image alignment or registration is the search for a mapping from the coordinate system of one image to that of another that makes corresponding elements in each map to the same point in space. There are numerous approaches to the problem (for reviews, see [3], [8], [9]) which are frequently classified into two main approaches. Feature based approaches extract a set of features from the image, establish correspondences between them and derive the transformation from these correspondences. Direct

Deliberation control with anytime algorithms

In many artificial intelligence tasks, e.g. planning, the quality of the solution obtained depends on the amount of time spent on computation. Hence, trade-offs are necessary between the cost of sub-optimal solutions and the cost of spending time doing further computation. This process, called deliberation control, has been investigated in the context of real-time artificial intelligence and a number of approaches have been proposed [17]. In order for such an approach to be practical, it must

Deliberation control in image alignment

As mentioned in Section 2, the most computationally intensive part of image alignment is the repeated evaluation of the similarity measure, $D$, and its gradient, $\nabla_{ϕ} D$. The optimization algorithm needs this information in order to take a step in parameter space towards the optimal setting. Note that the calculation only has to be accurate enough to ensure that the next step is correct; determining these values exactly is not necessary. Therefore, we propose to implement similarity measures, and

Experiments

To test the anytime algorithm approach, a number of performance profiles were generated off-line, and alignments were performed on a different set of images on-line during testing. Four classes of images (shown in Fig. 2) with at least two image pairs each were used in the testing process. The first image class consisted of typical digital photos (DP) (images a–d). Both (a) and (d) were self-aligned and (a) was aligned affinely against several images of the same scene taken from different camera

Conclusions and future work

We proposed to use deliberation control methods in order to improve the efficiency of computer vision applications. We implemented such methods for the image alignment problem and showed a significant improvement in speed without degrading the quality of the results. In certain cases, the deliberation control approach significantly outperforms a simple reduction in the number of pixels because it can selectively use more pixels when needed.

Even when the performance gains are limited, a major

References (30)

X. Pennec et al., Tracking brain deformations in time sequences of 3D US images, Pattern Recognition Letters (2003)
B. Zitova et al., Image registration methods: a survey, Image and Vision Computing (2003)
E. Malis et al., A unified approach to visual tracking and servoing, Robotics and Autonomous Systems (2005)
E. Horvitz et al., Editorial: Computational tradeoffs under bounded resources, Artificial Intelligence (2001)
A.A. Cole-Rhodes et al., Multiresolution registration of remote sensing imagery by optimization of mutual information using a stochastic gradient, IEEE Transactions on Image Processing (2003)
R. Szeliski, Image alignment and stitching: a tutorial, Technical Report MSR-TR-2004-92, Microsoft Research (December...
R.P. Wildes, D. Hirvonen, S. Hsu, R. Kumar, W. Lehman, B. Matei, W. Zhao, Video georegistration: algorithm and...
E.J. Horvitz, Reasoning about beliefs and actions under computational resource constraints
T. Dean et al., An analysis of time-dependent planning
L.G. Brown, A survey of image registration techniques, ACM Computing Surveys (1992)
G.D. Hager et al., Efficient region tracking with parametric models of geometry and illumination, IEEE Transactions on Pattern Analysis and Machine Intelligence (1998)
R. Fletcher, Practical Methods of Optimization (1987)
J.C. Spall, Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control (2003)
S. Baker et al., Lucas-Kanade 20 years on: a unified framework, International Journal of Computer Vision (2004)

