Phase correlation with sub-pixel accuracy: A comparative study in 1D and 2D

https://doi.org/10.1016/j.cviu.2015.03.011

Highlights

  • Six methods for the accurate estimation of phase-correlation maxima are evaluated.

  • Methods are tested under noise, extreme transformations, incomplete data, and for real cases with unknown transformations.

  • Sinc function fitting provides the best average accuracy.

  • Local Center of Mass and Minimization of the POC gradient provide a good balance between accuracy and efficiency.

Abstract

Six methods for the accurate estimation of the phase-correlation maxima are discussed and evaluated in this article for one- and two-dimensional signals. The evaluation was carried out under a rigid image registration framework, where artificially generated transformations were used in order to perform a quantitative assessment of the accuracy of each method and its robustness in the presence of noise, incomplete data, or extreme transformations. Another round of tests was performed with real cases where the true transformation is unknown and not necessarily rigid; for these tests, quantitative evaluation was achieved by means of the root mean square error of the overlapping area between the two aligned images. While most methods behaved similarly under difficult conditions, three of the methods under study displayed clear advantages under mild levels of noise, low transformation complexity, and small percentages of missing data. These methods are the local center of mass, sinc function fitting, and minimization of the POC gradient magnitude. The other tested methods included quadratic fitting, linear fitting in the frequency domain, and up-sampling; however, these methods did not perform consistently well.

Introduction

The phase correlation method [1] is a frequency domain technique used to estimate the delay or shift between two copies of the same signal. This technique is based on the shift properties of the Fourier transform. Specifically, consider two discrete periodic signals f(x) and g(x), and let F(ω) and G(ω) be their respective Fourier transforms. The normalized cross-spectrum R(ω) of f and g is given by

$$R(\omega) = \frac{F(\omega)\,G^{*}(\omega)}{\left|F(\omega)\,G^{*}(\omega)\right|},$$

where G* is the complex conjugate of G. Note that |R(ω)|=1 for all ω. Also, the phase-only correlation (POC) function r(x) is defined as the inverse Fourier transform of R(ω).

Now suppose g is simply a delayed copy of f; that is, g(x)=f(x+d), where d is an unknown integer. The shift property of the Fourier transform states that G(ω)=F(ω)exp{jωd}, where j=√−1. Substituting this into the definition of R(ω), the factor |F(ω)|² cancels, so that R(ω)=exp{−jωd} and r(x)=δ(x−d), where δ is the discrete impulse function (i.e., δ(0)=1 and δ(x)=0 for x≠0). Therefore, one can recover d by simply locating the maximum of r(x).
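The integer-shift case can be checked numerically. The following is a minimal, dependency-free sketch, not the implementation evaluated in this article: the helper names dft and phase_correlation, the guard constant eps, and the test signal are assumptions introduced only for illustration, and the naive O(N²) transform stands in for an FFT. It builds g[x] = f[x + d] by a circular shift and confirms that the POC function reduces to a unit impulse located at x = d.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward by default)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def phase_correlation(f, g, eps=1e-12):
    """Return the POC function r of two equal-length signals as real values."""
    F, G = dft(f), dft(g)
    R = []
    for Fk, Gk in zip(F, G):
        cross = Fk * Gk.conjugate()
        # eps (an assumption) avoids dividing by zero where the spectrum vanishes
        R.append(cross / max(abs(cross), eps))
    # For real input signals the imaginary part is only rounding noise
    return [v.real for v in dft(R, inverse=True)]

# Reference signal and a circularly shifted copy g[x] = f[x + d]
f = [0.0, 1.0, 3.0, 2.0, -1.0, 0.5, 4.0, -2.0]
d = 3
n = len(f)
g = [f[(x + d) % n] for x in range(n)]

r = phase_correlation(f, g)
peak = max(range(n), key=lambda x: r[x])  # location of the POC maximum
print(peak, round(r[peak], 6))
```

With this sign convention the maximum appears at index d; swapping the roles of f and g moves it to index N − d, which corresponds to a shift of −d modulo N.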

This method can be easily extended to 2D and 3D images, and has been successfully applied in several image processing and computer vision problems, such as image registration [2], [3], [4], [5], biometrics [6], [7], [8], stereo disparity estimation [9], [10], motion and optical flow estimation [11], [10], and video encoding [12], [10]. While newer methods have been proposed, such as Gradient Correlation [13], the phase correlation method is still very popular and widely used in a variety of applications.

One of the most important drawbacks of the basic phase correlation method, when implemented in the discrete-time domain, is that the recovered displacements have integer accuracy; i.e., the coordinates of the maximum of the discrete POC function will be a rounded version of the components of the true displacement vector. Various alternatives have been devised to estimate the displacements with non-integer (sub-pixel) accuracy. Among the most popular are those which rely on local function fitting: one can first obtain the displacement d0 with integer accuracy using the basic phase correlation method and fit a simple analytical function f(d) (e.g., a polynomial) to the POC values in a neighborhood of d0; then one maximizes f(d) to estimate the true maximum. The most common fitting functions are quadratic polynomials and Gaussian functions [14], [15], cubic splines [6], and Dirichlet or sinc functions [16], [17], [11]. In a recent approach, the authors derive an expression for the continuous gradient of the POC function, and use different search methods to find the minima of the magnitude of the gradient, which correspond to the extrema of the POC function [18]; since the search is performed around the initial integer-valued solution, this method is able to quickly estimate the location of the POC maximum with good accuracy.
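As an illustration of the local function fitting idea, the sketch below refines an integer peak location with a parabola through the peak sample and its two neighbors. It is a generic three-point quadratic fit and may differ in detail from the quadratic fitting method evaluated in this article; the function names quadratic_peak_offset and refine_peak, and the circular indexing at the borders, are assumptions made for the example.

```python
def quadratic_peak_offset(r_prev, r_peak, r_next):
    """Vertex offset of the parabola through (-1, r_prev), (0, r_peak), (1, r_next)."""
    denom = r_prev - 2.0 * r_peak + r_next
    if denom == 0.0:
        return 0.0  # flat neighborhood: no refinement is possible
    return 0.5 * (r_prev - r_next) / denom

def refine_peak(r):
    """Integer argmax of r plus a sub-sample correction (circular indexing)."""
    n = len(r)
    d0 = max(range(n), key=lambda x: r[x])  # integer-accuracy estimate
    offset = quadratic_peak_offset(r[(d0 - 1) % n], r[d0], r[(d0 + 1) % n])
    return d0 + offset

# Sanity check on samples of an exact parabola with its vertex at x = 3.3
samples = [-(x - 3.3) ** 2 for x in range(8)]
estimate = refine_peak(samples)
print(estimate)
```

The check recovers the vertex up to rounding because the samples are exactly quadratic. A real POC peak is only approximately parabolic near its maximum, so on actual data the estimate carries a model error in addition to the effect of noise.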

Most of the methods described above perform reasonably well under controlled conditions but their accuracy is seriously degraded by noise, border effects, and the presence of multiple motions. This restricts many of the methods from being applied to some computer vision problems, such as stereo depth or optical flow estimation. Also, most of these methods were originally proposed in order to overcome the limitations of the classic POC method within the context of a specific application, without considering other possible approaches to the estimation of the POC maxima with sub-pixel accuracy.

A few papers have been published where some of the methods described above are compared. For instance, Argyriou and Vlachos perform a comparison by fitting different functions (quadratic, Gaussian, and sinc) around the integer-valued solution, in the context of motion-based video encoding [19]. This work shows that fitting a sinc function produces the best results among the tested fitting functions, particularly in the presence of white additive noise. In another study, Vera and Torres evaluate five different methods, including up-sampling in the frequency domain, linear fitting in the frequency domain, and Gaussian fitting in the spatial domain, for the estimation of 2D translations under aliasing conditions [20]. The authors conclude that Gaussian function fitting provides results that are less affected by aliasing than the other methods under study. While these works provide a good reference point, they do not include some of the more recent proposals and also lack a detailed explanation of each method under study. Moreover, the batteries of tests reported in these works are relatively limited and sometimes do not show the particular strengths and shortcomings of each method.

In this work, a thorough comparison of six of the most representative methods for the estimation of POC maxima at sub-pixel level is presented, focusing on accuracy, success rate, and robustness to noise and missing data in a translational and rigid image registration framework. In Section 2 we will describe in detail all the methods under study. Section 3 will present some ideas that may increase the performance of some of the methods. In Section 4, the evaluation results for the estimation of translational and rigid transformations will be shown. Finally, Section 5 will present our conclusions.

Methods

In this section we present the most relevant methods for the estimation of POC maxima with sub-pixel accuracy. The description of each method considers both the 1D case (e.g., for stereo matching) and the 2D extension (e.g., for image registration). Extensions to higher dimensions (e.g., for volume registration) are possible, and in some cases straightforward. Throughout the article, square brackets will be used to denote N-periodic discrete-time signals (e.g., f[x], where x=0,…,N_x−1), and

Increased robustness

This section describes some general ideas to increase the robustness of the estimated peak locations in the presence of noise and border artifacts which may result from multiple motions. These ideas can be applied and produce significant benefits in most of the methods under study.

Results and discussion

The comparison of the six methods under study was carried out in two stages. In the first stage, all of the discussed methods were implemented in Octave for 1D and 2D signals. For each case, a series of tests was performed, where a known fractional shift d (chosen at random) was applied to a reference signal f(x) in order to obtain a circularly-shifted version g(x)=f(x−d). Since d (or its components, in the 2D case) is not necessarily integer, spline interpolation was used to obtain a high

Conclusions

Six methods for the estimation of the phase-correlation maxima with sub-pixel accuracy were evaluated. These methods included: local fitting of quadratic (QuadFit) and sinc functions (SincFit), fitting a linear function in the frequency domain (LinFit), local center of mass (LCM), up-sampling in the frequency domain (UpSamp), and minimization of the magnitude of the POC gradient (GradPOC). Various tests were performed to assess the robustness of each method against additive noise, extreme

Acknowledgments

A. Alba was supported by CONACYT grant #154623. E. Arce was supported by CONACYT grant #168140. The authors would like to thank Otniel Garcia for providing the Face images, Aldo Mejia for the MRI images used in Section 4.4, and El Colegio de la Frontera Sur (ECOSUR) for the Aerial 2 images.

References (29)

  • C.D. Kuglin, D.C. Hines, The phase correlation image alignment method, in: Proc. of the IEEE Int. Conf. on Cybernetics...
  • E. De Castro et al., Registration of translated and rotated images using finite Fourier transforms, IEEE Trans. Pattern Anal. Mach. Intell. (1987)
  • B. Srinivasa Reddy et al., An FFT-based technique for translation, rotation, and scale-invariant image registration, IEEE Trans. Image Process. (1996)
  • Yosi Keller et al., Pseudopolar-based estimation of large translations, rotations, and scalings in images, IEEE Trans. Image Process. (2005)
  • Yosi Keller et al., The angular difference function and its application to image registration, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
  • Jun-Zhou Huang et al., Phase correlation based iris image registration model, J. Comput. Sci. Technol. (2005)
  • K. Ito, A. Morita, T. Aoki, T. Higuchi, H. Nakajima, K. Kobayashi, A fingerprint recognition algorithm using...
  • Radim Kolar, Viktor Sikula, Michael Base, Retinal image registration using phase correlation, in: Analysis of...
  • Mohammad Abdul Muquit et al., A high-accuracy passive 3D measurement system using phase-based image matching, IEICE Trans. Fund. Electron. Commun. Comput. Sci. (2006)
  • Alfonso Alba et al., Phase-correlation guided area matching for realtime vision and video encoding, J. Real-Time Image Process. (2014)
  • Kenji Takita et al., A sub-pixel correspondence search technique for computer vision applications, IEICE Trans. Fund. (2004)
  • Loy Hui Chien, Takafumi Aoki, Robust motion estimation for video sequences based on phase-only correlation, in: 6th...
  • Georgios Tzimiropoulos et al., Subpixel registration with gradient correlation, IEEE Trans. Image Process. (2011)
  • Ikram E. Abdou, Practical approach to the registration of multiple frames of video images, in: Proc. SPIE, Visual...

This paper has been recommended for acceptance by Anurag Mittal.