Phase correlation with sub-pixel accuracy: A comparative study in 1D and 2D☆
Introduction
The phase correlation method [1] is a frequency domain technique used to estimate the delay or shift between two copies of the same signal. This technique is based on the shift property of the Fourier transform. Specifically, consider two discrete periodic signals f(x) and g(x), and let F(ω) and G(ω) be their respective Fourier transforms. The normalized cross-spectrum R(ω) of f and g is given by

$$R(\omega) = \frac{F(\omega)\,G^*(\omega)}{\left|F(\omega)\,G^*(\omega)\right|},$$

where G* is the complex conjugate of G. Note that $|R(\omega)| = 1$ for all ω. Also, the phase-only correlation (POC) function r(x) is defined as the inverse Fourier transform of R(ω).
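Now suppose g is simply a delayed copy of f; that is, $g(x) = f(x - d)$, where d is an unknown integer. The shift property of the Fourier transform states that $G(\omega) = F(\omega)\,e^{-j\omega d}$, where $j = \sqrt{-1}$. In this case, it is easy to see that $R(\omega) = e^{j\omega d}$ and $r(x) = \delta(x + d)$, where δ is the discrete impulse function (i.e., $\delta(0) = 1$ and $\delta(x) = 0$ for $x \neq 0$). Therefore, one can recover d by simply locating the maximum of r(x), which occurs at x = −d (modulo the signal period).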
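The integer-shift case can be illustrated with a minimal Python sketch. It assumes the convention in which the cross-spectrum is F·G* normalized to unit magnitude and g is f circularly delayed by d samples, so that the POC peak sits at x = −d modulo N. The function names (`dft`, `phase_correlation`, `estimate_shift`) and the small constant `eps` are illustrative choices, not part of the original method description, and a naive O(N²) DFT is used only to keep the example dependency-free; a practical implementation would use an FFT library.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out

def phase_correlation(f, g, eps=1e-12):
    """Return the POC function r of two equal-length periodic signals."""
    F, G = dft(f), dft(g)
    R = []
    for Fk, Gk in zip(F, G):
        c = Fk * Gk.conjugate()          # cross-spectrum F G*
        R.append(c / max(abs(c), eps))   # normalize to unit magnitude (eps guards empty bins)
    # For real inputs the imaginary part of the inverse DFT is round-off only.
    return [v.real for v in dft(R, inverse=True)]

def estimate_shift(f, g):
    """Estimate the integer circular delay d such that g[x] = f[(x - d) mod N]."""
    r = phase_correlation(f, g)
    n = len(r)
    peak = max(range(n), key=lambda i: r[i])
    # With R = F G* / |F G*|, the impulse is located at x = -d (mod N).
    return (-peak) % n

# Example: delay a short test signal by 5 samples and recover the delay.
f = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
g = [f[(x - 5) % len(f)] for x in range(len(f))]
print(estimate_shift(f, g))
```

The estimate is returned in the range 0 to N−1; delays larger than N/2 can equivalently be read as negative shifts by subtracting N.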
This method can be easily extended to 2D and 3D images, and has been successfully applied in several image processing and computer vision problems, such as image registration [2], [3], [4], [5], biometrics [6], [7], [8], stereo disparity estimation [9], [10], motion and optical flow estimation [11], [10], and video encoding [12], [10]. While newer methods have been proposed, such as Gradient Correlation [13], the phase correlation method is still very popular and widely used in a variety of applications.
One of the most important drawbacks of the basic phase correlation method, when implemented in the discrete-time domain, is that the recovered displacements have integer accuracy; i.e., the coordinates of the maximum of the discrete POC function will be a rounded version of the components of the true displacement vector. Various alternatives have been devised to estimate the displacements with non-integer (sub-pixel) accuracy. Among the most popular are those which rely on local function fitting: one can first obtain the displacement d0 with integer accuracy using the basic phase correlation method and fit a simple analytical function f(d) (e.g., a polynomial) to the POC values in a neighborhood of d0; then one maximizes f(d) to estimate the true maximum. The most common fitting functions are quadratic polynomials and Gaussian functions [14], [15], cubic splines [6], and Dirichlet or sinc functions [16], [17], [11]. In a recent approach, the authors derive an expression for the continuous gradient of the POC function, and use different search methods to find the minima of the magnitude of the gradient, which correspond to the extrema of the POC function [18]; since the search is performed around the initial integer-valued solution, this method is able to quickly estimate the location of the POC maximum with good accuracy.
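As an illustration of local function fitting, the following Python sketch refines an integer peak location by fitting a parabola through the POC maximum and its two neighbors. This is a generic three-point quadratic fit written under our own assumptions (1D signal, periodic indexing, a single dominant peak); the function name `quadratic_peak` is hypothetical, and the fitting variants evaluated in the cited works may use different neighborhoods or formulations.

```python
def quadratic_peak(r):
    """Estimate the location of the maximum of r with sub-sample accuracy.

    Fits a parabola through the largest sample of r and its two circular
    neighbors, and returns the abscissa of the parabola's vertex.
    """
    n = len(r)
    p = max(range(n), key=lambda i: r[i])   # integer-accuracy peak d0
    a = r[(p - 1) % n]                      # left neighbor (periodic)
    b = r[p]                                # peak value
    c = r[(p + 1) % n]                      # right neighbor (periodic)
    denom = a - 2.0 * b + c                 # twice the parabola's curvature
    if denom == 0.0:                        # flat neighborhood: no refinement
        return float(p)
    return p + 0.5 * (a - c) / denom        # vertex of the fitted parabola

# Example: samples of an exact parabola whose true maximum is at x = 3.3.
r = [10.0 - (x - 3.3) ** 2 for x in range(8)]
print(quadratic_peak(r))
```

The fit is exact only when the samples near the peak actually follow a parabola; for a real POC function, whose peak shape is closer to a Dirichlet or sinc kernel, the quadratic model introduces a systematic bias, which is one reason alternative fitting functions have been proposed.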
Most of the methods described above perform reasonably well under controlled conditions but their accuracy is seriously degraded by noise, border effects, and the presence of multiple motions. This restricts many of the methods from being applied to some computer vision problems, such as stereo depth or optical flow estimation. Also, most of these methods were originally proposed in order to overcome the limitations of the classic POC method within the context of a specific application, without considering other possible approaches to the estimation of the POC maxima with sub-pixel accuracy.
A few papers have been published where some of the methods described above are compared. For instance, Argyriou and Vlachos perform a comparison by fitting different functions (quadratic, Gaussian, and sinc) around the integer-valued solution, in the context of motion-based video encoding [19]. This work shows that fitting a sinc function produces the best results among the tested fitting functions, particularly in the presence of additive white noise. In another study, Vera and Torres evaluate five different methods, including up-sampling in the frequency domain, linear fitting in the frequency domain, and Gaussian fitting in the spatial domain, for the estimation of 2D translations under aliasing conditions [20]. The authors conclude that Gaussian function fitting provides results that are less affected by aliasing than the other methods under study. While these works provide a good reference point, they do not include some of the more recent proposals and also lack a detailed explanation of each method under study. Moreover, the batteries of tests reported in these works are relatively limited and sometimes do not show the particular strengths and shortcomings of each method.
In this work, a thorough comparison of six of the most representative methods for the estimation of POC maxima at sub-pixel level is presented, focusing on accuracy, success rate, and robustness to noise and missing data in a translational and rigid image registration framework. Section 2 describes in detail all the methods under study. Section 3 presents some ideas that may increase the performance of some of the methods. Section 4 shows the evaluation results for the estimation of translational and rigid transformations. Finally, Section 5 presents our conclusions.
Methods
In this section we present the most relevant methods for the estimation of POC maxima with sub-pixel accuracy. The description of each method considers both the 1D case (e.g., for stereo matching) and the 2D extension (e.g., for image registration). Extensions to higher dimensions (e.g., for volume registration) are possible, and in some cases straightforward. Throughout the article, square brackets will be used to denote N-periodic discrete-time signals (e.g., f[x], where ), and
Increased robustness
This section describes some general ideas to increase the robustness of the estimated peak locations in the presence of noise and border artifacts which may result from multiple motions. These ideas can be applied and produce significant benefits in most of the methods under study.
Results and discussion
The comparison of the six methods under study was carried out in two stages. In the first stage, all of the discussed methods were implemented in Octave for 1D and 2D signals. For each case, a series of tests was performed, where a known fractional shift d (chosen at random) was applied to a reference signal f(x) in order to obtain a circularly-shifted version g(x) = f(x − d). Since d (or its components, in the 2D case) is not necessarily integer, spline interpolation was used to obtain a high
Conclusions
Six methods for the estimation of the phase-correlation maxima with sub-pixel accuracy were evaluated. These methods included: local fitting of quadratic (QuadFit) and sinc functions (SincFit), fitting a linear function in the frequency domain (LinFit), local center of mass (LCM), up-sampling in the frequency domain (UpSamp), and minimization of the magnitude of the POC gradient (GradPOC). Various tests were performed to assess the robustness of each method against additive noise, extreme
Acknowledgments
A. Alba was supported by CONACYT grant #154623. E. Arce was supported by CONACYT grant #168140. The authors would like to thank Otniel Garcia for providing the Face images, Aldo Mejia for the MRI images used in Section 4.4, and El Colegio de la Frontera Sur (ECOSUR) for the Aerial 2 images.
References (29)
- C.D. Kuglin, D.C. Hines, The phase correlation image alignment method, in: Proc. of the IEEE Int. Conf. on Cybernetics...
- et al., Registration of translated and rotated images using finite Fourier transforms, IEEE Trans. Pattern Anal. Mach. Intell. (1987)
- et al., An FFT-based technique for translation, rotation, and scale-invariant image registration, IEEE Trans. Image Process. (1996)
- et al., Pseudopolar-based estimation of large translations, rotations, and scalings in images, IEEE Trans. Image Process. (2005)
- et al., The angular difference function and its application to image registration, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
- et al., Phase correlation based iris image registration model, J. Comput. Sci. Technol. (2005)
- K. Ito, A. Morita, T. Aoki, T. Higuchi, H. Nakajima, K. Kobayashi, A fingerprint recognition algorithm using...
- Radim Kolar, Viktor Sikula, Michael Base, Retinal image registration using phase correlation, in: Analysis of...
- et al., A high-accuracy passive 3D measurement system using phase-based image matching, IEICE Trans. Fund. Electron. Commun. Comput. Sci. (2006)
- et al., Phase-correlation guided area matching for realtime vision and video encoding, J. Real-Time Image Process. (2014)
- A sub-pixel correspondence search technique for computer vision applications, IEICE Trans. Fund.
- Subpixel registration with gradient correlation, IEEE Trans. Image Process.
☆ This paper has been recommended for acceptance by Anurag Mittal.