
Pattern Recognition Letters

Volume 127, 1 November 2019, Pages 3-10

Robust visible-infrared image matching by exploiting dominant edge orientations

https://doi.org/10.1016/j.patrec.2018.10.036

Highlights

  • Propose a rotation invariant descriptor for visible and infrared image matching.

  • Encode edge information into the descriptor statistically based on multi-orientation and multi-scale Log-Gabor filters.

  • Estimate the dominant orientation of the descriptor by using accumulated edge orientations.

  • The descriptor is robust to the discrepancies in the electromagnetic wavelengths between visible and infrared images.

Abstract

Finding correspondences between visible and infrared images is a challenging task due to spectral inconsistency, which leads to large differences in gradient distributions between the images. To alleviate this problem, we propose a novel feature descriptor for visible and infrared image matching based on Log-Gabor filters. The descriptor employs multi-orientation and multi-scale Log-Gabor filters to encode edge information statistically. Furthermore, the descriptor achieves rotation invariance by estimating a dominant orientation from accumulated edge orientations. Experimental results demonstrate the effectiveness of the proposed rotation-invariant descriptor for matching visible and long-wave infrared images compared with state-of-the-art descriptors.

Introduction

Automatic image matching, a fundamental problem in computer vision and image processing, has been extensively studied and has made great progress in recent decades [16], [24], [26], [29], [30], [31], [32], [35]. Matching visible and infrared images can provide complementary information and plays an important role in a wide range of applications [13], such as video surveillance, urban monitoring, and driver assistance systems. However, difficulties remain in matching visible and infrared images automatically because of the discrepancy in their electromagnetic wavelengths: visible images capture reflected light, whereas infrared images depict thermal radiation. The dramatic appearance changes between visible and infrared image pairs, typically large gradient differences or texture details missing in one of the images, cause inconsistent descriptions of corresponding regions. For instance, Fig. 1 illustrates a visible-infrared image pair exhibiting large appearance changes; it shows that SIFT [16], although highly invariant to a range of geometric and photometric transformations, performs poorly in such situations.

Traditional approaches to handling large appearance differences between images typically use mutual information [28] or the sum of squared differences (SSD) [21], [25]; these are known as area-based image matching methods. They can match image patches accurately because all pixels are taken into account, but they are sensitive to geometric distortions and computationally expensive.
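The two area-based measures mentioned above can be sketched in a few lines of NumPy. This is a generic illustration of SSD and histogram-based mutual information, not the implementation used in the cited works:

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared pixel differences between two equally sized patches."""
    d = patch_a.astype(np.float64) - patch_b.astype(np.float64)
    return float(np.sum(d * d))

def mutual_information(patch_a, patch_b, bins=32):
    """Mutual information of the joint intensity histogram of two patches."""
    hist_2d, _, _ = np.histogram2d(patch_a.ravel(), patch_b.ravel(), bins=bins)
    pxy = hist_2d / hist_2d.sum()   # joint intensity distribution
    px = pxy.sum(axis=1)            # marginal of patch_a
    py = pxy.sum(axis=0)            # marginal of patch_b
    px_py = np.outer(px, py)
    nz = pxy > 0                    # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / px_py[nz])))
```

Note that MI depends only on the joint intensity statistics, which is why it tolerates the non-linear intensity mappings between modalities that defeat SSD, at the cost of evaluating every pixel pair.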

Identifying two image patches becomes even more challenging when a rotation transformation exists between them. For visible image matching, extracting feature vectors [16] (a.k.a. descriptors) from images is effective and computationally efficient. Rotation differences between image patches are resolved by estimating a dominant orientation for each patch; the descriptor is then re-arranged according to this orientation, making it invariant to image rotation.
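As a concrete illustration, a SIFT-style dominant orientation can be estimated as the peak of a magnitude-weighted histogram of gradient orientations over the patch. The 36-bin layout follows SIFT, but this sketch omits refinements such as Gaussian weighting and parabolic peak interpolation:

```python
import numpy as np

def dominant_gradient_orientation(patch, num_bins=36):
    """SIFT-style dominant orientation (in degrees): peak of a
    magnitude-weighted histogram of gradient orientations."""
    patch = patch.astype(np.float64)
    gy, gx = np.gradient(patch)                     # row and column gradients
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(angle, bins=num_bins, range=(0.0, 360.0),
                               weights=magnitude)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])    # centre of winning bin
```

For a pure horizontal intensity ramp, the estimate falls in the first bin (centred at 5 degrees with 36 bins); it is this histogram peak that becomes unreliable when gradient distributions differ across modalities.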

However, unlike in visible image matching, low-level information (e.g., gradients) is affected by the differences between visible and infrared images, leading to incorrect dominant orientation estimates and mismatches. For example, Fig. 2a shows similar gradient distributions between two visible images, which yield similar dominant orientations and hence good matching performance [16]. In Fig. 2b, by contrast, the disparate dominant orientations estimated from gradients in the visible-infrared pair are misleading for correspondence recognition. Feature descriptors based on either straight line segments [11], [15] or edges [1], [2], [20] improve matching accuracy to some extent compared with gradient-based methods; however, those descriptors are not invariant to image rotation.

By investigating visible and infrared images, we notice that although they have dissimilar gradients at the pixel level, global structure and shape features such as edges tend to remain reasonably invariant, and the orientation of edges can serve as a cue for establishing similarity between visible and infrared images. For corresponding image patches, the distributions of local edge orientations are similar: pixels on horizontal edges in one image also lie on the corresponding horizontal edges in the other image, which establishes similar relations between corresponding pixels. We therefore describe each pixel by the orientation of its edge rather than by intensity or gradient, and construct the descriptor from orientation histograms. When there is a relative rotation between visible and infrared images, all edges in one image are rotated by the same angle with respect to the corresponding edges in the other. Among the different orientations within a local image patch, we take the orientation with the maximum accumulated response as the dominant orientation. Multi-orientation and multi-scale Log-Gabor filters are a good tool for extracting this edge information [17].
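To make the idea concrete, the following sketch builds a frequency-domain log-Gabor filter bank and picks the dominant orientation as the one with the maximum accumulated response energy. It uses a Kovesi-style log-Gabor formulation; all parameter values (numbers of scales and orientations, wavelengths, bandwidths) are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def log_gabor_bank(size, num_scales=3, num_orients=6,
                   min_wavelength=3.0, mult=2.0, sigma_on_f=0.65,
                   angular_sigma=0.4):
    """Frequency-domain log-Gabor filters: a log-normal radial profile
    times a Gaussian angular profile, one filter per (orientation, scale)."""
    y, x = np.mgrid[-size // 2:size - size // 2,
                    -size // 2:size - size // 2] / float(size)
    radius = np.hypot(x, y)
    radius[size // 2, size // 2] = 1.0          # avoid log(0) at DC
    theta = np.arctan2(-y, x)
    bank = []
    for o in range(num_orients):
        angle = o * np.pi / num_orients
        # angular distance wrapped to (-pi, pi]
        d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
        spread = np.exp(-d_theta ** 2 / (2 * angular_sigma ** 2))
        for s in range(num_scales):
            f0 = 1.0 / (min_wavelength * mult ** s)
            radial = np.exp(-np.log(radius / f0) ** 2
                            / (2 * np.log(sigma_on_f) ** 2))
            radial[size // 2, size // 2] = 0.0  # zero DC response
            bank.append((o, s, np.fft.ifftshift(radial * spread)))
    return bank

def dominant_edge_orientation(patch, bank, num_orients=6):
    """Accumulate response energy per orientation over all scales and
    return the index of the orientation with maximum accumulated energy."""
    spectrum = np.fft.fft2(patch.astype(np.float64))
    energy = np.zeros(num_orients)
    for o, s, filt in bank:
        response = np.abs(np.fft.ifft2(spectrum * filt))
        energy[o] += response.sum()
    return int(np.argmax(energy))
```

Because all edges in a patch rotate by the same angle, a rotation of the patch cyclically shifts the accumulated energy histogram, so the index of its maximum provides a stable reference for re-arranging the descriptor.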

Motivated by the observations above, we propose a new rotation-invariant descriptor for visible and infrared image matching based on multi-orientation and multi-scale Log-Gabor filters. The proposed descriptor handles rotations between visible and infrared images by estimating a reliable dominant orientation. We evaluate the proposed descriptor against several state-of-the-art descriptors on challenging datasets.

The rest of this paper is organized as follows: Section 2 gives a brief overview of related work. Section 3 presents the proposed rotation-invariant descriptor for matching visible and infrared images in detail, followed by the experiments in Section 4. Finally, conclusions are presented in Section 5.

Section snippets

Related work

Most existing work related to image matching can be broadly divided into two categories: area-based methods and feature-based methods [36].

Area-based methods typically measure the distance between corresponding pixels of different images. Mutual information (MI) [28] is a widely used image matching method, especially in medical image matching [22]; it seeks a statistical relationship between pixel intensities and maximizes the amount of shared information between images. The

Method

Our goal is to construct a rotation-invariant descriptor for visible and infrared image matching. The widely used solution for estimating the orientation of a feature descriptor is the dominant orientation of SIFT [16]; however, it is difficult for SIFT to estimate reliable dominant orientations between corresponding visible and infrared image patches. In addition, gradient-based approaches are also unsuitable for constructing feature descriptors in such situations. To address these

Experiments

In this section, we first present the challenging datasets and evaluation protocols, then evaluate the parameter settings of the Log-Gabor filters and justify our implementation strategies for constructing the descriptor. Finally, we test our approach against several state-of-the-art methods.

Conclusion

This paper focuses on automatic matching of visible and infrared images, and a novel feature descriptor is proposed to improve matching performance. The proposed method statistically encodes edge information into the descriptor by exploiting the properties of multi-orientation and multi-scale Log-Gabor filters. In addition, the descriptor is assigned a dominant orientation to ensure image rotation invariance, and is then formed by joining the histograms of the responses with

References (36)

  • M. Brown et al., Multi-spectral SIFT for scene category recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.

  • D. Firmenich et al., Multispectral interest points for RGB-NIR image registration, 18th IEEE International Conference on Image Processing (ICIP), 2011.

  • S. Fischer et al., Self-invertible 2D log-Gabor wavelets, Int. J. Comput. Vis., 2007.

  • M.A. Fischler et al., Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM, 1981.

  • J. Huang et al., Multimodal image matching using self similarity, IEEE Applied Imagery Pattern Recognition Workshop (AIPR), 2011.

  • S.-H. Jung et al., Egomotion estimation in monocular infra-red image sequence for night vision applications, IEEE Workshop on Applications of Computer Vision (WACV), 2007.

  • P. Kovesi, Image features from phase congruency, Videre: J. Comput. Vis. Res., 1999.

  • Y.P. Kwon et al., DUDE (duality descriptor): a robust descriptor for disparate images using line segment duality, IEEE International Conference on Image Processing (ICIP), 2016.