Information Sciences

Volume 545, 4 February 2021, Pages 381-402

The PAN and MS image fusion algorithm based on adaptive guided filtering and gradient information regulation

https://doi.org/10.1016/j.ins.2020.09.006

Abstract

In recent years, with the improvement in the accuracy of remote sensing image classification and target recognition, feature-level fusion of remote sensing images has attracted much attention and become a research hotspot. However, this kind of fusion technology is not as mature as pixel-level fusion, and many problems remain to be solved. This paper proposes a multi-spectral (MS) and panchromatic (PAN) image fusion algorithm based on adaptive textural feature extraction and regulated information injection. The fusion algorithm includes two stages. The first stage extracts the textural details of the high-resolution PAN image. In this stage, based on the sensitivity of the gray-level co-occurrence matrix (GLCM) to the textures of remote sensing images, an adaptive guided filter (AGIF) scheme for remote sensing images based on the GLCM is proposed, which fully extracts the textural and detail feature information of the PAN image. The second stage injects the extracted PAN feature information into the MS image. In this stage, a decision map based on the MS image gradient domain and a weighting matrix based on a gradient entropy measure are proposed to, respectively, adaptively select the feature injection locations and regulate the intensity of the information injected into the MS image. This ensures that the textural information is injected reasonably and avoids interference from noise, patches and other artifacts. The proposed algorithm fully extracts the textural features of high-resolution PAN images, adaptively adjusts the injection position and intensity when feature information is injected into the MS image, and produces fused images with clear features. While effectively maintaining the spectral information quality, the spatial resolution of the fused image is improved. Extensive simulation experiments verify the effectiveness of the proposed method.

Introduction

Multi-source image fusion technology refers to the process of merging two or more images from different sensors into one image in order to obtain more complete information than any single remote sensing image provides, and it is of great significance to image processing tasks in the computer vision field [12], [34]. As an important branch of multi-source image fusion, remote sensing image fusion aims to obtain more informative remote sensing images by complementing relevant feature information under certain constraints. Owing to hardware and environmental limitations, it is impossible to acquire remote sensing images with both high spectral resolution and high spatial resolution, and fusion makes up for this limitation. Remote sensing image fusion is divided into three levels: pixel-level fusion, feature-level fusion and decision-level fusion.

Traditional pixel-based fusion technology takes into account the characteristics of remote sensing images obtained by different sensors and focuses on the statistical analysis of pixel information in order to obtain fused images. According to the scope of the fusion operation, these techniques can be divided into two categories: fusion in the spatial domain and fusion in the transform domain [29]. Spatial-domain fusion completes the fusion process by operating directly on the spatial-domain pixels of the images, as in the Intensity-Hue-Saturation (IHS) transformation, principal component analysis (PCA), Gram-Schmidt (GS) transformation and Brovey transformation. These methods usually map the MS image to a feature space and replace the component most correlated with the PAN image in order to improve the spatial resolution of the fused image; they are generally simple and fast. However, the fused image is usually degraded, for example by contrast reduction, and the fusion effect is poor. Later, the fast IHS (FIHS) algorithm was proposed; although it improved the quality of the fused image, its application range was limited by the types of sensors it supports. Transform-domain fusion transforms the image from the spatial domain to a transform domain using multi-scale analysis tools and sparse representation. It then uses the characteristics of the high-frequency and low-frequency sub-bands in the transform domain to determine the corresponding fusion rules. This processing is similar to how the human visual system (HVS) recognizes objects: the background information of the image is well preserved in the low-frequency sub-band, and the textural edge information is preserved in the high-frequency sub-bands. By processing the low-frequency and high-frequency sub-bands, the background and spectral information of the spectral image are effectively retained while the textural details of the high-resolution image are well integrated, so the fused images are of high quality [14], [7], [20], [3]. Traditional multi-scale methods, such as the contourlet, shearlet and wavelet transforms, have been used in remote sensing image fusion. Subsequently, the nonsubsampled contourlet transform (NSCT) and nonsubsampled shearlet transform (NSST) were proposed, and the quality of fused images improved greatly.

In recent years, with the development of remote sensing technology, the textural edge information in high-resolution images has become more complex, and it is also affected by the acquisition environment. Remote sensing images are usually polluted by patches, noise and shadows during acquisition and transmission, which degrade the final fusion quality to varying degrees. At the same time, applications such as land cover and land use mapping place higher requirements on the accuracy of feature extraction, classification and target recognition of ground objects. In general, better descriptions of ground objects should be obtained by using the feature classification of fused images and increasing the spatial resolution of remote sensing images [23]. All of this has made feature-level and decision-level fusion of remote sensing images a research hotspot [4], [17], [27].

Feature-level fusion techniques use sets of image pixels to form continuous regions from which different features can be extracted or classified; these features may be pixel intensities, edges or textures in different types of images of the same geographic location. They can be extracted using methods such as chromaticity information transformation and guided filtering, and the feature information is then fused to improve the spatial information. For example, in remote sensing image fusion based on feature sets, the method in [24] extracts textural features, shape features and spectral information from hyperspectral images and fuses them into feature sets, adopting stacking as the fusion scheme. Furthermore, using a support vector machine (SVM) classifier with a cubic polynomial kernel can improve the classification accuracy. Song et al. [30] proposed a learning-based image fusion method that combines the band width and spectral characteristics of the Landsat Thematic Mapper (TM)/Enhanced Thematic Mapper Plus (ETM+) with the spatial resolution of SPOT5 (Système Pour l'Observation de la Terre 5) to obtain better fusion results than traditional methods. However, this method needs further improvement with respect to the spatial details of the TM. Bai et al. [2] proposed a softmax regression-based feature fusion method that learns distinct weights for different features. This method uses shape, spectral and textural features to classify objects and achieves good results. Liu et al. [15] proposed a two-stream fusion network (TFNet) to address the pansharpening problem. Unlike previous convolutional neural network (CNN) methods that treat pansharpening as a super-resolution problem and perform it at the pixel level, the proposed TFNet fuses the PAN and MS images at the feature level and reconstructs the pansharpened image from the fused features.

In decision-level image fusion, Luo et al. [19] proposed decision-based fusion for the pansharpening of remote sensing images. This method merges à trous wavelet pansharpening [25] and Laplacian pyramid adaptive pansharpening [1] in order to take advantage of both by locally selecting the better one. However, owing to the segmentation errors caused by the level line methods, it cannot always provide the best fusion results. In [31], decision-level fusion of multiview very-high-resolution (VHR) imagery is used in an object recognition strategy. To refine the classification results, the classified objects of all views are fused at the decision level based on scene contextual information. Tuia et al. [33] proposed a probabilistic discriminative graphical model relying on conditional random fields for fusing land-cover and land-use classification results from very high-resolution remote sensing images. The system integrates pixel-based and region-based strategies with a multiscale approach and is able to find agreement between probabilistic decisions with multiple spatial supports.

Although feature-level and decision-level remote sensing image fusion is addressed in these works, the fusion accuracy and fusion effect still need to be improved. Many problems remain to be solved, such as the selection and assessment of valid features, information loss in the feature extraction process, the reasonable selection of the injected features, the simplification of fusion rules and universal methods for evaluating fusion results.

In this paper, a multispectral (MS) and panchromatic (PAN) image fusion algorithm based on gradient-domain decision map adaptive guided filtering is proposed. The innovations are mainly reflected in the following aspects. 1) An adaptive guided filter (AGIF) based on the gray-level co-occurrence matrix (GLCM) is proposed to extract the textural and detail feature information of PAN images, which ensures the sufficiency of the information extraction. 2) A decision map based on the gradient domain is proposed to supervise the injection position of the textural and detail feature information and ensure the spectral quality of the fused MS image. 3) A weighting matrix constraint scheme based on a gradient entropy measure is proposed to regulate the injection intensity of the PAN feature information, avoid interference from noise, patches and other artifacts, and enhance the spatial resolution of the fused images. 4) A new MS and PAN fusion scheme based on gradient-domain decision map adaptive guided filtering is proposed. The proposed algorithm sufficiently extracts the textural features of high-resolution PAN images, adaptively adjusts the position and intensity at which the feature information is injected into the MS image, and produces fused images with clear textures. The advantage of detail injection into the MS image is that the spatial resolution of the fused image is improved while the spectral information quality is effectively maintained. The injection method is not restricted by the relative spectral relations between bands, and fusion can be carried out using a single band of the multispectral image and the panchromatic image, which helps improve the universality of the fusion method. Extensive simulation experiments verify the effectiveness of the proposed method.

Section snippets

Analysis of guided filtering

Guided image filtering (GIF) is an edge-preserving smoothing filter proposed by He et al. [10]. The filter completes the filtering process through a guide image, which can be either the input image itself or a separate texture map; when the guide image is the latter, GIF maintains edges well. Because GIF avoids ringing effects, it is widely used in image processing fields such as feature extraction, image fusion and super-resolution reconstruction.

Let the input
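As background, the sketch below gives a minimal box-filter implementation of the classic GIF of He et al. [10]. The window radius r and regularization eps are free parameters, and the variable names are ours rather than the paper's:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r=4, eps=1e-3):
    """Classic guided image filter (He et al.).
    I: guide image, p: input image; both 2-D float arrays in [0, 1].
    r: window radius; eps: regularization controlling edge preservation."""
    size = 2 * r + 1

    def mean(x):
        # box mean over each window w_k
        return uniform_filter(x, size=size)

    mean_I, mean_p = mean(I), mean(p)
    cov_Ip = mean(I * p) - mean_I * mean_p   # covariance of (I, p) per window
    var_I = mean(I * I) - mean_I ** 2        # variance of I per window

    a = cov_Ip / (var_I + eps)               # linear coefficients a_k
    b = mean_p - a * mean_I                  # offsets b_k
    # average the coefficients over all windows covering each pixel
    return mean(a) * I + mean(b)
```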

The proposed adaptive guided filtering (AGIF) based on the GLCM

Based on the above analysis of the traditional GIF model and the discussion of the GLCM characteristics for MS and PAN textural analysis, this section proposes an adaptive guided filtering model for remote sensing images based on the GLCM.

Let k be the current pixel to be processed, w_k the window centered on k, and M the number of pixels of the guide image; cor, ent and asm denote the correlation, entropy and angular second moment of the GLCM in the current window, respectively. The
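To illustrate the window statistics involved, the following sketch computes cor, ent and asm for a single window with scikit-image; since graycoprops does not expose an entropy property, ent is computed directly from the normalized GLCM. The gray-level count and offset here are illustrative choices, not the paper's settings:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_stats(window, levels=16):
    """Correlation, entropy and angular second moment of a window's GLCM.
    window: 2-D uint8 array (values 0-255); levels: quantized gray levels."""
    q = (window.astype(np.float64) / 256 * levels).astype(np.uint8)  # quantize
    glcm = graycomatrix(q, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    P = glcm[:, :, 0, 0]                         # normalized co-occurrence matrix
    cor = graycoprops(glcm, 'correlation')[0, 0]
    asm = graycoprops(glcm, 'ASM')[0, 0]
    ent = -np.sum(P[P > 0] * np.log2(P[P > 0]))  # GLCM entropy
    return cor, ent, asm
```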

Algorithm flow chart

The flow chart of the algorithm that is proposed in this paper is shown in Fig. 8.

Algorithm flow steps

Let the MS image that is to be fused after registration be F_MS and the PAN image that is to be fused be F_PAN. The specific implementation process of the fusion between F_MS and F_PAN is as follows:

  • Step 1. Perform the IHS transformation on the MS image, extract its intensity component I, and retain its hue and saturation (chrominance) components (a minimal sketch of this transform follows these steps).

  • Step 2. Let PAN be the input matrix. Calculate the edge guidance function G_PAN by
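A minimal sketch of the IHS decomposition in Step 1, assuming one common linear IHS variant (intensity as the band mean plus two chrominance axes); the paper's exact transform matrix may differ:

```python
import numpy as np

def ihs_forward(ms):
    """Linear IHS transform of a 3-band MS image (H, W, 3 float array).
    Returns the intensity I and two chrominance components v1, v2."""
    R, G, B = ms[..., 0], ms[..., 1], ms[..., 2]
    I = (R + G + B) / 3.0                      # intensity component
    v1 = (-R - G + 2.0 * B) / np.sqrt(6.0)     # chrominance axes of the
    v2 = (R - G) / np.sqrt(2.0)                # linear IHS transform
    return I, v1, v2

def ihs_inverse(I, v1, v2):
    """Rebuild RGB from (I, v1, v2), e.g. after detail injection into I."""
    R = I - v1 / np.sqrt(6.0) + v2 / np.sqrt(2.0)
    G = I - v1 / np.sqrt(6.0) - v2 / np.sqrt(2.0)
    B = I + 2.0 * v1 / np.sqrt(6.0)
    return np.stack([R, G, B], axis=-1)
```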

Experiment and analysis

In order to verify the effectiveness of the MS and PAN image fusion algorithm proposed in this paper, the experiments are divided into two groups: degraded-data experiments and real-data experiments. We sampled 100 sets of test images of different sizes from the Quickbird02 and Worldview02 data sets for testing; example images are shown in Fig. 9. The images in the test data set cover a variety of scenes, including vegetation areas (such as forests and farmland),
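For degraded-data experiments of this kind, reference-based quality indices are standard in the pansharpening literature. As one example (not necessarily among the indices used in this paper), the sketch below computes ERGAS, the relative dimensionless global error in synthesis:

```python
import numpy as np

def ergas(fused, reference, ratio=0.25):
    """ERGAS between a fused image and a reference MS image,
    both (H, W, bands) float arrays.
    ratio: PAN-to-MS pixel-size ratio, e.g. 0.25 for 4x pansharpening.
    Lower is better; 0 means a perfect match."""
    bands = reference.shape[-1]
    acc = 0.0
    for k in range(bands):
        rmse_k = np.sqrt(np.mean((fused[..., k] - reference[..., k]) ** 2))
        acc += (rmse_k / np.mean(reference[..., k])) ** 2  # per-band relative error
    return 100.0 * ratio * np.sqrt(acc / bands)
```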

Conclusion

This paper proposes an adaptive guided filtering algorithm based on a gradient-domain decision map for fusing MS and PAN images. According to the textural features of PAN images, an AGIF model based on the GLCM is proposed in order to extract the textural features from PAN images. In order to maintain the spectral quality of the fused image while injecting as much textural information as possible, this paper sets the injection location of the feature information by determining the decision map from

CRediT authorship contribution statement

Xianghai Wang: Conceptualization, Methodology, Funding acquisition. Shifu Bai: Methodology, Validation, Writing - original draft. Zhi Li: Formal analysis, Writing - original draft. Yuanqi Sui: Writing - review & editing. Jingzhe Tao: Investigation, Methodology, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study was funded by the National Natural Science Foundation of China (Grant Nos. 41671439, 41971388) and Innovation Team Support Program of Liaoning Higher Education Department (Grant No. LT2017013).

References (36)

  • C. Han et al., A remote sensing image fusion method based on the analysis sparse model, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. (2016)

  • R.M. Haralick et al., Texture features for image classification, IEEE Trans. Syst. Man Cybern. (1973)

  • K. He et al., Guided image filtering, IEEE Trans. Pattern Anal. Mach. Intell. (2013)

  • F. Kou et al., Gradient domain guided image filtering, IEEE Trans. Image Process. (2015)

  • H. Li et al., Infrared and visible image fusion using a deep learning framework, Computer Vision and Pattern Recognition (2018)

  • M. Li et al., Review of image fusion algorithm based on multiscale decomposition

  • X. Liu et al., Remote sensing image fusion based on two-stream fusion network

  • Y. Liu et al., Simultaneous image fusion and denoising with adaptive sparse representation, IET Image Process. (2014)