Signal Processing

Volume 170, May 2020, 107434

Image fusion employing adaptive spectral-spatial gradient sparse regularization in UAV remote sensing

https://doi.org/10.1016/j.sigpro.2019.107434

Highlights

  • Spectrum consistency, meaning that signal changes along the spectral direction form a smooth function, is investigated in a fusion model for the first time.

  • Spatial adaptivity is also introduced to reduce spectral distortion in the fusion process.

  • The separable approximation method and augmented Lagrangian method are employed to solve the optimization problem.

  • Our method is compared with other state-of-the-art fusion algorithms, and good performance is verified by UAV datasets.

  • Furthermore, we apply our fusion algorithm to vegetation phenotyping, demonstrating its significant research value in UAV remote sensing.

Abstract

Unmanned aerial vehicle (UAV) remote sensing has been widely used in vegetation phenotyping and precision agriculture. The fusion of UAV multispectral and panchromatic images has considerable research value; for example, an accurate vegetation index can be obtained. However, large geometrical distortions are observed in UAV images, contributing to the insufficiency of existing fusion algorithms. Spectrum consistency, which means that signal changes along the spectral direction form a smooth function, is investigated in this paper to solve the above problem. Spatial adaptivity is also introduced to reduce spectral distortion in the fusion process. Based on these two aspects, a multispectral and panchromatic image fusion model employing adaptive spectral-spatial gradient sparse regularization is proposed for UAV remote sensing. The separable approximation and augmented Lagrangian methods are employed to optimize this model. In the experiments, the proposed method is first compared with other state-of-the-art fusion algorithms, and good performance is verified on UAV datasets in terms of visual effect and objective quality analysis. Second, the fusion algorithm is applied to vegetation phenotyping. The experiments demonstrate that accurate vegetation indices can be generated by adopting the proposed algorithm, which proves its substantial research value in UAV remote sensing.

Introduction

Unmanned aerial vehicle (UAV) remote sensing is widely used in vegetation phenotyping and precision agriculture [1], [2]. To calculate vegetation indices such as the normalized difference vegetation index (NDVI) [3], typical methods use the original UAV multispectral (MS) image to generate the NDVI distribution map. This map may be inaccurate because of the low spatial resolution of the original MS images: MS images normally have high spectral resolution but low spatial resolution, whereas a panchromatic (Pan) image has high spatial resolution and low spectral resolution [4], [5]. Some spatial information is evidently lost in MS images. Therefore, fusing MS and Pan images can yield an image with simultaneously improved spatial and spectral resolution, which in turn improves the resolution of the NDVI distribution map. This improvement is the motivation of the present work.
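For reference, NDVI is computed per pixel from the near-infrared (NIR) and red band responses:

```latex
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}}
```

Its spatial resolution is therefore bounded by that of the bands used to compute it, which is why injecting Pan detail through fusion sharpens the NDVI map.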

A large collection of fusion methods has been proposed over the past decades. These fusion algorithms are generally divided into four classes: component substitution (CS), multiresolution analysis (MRA), deep learning (DL) and model-based algorithms. The CS methods are performed in the following steps: upsampling, transform, intensity matching, substitution, and reverse transform. In this strategy, the spatial intensity is separated from the spectral information and then replaced with the Pan image. Typical CS algorithms include intensity hue saturation (IHS) [6], principal component analysis [7], and their combinations. Kang et al. [8] proposed another kind of CS method, pansharpening with a matting model, which substitutes the alpha channel of the MS image with the Pan image to reconstruct the fusion image. Generally speaking, CS methods can be implemented efficiently and perform well in preserving spatial information; however, they often cause serious spectral distortion. To reduce the spectral distortion, MRA-based methods have been widely investigated. These methods extract the high-frequency component of the Pan image, that is, the detail part of the spatial information, and then add it to the MS image to form the fusion image. Examples of these methods are high-pass filtering (HPF) [6], wavelets [9], the Laplacian pyramid (LP) [10], and à-trous-wavelet-transform-based pansharpening (AWLP) [11]. In 2014, Shahdoosti et al. [12] proposed an optimal filter-based method (OFMP) that preserves the spectral quality of the expanded MS images and improves the spatial quality by minimizing a tradeoff objective function, so that the fusion image has a better visual effect. Compared with CS methods, MRA-based methods preserve spectral information efficiently, but their spatial structure is not always satisfactory. DL-based methods form a new branch of fusion methods. A new infrared and visible image fusion method based on generative adversarial networks was proposed in [13]; it establishes an end-to-end model that avoids the complicated manual design of fusion rules in traditional fusion methods. Inspired by convolutional neural network (CNN)-based super-resolution methods, Masi et al. proposed a pansharpening method using a convolutional neural network (PNN) [14] and achieved promising experimental results. Other similar works include deep residual learning [15] and a multi-scale CNN [16] for Pan/MS fusion. DL-based methods can provide good fusion performance, but they require a large amount of training data to learn specialized models for different types of input samples, thereby sacrificing the flexibility of the fusion method. For example, a DL fusion model learned from IKONOS satellite data may be unsuitable for Pan/MS fusion of data from other satellites.
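To make the generic CS pipeline concrete, the following is a minimal NumPy sketch of a generalized IHS-style substitution (upsample, compute intensity, match, substitute). It illustrates the strategy described above rather than any specific published algorithm, and the band-average intensity component is an assumption:

```python
import numpy as np

def cs_fusion_gihs(ms_up, pan):
    """Generalized IHS-style component substitution (illustrative sketch).

    ms_up : (H, W, B) MS image already upsampled to the Pan grid.
    pan   : (H, W) panchromatic image.
    """
    # Transform: separate a spatial intensity component from the spectra.
    # A plain band average is a common (assumed) choice of intensity.
    intensity = ms_up.mean(axis=2)
    # Intensity matching: align the Pan image's mean/std to the intensity.
    pan_matched = (pan - pan.mean()) / (pan.std() + 1e-12)
    pan_matched = pan_matched * intensity.std() + intensity.mean()
    # Substitution + reverse transform: inject the detail into every band.
    return ms_up + (pan_matched - intensity)[..., None]
```

The injected detail term is what transfers Pan spatial structure into every band, and also why CS methods tend to distort spectra: every band receives the same correction regardless of its spectral response.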

In recent years, model-based methods have attracted considerable interest in the field of image fusion. To retain more spectral and spatial information, Ballester et al. [17] proposed a variational method called P+XS image fusion, which constructs an energy function based on a certain hypothesis and then optimizes it to obtain the optimal solution. It assumes that the sampled multispectral images are blurred images and forces the edges of the fused image to line up with those in the Pan image. Some model-based methods build on sparse theory. Specifically, the Pan image is modeled as a linear combination of the ideal multispectral bands, and the MS image is modeled as a blurred and noisy low-resolution version of the ideal MS image. The problem is then to restore the ideal MS image from its low-resolution version, which is an ill-posed inverse problem. As its solution is not unique, sparsity regularization is usually added to the fusion model [18]. The total variation (TV) regularization-based image restoration model was introduced by Rudin et al. [19]. It is based on the minimization of an energy that combines two types of information: a data-fitting term and a gradient-sparsity regularization term. It performs well in preserving the image and its edges while removing noise (e.g., [20], [21]). In 2014, Palsson et al. proposed a pansharpening algorithm based on TV [22]. Ma et al. [23], [24] proposed a fusion strategy for infrared (IR) and visible images based on TV minimization. Chen et al. [25], [26] formulated image fusion as a convex optimization problem that minimizes a linear combination of a least-squares fitting term and a dynamic gradient sparsity function (SIRF) to enhance fusion performance. These model-based methods have super-resolution capability and can acquire higher spatial and spectral resolutions with less spectral distortion.
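A representative, schematic form of such a model-based objective (the exact ASSGSR formulation is developed in Section 2) is

```latex
\min_{\mathbf{X}} \; \frac{1}{2}\,\bigl\lVert \mathcal{D}\mathcal{B}(\mathbf{X}) - \mathbf{M} \bigr\rVert_F^2 \;+\; \lambda \, \varphi(\nabla \mathbf{X}),
```

where $\mathbf{X}$ is the ideal high-resolution MS image, $\mathcal{D}\mathcal{B}$ denotes blurring and downsampling, the first term fits the observed MS image $\mathbf{M}$, and $\varphi$ is a gradient-sparsity regularizer (e.g., TV, or a group-sparse norm tied to the Pan gradients as in SIRF).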

Although the existing fusion algorithms work well in some respects, preserving spectral information and enhancing spatial information remain challenging problems in image fusion for remote sensing, especially for UAVs. Different from satellite remote sensing, a UAV flies close to surface targets, and fisheye lenses are often used to expand the field of view. This condition leads to large geometrical distortions and errors in the registration process, which may introduce considerable spectral distortion into the fusion results. As a result, vegetation indices calculated from ratios of MS image bands become inaccurate. Signal changes in the spectral direction always form a smooth function, a property called 'spectrum consistency' in this paper. Spectrum consistency can be explained as follows: the responses of adjacent spectral bands are always close, so gradients in the spectral direction are mostly small. Fig. 1(a) shows an example of the distribution of MS image bands. In Fig. 1(b), the blue line indicates a one-dimensional signal in the spectral direction taken from Fig. 1(a). The red line is an example of 'spectrum inconsistency', which describes a situation of considerable spectral distortion. Spectrum consistency, which can be employed to reduce spectral distortion, has never been considered in previous fusion methods.
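One natural way to formalize spectrum consistency is through the first-order difference along the spectral (band) dimension. For a fused image $\mathbf{X} \in \mathbb{R}^{m \times n \times b}$,

```latex
(\nabla_s \mathbf{X})_{i,j,k} = \mathbf{X}_{i,j,k+1} - \mathbf{X}_{i,j,k}, \qquad k = 1, \ldots, b-1,
```

and spectrum consistency asserts that these spectral-direction gradients are mostly small, so penalizing their magnitude discourages the abrupt band-to-band jumps illustrated by the red line in Fig. 1(b). This reading is our formalization of the definition; the paper's exact regularizer appears in Section 2.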

In this paper, we propose a novel image fusion method employing adaptive spectral-spatial gradient sparse regularization (ASSGSR). We assume that the downsampled fusion image should be close to the input MS image, so a least-squares fitting term between the downsampled fusion image and the input MS image is formulated to maintain the spectral information. A gradient-sparsity constrained term is used to carry the spatial information of the Pan image into the fusion image. The main contributions of the proposed method can be summarized as follows:

(1) For registered MS images and Pan images, the gradient information should not only be group sparse in the spectral and spatial directions but should also be spectrum consistent. The property of spectrum consistency is further formulated into a fusion model, which is used to reduce the spectral distortion. We believe that this is the first time that this property has been considered in image fusion.

(2) In addition, gradient sparsity indicates that the main difference between the Pan and MS images always lies in edge regions. Therefore, we introduce spatial adaptivity into the fusion process. Small weights are selected for the edge regions in the gradient-sparsity constrained term, which keeps the spectral information consistent with the MS pixels and thereby reduces spectral distortion. In smooth regions, the weight of the gradient-sparsity constrained term is large, so that the spatial information of the fusion image stays similar to that of the Pan image (a weighting scheme of this kind is sketched after this list).

(3) Based on the two aforementioned aspects, a novel fusion model is proposed. The separable approximation and augmented Lagrangian methods are employed to solve the resulting convex optimization problem.

(4) In the experiments, we demonstrate that accurate vegetation indices can be generated by adopting the proposed fusion algorithm. The present study may be the first work on image fusion in the area of vegetation phenotyping for UAV remote sensing.
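As a concrete illustration of the spatial adaptivity in contribution (2), one simple weighting scheme derives the weight map from the Pan gradient magnitude, down-weighting the gradient-sparsity term at edges. The functional form below is a hypothetical sketch, not the paper's scheme:

```python
import numpy as np

def adaptive_weights(pan, alpha=10.0):
    """Hypothetical spatial-adaptivity weight map (illustration only).

    Small weights at edges (large Pan gradient magnitude) relax the
    gradient-sparsity term there, letting the spectra follow the MS
    pixels; large weights in smooth regions keep the fused spatial
    structure close to the Pan image.
    """
    gy, gx = np.gradient(pan.astype(float))
    grad_mag = np.hypot(gx, gy)
    grad_mag /= grad_mag.max() + 1e-12  # normalize to [0, 1]
    return 1.0 / (1.0 + alpha * grad_mag)
```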

The content of this paper is arranged as follows: in Section 2, we propose the image fusion algorithm based on adaptive spectral-spatial gradient sparse regularization. The experimental results employing satellite datasets and UAV datasets are shown in Section 3 and Section 4, respectively. Finally, we conclude the paper in Section 5.

Section snippets

Notation

For convenience, scalars are denoted by lowercase letters, and matrices are denoted by bold capital letters. Lowercase bold letters denote the stacked column vectors of a matrix. $\mathbf{P} \in \mathbb{R}^{m \times n}$ denotes a Pan image and $\mathbf{M} = [\mathbf{M}_1, \ldots, \mathbf{M}_b] \in \mathbb{R}^{\frac{m}{c} \times \frac{n}{c} \times b}$ denotes an MS image, where m and n represent the numbers of rows and columns, b represents the number of spectral bands, and c denotes the ratio of the resolution of Pan images to that of MS images. For instance, in the Quickbird satellite multispectral data, the resolution of
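To make these shape conventions concrete, a minimal illustration (the numeric values are hypothetical):

```python
import numpy as np

# Shape conventions from the Notation section (values are illustrative):
m, n, b, c = 512, 512, 4, 4        # Pan rows/cols, MS bands, resolution ratio
P = np.zeros((m, n))               # Pan image,  P in R^{m x n}
M = np.zeros((m // c, n // c, b))  # MS image,   M in R^{(m/c) x (n/c) x b}
```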

Fusion experiments

We use real data captured by a Parrot Sequoia MS camera mounted on a UAV in the following experiments. The Parrot Sequoia camera includes high- and low-spatial-resolution RGB and MS sensors, respectively. The MS sensor contains infrared (735 ± 10 nm), near-infrared (790 ± 40 nm), red (660 ± 40 nm) and green (550 ± 40 nm) bands. The two types of images should be accurately aligned before fusion [33], [34]. The RGB image is converted to a greyscale image to obtain the Pan image. For convenience, only
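The text does not state the exact greyscale conversion, so the following sketch assumes standard ITU-R BT.601 luminance weights:

```python
import numpy as np

def rgb_to_pan(rgb):
    """Collapse an (H, W, 3) RGB frame to a single-channel Pan image.

    The BT.601 luminance weights are an assumption; the paper only
    says the RGB image is converted to greyscale.
    """
    return rgb[..., :3].astype(float) @ np.array([0.299, 0.587, 0.114])
```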

Vegetation phenotype experiments

The experiments in Section 3 demonstrate that the ASSGSR method has good performance in image fusion for UAV remote sensing. This good performance motivates the application of this method to vegetation phenotypes. For example, NDVI is a normalized index that measures the vegetation coverage considering the near-infrared and red bands of the MS image. Considering that MS and fusion images respectively have low and high spatial resolutions, the NDVI obtained by ASSGSR will be more accurate than
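In code, the NDVI distribution map is computed per pixel from the fused NIR and red bands. The band indices below follow the Sequoia band order listed in Section 3 and are an assumption about the data layout:

```python
import numpy as np

def ndvi_map(fused, nir_band=1, red_band=2, eps=1e-12):
    """NDVI distribution map from a fused (H, W, B) image.

    nir_band/red_band index the near-infrared and red bands; the
    ordering (red-edge, NIR, red, green) is assumed from the text.
    """
    nir = fused[..., nir_band].astype(float)
    red = fused[..., red_band].astype(float)
    return (nir - red) / (nir + red + eps)
```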

Conclusions

An image fusion strategy for UAV remote sensing is investigated in this paper. UAV remote sensing images always exhibit serious geometrical distortion, making the registration of the MS image with the Pan image difficult. Therefore, traditional fusion methods do not work well, which causes significant spectral distortion in the fusion results. Notably, the gradient information of registered MS and Pan images should not only be group sparse in the spectral and spatial directions but should

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors gratefully acknowledge the financial support from the National Natural Science Foundation of China (grant No. 61971315), the Hubei Provincial Natural Science Foundation of China (grant No. 2018CFB435), and the Fundamental Research Funds for the Central Universities (grant No. 2042018kf1009).

References (43)

  • X. Kang et al., Pansharpening with matting model, IEEE Trans. Geosci. Remote Sens. (2014)
  • J. Zhou et al., A wavelet transform method to merge Landsat TM and SPOT panchromatic data, Int. J. Remote Sens. (1998)
  • B. Aiazzi et al., A comparison between global and context-adaptive pansharpening of multispectral images, IEEE Geosci. Remote Sens. Lett. (2009)
  • Y. Kim et al., Improved additive-wavelet image fusion, IEEE Geosci. Remote Sens. Lett. (2011)
  • H.R. Shahdoosti et al., Fusion of MS and PAN images preserving spectral quality, IEEE Geosci. Remote Sens. Lett. (2014)
  • G. Masi et al., Pansharpening by convolutional neural networks, Remote Sens. (Basel) (2016)
  • Y. Wei et al., Boosting the accuracy of multispectral image pansharpening by learning a deep residual network, IEEE Geosci. Remote Sens. Lett. (2017)
  • Q. Yuan et al., A multiscale and multidepth convolutional neural network for remote sensing imagery pan-sharpening, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. (2018)
  • C. Ballester et al., A variational model for P+XS image fusion, Int. J. Comput. Vis. (2006)
  • S. Li et al., A new pan-sharpening method using a compressed sensing technique, IEEE Trans. Geosci. Remote Sens. (2011)
  • L.I. Rudin et al., Nonlinear total variation based noise removal algorithms, Eleventh International Conference of the Center for Nonlinear Studies on Experimental Mathematics: Computational Issues in Nonlinear Science (1992)