
Information Fusion

Volume 18, July 2014, Pages 148-160

Spatio-spectral fusion of satellite images based on dictionary-pair learning

https://doi.org/10.1016/j.inffus.2013.08.005

Abstract

This paper proposes a novel spatial and spectral fusion method for satellite multispectral and hyperspectral (or high-spectral) images based on dictionary-pair learning. By combining the spectral information from sensors with low spatial resolution but high spectral resolution (LSHS) and the spatial information from sensors with high spatial resolution but low spectral resolution (HSLS), this method aims to generate fused data with both high spatial and high spectral resolution. Based on the sparse non-negative matrix factorization technique, the method first extracts spectral bases of the LSHS and HSLS images, making full use of the rich spectral information in the LSHS data. The spectral bases of these two categories of data then form a dictionary-pair, owing to their correspondence in representing the pixel spectra of the LSHS and HSLS data, respectively. Subsequently, the HSLS image is spatially unmixed with respect to its learned dictionary to derive its representation coefficients. Combining the spectral bases of the LSHS data and the representation coefficients of the HSLS data, the fused data are finally derived, characterized by the spectral resolution of the LSHS data and the spatial resolution of the HSLS data. Experiments are carried out by comparing the proposed method with two representative methods on both simulated data and actual satellite images, including the fusion of Landsat/ETM+ and Aqua/MODIS data and the fusion of EO-1/Hyperion and SPOT5/HRG multispectral images. By visually comparing the fusion results and quantitatively evaluating them in terms of several measurement indices, it can be concluded that the proposed method is effective in preserving both spectral information and spatial details and performs better than the comparison approaches.

Introduction

A specific feature of remote sensing images, captured from different satellites and different sensors, is the tradeoff between spatial resolution and spectral resolution. This is caused, on the one hand, by system tradeoffs related to data volume and signal-to-noise ratio (SNR) limitations and, on the other hand, by the specific requirements of different applications for a high spatial or a high spectral resolution. For example, to fulfill the high spatial resolution requirement of many land-oriented applications, sensors with spatial resolutions of half a meter to tens of meters have been designed, including, but not limited to, ETM+ (30 m) on the Landsat platform, the sensor on QuickBird (2.4 m for multispectral bands), and the instruments on SPOT (2.5–10 m). Sensors like the MODerate resolution Imaging Spectroradiometer (MODIS) on board Aqua or Terra, the MEdium Resolution Imaging Spectrometer (MERIS) on board ENVISAT, and the Hyperion instrument on board the EO-1 satellite provide remote sensing images with high spectral resolution but low spatial resolution. Although these instruments differ in many properties, such as revisit period, swath width, and mission purpose (commercial or scientific), a given user can obtain large amounts of data from different instruments over a given study area. This motivates the development of new algorithms that obtain remote sensing data with the best resolution available by merging complementary information.

From the application point of view, remote sensing data with high spatial resolution are beneficial for the interpretation of satellite images and the extraction of spatial details of land cover, such as in land use/cover mapping and change detection [1], whereas remote sensing data with high spectral resolution are capable of identifying targets that cannot be easily distinguished by the human eye, such as in geological analysis and chemical contamination analysis [2]. To merge panchromatic (higher spatial resolution) and multispectral (lower spatial resolution) images from the same or separate sensors, pansharpening algorithms have been extensively studied in the past two decades [3], [4]. In this paper, we focus on the fusion of images from two categories of sensors: one category has low spatial resolution and high spectral resolution (hereinafter abbreviated as LSHS), such as Aqua (or Terra)/MODIS and EO-1/Hyperion, while the other has high spatial resolution but low spectral resolution (hereinafter abbreviated as HSLS), whose data are usually termed multispectral images, such as Landsat/ETM+ and SPOT5/HRG. Low and high resolution are used here in a relative sense. By integrating the spectral information of LSHS data and the spatial information of HSLS data, we expect to extend the applications of available satellite images, thereby meeting the various demands of data users.

To address this spatial and spectral fusion problem, one classic category of methods is spatial-unmixing-based algorithms [5], [6], [7], [8], [9]. The processing steps of these methods are: (1) geometric co-registration of the HSLS and LSHS images; (2) multispectral classification of the high spatial resolution (HSLS) image to unmix the low spatial resolution (LSHS) image; and (3) determination of the class spectra via regularized linear unmixing. These methods showed good performance at fusing Landsat images with MERIS [6], [7], [9] or ASTER [8] images for land applications. However, it should be noted that whether pure spectra exist depends on the spatial resolution of the given HSLS sensor. These methods also place high demands on the geometric registration accuracy of the given data (e.g., misregistration below 0.1–0.2 of the low-resolution pixel size according to Ref. [5]). The authors in [10], [11], [12] proposed to fuse multispectral and hyperspectral images in a maximum a posteriori (MAP) estimation framework by establishing an observation model between the desired image and the known image. The method reported in [10] employed a spatially varying statistical model to help exploit the correlations between multispectral and hyperspectral images. The methods developed in [11], [12] made use of a stochastic mixing model of the underlying scene content to enhance the spatial resolution of the hyperspectral image. The authors in [13] proposed to improve the spatial resolution of a hyperspectral image by fusing it with high-resolution panchromatic images based on a super-resolution technique: first, the method learns the spatial information from panchromatic images by sparse representation; the high-resolution hyperspectral image is then constructed from the learned spatial structures with spectral regularization.
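
The three spatial-unmixing steps can be sketched as follows. This is only a minimal illustration under simplifying assumptions (an already co-registered pair, an integer resolution ratio, and a plain ridge-regularized least-squares solve rather than the specific regularization of [5], [6], [7], [8], [9]); all names are illustrative.

```python
import numpy as np

def spatial_unmixing_fusion(hsls_labels, lshs_pixels, n_classes, ridge=1e-3):
    """Sketch of the class-spectra step in spatial-unmixing-based fusion.

    hsls_labels : (H, W) int array -- per-pixel class map obtained by
                  classifying the co-registered HSLS image (step 2).
    lshs_pixels : (h, w, B) array -- LSHS image; each coarse pixel is
                  assumed to cover an r x r block of HSLS pixels.
    Returns per-class spectra estimated by ridge-regularized least squares
    (a stand-in for the regularized linear unmixing of step 3).
    """
    H, W = hsls_labels.shape
    h, w, B = lshs_pixels.shape
    r = H // h  # resolution ratio (assumed integer)

    # Abundance matrix A: fraction of each class inside every coarse
    # LSHS pixel, counted from the HSLS class map.
    A = np.zeros((h * w, n_classes))
    for i in range(h):
        for j in range(w):
            block = hsls_labels[i*r:(i+1)*r, j*r:(j+1)*r]
            counts = np.bincount(block.ravel(), minlength=n_classes)
            A[i*w + j] = counts / counts.sum()

    # Regularized linear unmixing: solve A @ S ~= Y for class spectra S,
    # with a small ridge term for numerical stability.
    Y = lshs_pixels.reshape(h * w, B)
    S = np.linalg.solve(A.T @ A + ridge * np.eye(n_classes), A.T @ Y)
    return S  # (n_classes, B) class spectra at LSHS spectral resolution
```

With the class spectra in hand, the fused image assigns to each HSLS pixel the spectrum of its class, which is what makes registration accuracy so critical.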

Due to the low spatial resolution or the existence of homogeneous mixtures in hyperspectral images, unmixing techniques are needed to decompose the observed mixed pixels into a set of constituents and the corresponding fractional coefficients, which denote the proportion of each constituent [14]. The first step in this spectral unmixing task is to collect a suitable set of endmembers to model the spectra of measured pixels by weighting these spectral signatures. There are two categories of endmembers according to the extraction method: image endmembers, derived directly from the images, and library endmembers, derived from known target materials measured in the field or laboratory [15]. Employing library endmembers is risky, however, because it is difficult to ensure that these spectra were captured under the same physical conditions as the observed data. Image endmembers avoid this problem because they are collected at the same scale as the observed data and are therefore more easily linked to scene features [14]. A number of endmember extraction algorithms for hyperspectral data were quantitatively compared in [15]. In this paper, we employ a similar endmember extraction strategy for both HSLS and LSHS data and term the extracted spectral bases dictionaries. For the second step, abundance estimation, a popular and effective method is sparse unmixing [16]. The basic principle of this method rests on the observation that only a small subset of endmembers participates in the formation of each mixed pixel. Therefore, with sparsity-inducing regularizers, i.e., the constraint of a few non-zero components in the abundance vectors, the abundances can be estimated by linear sparse regression techniques [16]. In this paper, we adopt a similar sparse regularization when solving for the representation coefficients of HSLS and LSHS data with respect to their representation atoms (or dictionaries).
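
As an illustration of the sparse regression step, the following is a minimal non-negative ISTA solver for one pixel's abundances; it is a generic sketch, not the specific algorithm of [16], and the function name and parameters are illustrative.

```python
import numpy as np

def sparse_unmix(pixel, dictionary, lam=1e-3, n_iter=2000):
    """Estimate sparse, non-negative abundances for one mixed pixel.

    Solves  min_a 0.5*||pixel - D a||^2 + lam*||a||_1  subject to a >= 0
    with projected ISTA. `dictionary` D is (bands, atoms); the L1 term
    encodes the expectation that only a few atoms (endmembers) are
    active in any given mixed pixel.
    """
    D = dictionary
    a = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D.T @ D, 2)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = D.T @ (D @ a - pixel)
        # gradient step, L1 shrinkage, and projection onto a >= 0
        a = np.maximum(a - step * (grad + lam), 0.0)
    return a
```

Larger values of `lam` drive more abundance components exactly to zero, trading reconstruction accuracy for sparsity.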

In this paper, we seek to extract a dictionary-pair for representing LSHS and HSLS data, respectively. Specifically, the representation atoms of the LSHS and HSLS data are first extracted from the given images and then combined to form the dictionary-pair. The atoms of the two dictionaries are in one-to-one correspondence. Accordingly, each pixel spectrum of the LSHS and HSLS data can be expressed as a linear combination of the corresponding dictionary atoms, which play the same role as endmembers in hyperspectral unmixing. Based on this dictionary-pair, the proposed spatio-spectral fusion algorithm consists of two stages. In the first stage, the good spectral properties of the LSHS image are employed to extract the basis functions of the spectra (representation atoms) and to form the dictionary-pair by enforcing the same representation coefficients for the HSLS and LSHS images with respect to their dictionaries. In the second stage, the good spatial properties of the HSLS image are utilized to derive its representation coefficients with respect to its dictionary. These representation coefficients function like abundances in hyperspectral unmixing but carry spatial location properties owing to the high spatial resolution of the HSLS image. Finally, the desired high spatial and high spectral resolution image is obtained by multiplying the representation atoms of the LSHS image (i.e., the LSHS dictionary) by the representation coefficients of the HSLS image.
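
Given an already-learned dictionary-pair, stage two and the final reconstruction reduce to a per-pixel regression followed by a matrix product. The sketch below uses plain non-negative least squares in place of the sparse-regularized solve described above, and all names are illustrative.

```python
import numpy as np
from scipy.optimize import nnls

def dictionary_pair_fuse(lshs_dict, hsls_dict, hsls_pixels):
    """Fuse using an already-learned dictionary-pair (stage 1 assumed done).

    lshs_dict   : (B_lshs, K) spectral atoms learned from the LSHS image.
    hsls_dict   : (B_hsls, K) paired atoms for the HSLS bands; the atoms
                  correspond one-to-one, so coefficients are shared.
    hsls_pixels : (B_hsls, N) HSLS pixel spectra, one column per pixel.
    Returns the (B_lshs, N) fused image: LSHS spectral resolution at the
    HSLS spatial resolution.
    """
    K = hsls_dict.shape[1]
    N = hsls_pixels.shape[1]
    coeffs = np.zeros((K, N))
    for n in range(N):  # stage 2: unmix each HSLS pixel (non-negative LS)
        coeffs[:, n], _ = nnls(hsls_dict, hsls_pixels[:, n])
    return lshs_dict @ coeffs  # reconstruct with the LSHS dictionary
```

The final multiplication is where the fusion happens: coefficients carrying HSLS spatial detail are re-expanded in the spectrally rich LSHS basis.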

The following section presents the theoretical basis of this paper. Section 3 describes the proposed method for the fusion of HSLS and LSHS data. Section 4 shows the experimental validation of the proposed algorithm through comparison with two representative algorithms on both simulated and actual satellite datasets. Finally, we conclude this paper with a discussion on the application of the proposed method and remarks about its inherent features.

Section snippets

Theoretical basis

As introduced in Section 1, a dictionary-pair needs to be trained from the HSLS and LSHS data. Hence, the basic principles of dictionary-pair learning will first be introduced. Taking into account the non-negativity of the basis spectra and fractional abundances of HSLS and LSHS data, we learn the required dictionary-pair by using the sparse non-negative matrix factorization (NMF) method, which will be presented in the second part of this section.
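
A minimal sparse NMF sketch with multiplicative updates and an L1 penalty on the coefficients is given below. This is a generic variant, not necessarily the exact update rules of the paper, and the names are illustrative.

```python
import numpy as np

def sparse_nmf(X, k, lam=0.01, n_iter=500, seed=0):
    """Factorize X (bands x pixels) as W @ H with W, H >= 0 and an L1
    penalty on H. The columns of W act as non-negative spectral basis
    atoms (the learned dictionary); H holds the sparse representation
    coefficients. Multiplicative updates keep both factors non-negative
    throughout the iterations.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + 1e-3
    H = rng.random((k, X.shape[1])) + 1e-3
    eps = 1e-10
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + lam + eps)  # L1-penalized H update
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        nrm = W.sum(axis=0, keepdims=True)          # rescale atoms to unit
        W /= nrm                                    # column sum, compensating
        H *= nrm.T                                  # H so W @ H is preserved
    return W, H
```

Running the factorization on the LSHS pixel matrix yields the spectral bases; the column rescaling fixes the scale ambiguity between the dictionary and the coefficients.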

Proposed methodology

Based on the theoretical basis introduced in Section 2, this section presents the proposed method based on dictionary-pair learning and sparse NMF. Given two satellite datasets with LSHS and HSLS, respectively, the purpose of this paper is to combine their complementary information to obtain a dataset with the spectral information of the LSHS dataset and the spatial information of the HSLS dataset (we abbreviate hereafter the desired dataset with high spatial resolution and high spectral resolution as

Experimental results and comparisons

In this section, we apply the proposed algorithm to both simulated data and actual satellite data and compare it with the spatial unmixing method proposed in [7] and the sparse representation with spectral regularization method proposed in [13]. For the actual satellite images, we take the Landsat-7/ETM+ reflectance as the HSLS image and the Aqua/MODIS reflectance as the LSHS image in their VIR (visible and infrared) spectrum to obtain the fused image, which is characterized by the spatial
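
The measurement indices used in the experiments are not named in this excerpt; one index widely used to quantify spectral distortion in such comparisons is the spectral angle mapper (SAM), sketched generically here.

```python
import numpy as np

def spectral_angle(fused, reference):
    """Mean spectral angle (radians) between fused and reference images.

    Both inputs are (bands, pixels) matrices. For each pixel the angle
    between its fused and reference spectra is computed; 0 means the
    spectral shapes are identical (SAM is insensitive to per-pixel
    scaling), and the per-pixel angles are averaged over the image.
    """
    num = np.sum(fused * reference, axis=0)
    den = np.linalg.norm(fused, axis=0) * np.linalg.norm(reference, axis=0)
    cos = np.clip(num / np.maximum(den, 1e-12), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))
```

Lower values indicate better preservation of the reference spectral information in the fused result.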

Conclusion

We proposed a spatial and spectral fusion method based on dictionary-pair learning. This method is devised for the fusion of two categories of remote sensing data: one category possesses coarse spatial details, wide spectrum coverage and more spectral bands, termed data with low spatial resolution and high spectral resolution (LSHS); the other category is characterized by fine spatial details, narrow spectrum coverage and fewer spectral bands, termed data with high spatial

References

  • D.J. Weydahl et al.

    Comparison of RADARSAT-1 and IKONOS satellite images for urban features detection

    Inform. Fusion

    (2005)
  • D. Landgrebe

    Hyperspectral image data analysis

    IEEE Signal Process. Mag.

    (2002)
  • B. Aiazzi et al.

    A comparison between global and context-adaptive pansharpening of multispectral images

    IEEE Geosci. Remote Sens. Lett.

    (2009)
  • I. Amro et al.

    A survey of classical methods and new trends in pansharpening of multispectral images

    EURASIP J. Adv. Signal Process.

    (2011)
  • B. Zhukov et al.

    Unmixing-based multisensor multiresolution image fusion

    IEEE Trans. Geosci. Remote Sens.

    (1999)
  • A. Minghelli-Roman et al.

    Spatial resolution improvement of MeRIS images by fusion with TM images

    IEEE Trans. Geosci. Remote Sens.

    (2001)
  • R. Zurita-Milla et al.

    Unmixing-based Landsat TM and MERIS FR data fusion

    IEEE Geosci. Remote Sens. Lett.

    (2008)
  • N. Mezned et al.

A comparative study for unmixing based Landsat ETM+ and ASTER image fusion

    Int. J. Appl. Earth Obs. Geoinf.

    (2010)
  • J. Amorós-López et al.

Regularized multiresolution spatial unmixing for ENVISAT/MERIS and Landsat/TM image fusion

    IEEE Geosci. Remote Sens. Lett.

    (2011)
  • R.C. Hardie et al.

    MAP estimation for hyperspectral image resolution enhancement using an auxiliary sensor

    IEEE Trans. Image Process.

    (2004)
  • M.T. Eismann et al.

    Application of the stochastic mixing model to hyperspectral resolution enhancement

    IEEE Trans. Geosci. Remote Sens.

    (2004)
  • M.T. Eismann et al.

Hyperspectral resolution enhancement using high-resolution multispectral imagery with arbitrary response functions

    IEEE Trans. Geosci. Remote Sens.

    (2005)
  • Y. Zhao et al.

    Hyperspectral imagery super-resolution by sparse representation and spectral regularization

    EURASIP J. Adv. Signal Process.

    (2011)
  • N. Keshava et al.

    Spectral unmixing

    IEEE Signal Proc. Mag.

    (2002)
  • A. Plaza et al.

    A quantitative and comparative analysis of endmember extraction algorithms from hyperspectral data

    IEEE Geosci. Remote Sens. Lett.

    (2004)
  • M.-D. Iordache et al.

    Sparse unmixing of hyperspectral data

    IEEE Trans. Geosci. Remote Sens.

    (2011)
  • I. Tošić et al.

    Dictionary learning

    IEEE Signal Process. Mag.

    (2011)
  • M. Aharon et al.

    K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation

    IEEE Trans. Signal Process.

    (2006)