
Neurocomputing

Volume 387, 28 April 2020, Pages 150-160

Spectral-spatial classification for hyperspectral image based on a single GRU

https://doi.org/10.1016/j.neucom.2020.01.029

Abstract

Deep learning methods have been successfully used to extract deep features for many hyperspectral tasks. Multiple neural networks have been introduced for the classification of hyperspectral images, such as the convolutional neural network (CNN) and the recurrent neural network (RNN). In this study, we offer a different perspective on the hyperspectral pixel-level classification task. Most existing methods employ complex models for this task, and their efficiency is often ignored. Based on this observation, we propose an effective tiny model for spectral-spatial classification of hyperspectral images based on a single gated recurrent unit (GRU). In our approach, the core GRU learns spectral correlations within the whole input spectrum, and spatial information is fused in as the initial hidden state of the GRU. In this way, spectral and spatial features are computed and combined within a single GRU. Comparisons across different RNN utilization patterns and a variety of spatial-information fusion methods demonstrate that our approach has a competitive advantage in both accuracy and efficiency.

Introduction

Modern hyperspectral sensors can capture high-spectral-resolution data with up to hundreds of bands, allowing very similar materials and objects to be distinguished. This rich spectral information offers great potential for classification [1], [2], [3], [4]. Hence, the analysis of hyperspectral imagery has attracted broad attention in remote sensing. Hyperspectral images (HSIs) contain abundant spectral and spatial information and have been widely applied in many fields such as agriculture, mining, environmental monitoring, and land-cover mapping [5], [6], [7], [8], [9].

HSI classification aims to assign each pixel vector to one of a discrete set of specific classes. Many traditional approaches have concentrated on processing spectral features. Some rely exclusively on subtle spectral differences to determine class membership, such as random forests [10], [11], support vector machines (SVMs) [12], [13], and sparse representation models [14], [15], [16], [17]. However, these methods depend on hand-crafted features and, due to this limitation, cannot extract robust deep feature representations.

Unlike traditional classifiers, deep learning methods exploit high-level features and can thereby acquire more complex structural representations [18], [19], [20], [21]. In particular, the convolutional neural network (CNN) and the recurrent neural network (RNN) have achieved great success in a variety of computer vision tasks [22], [23], [24]. Exploiting CNN's local connectivity and weight sharing, Wu et al. [25] directly used the spectral feature of the original image data as an input vector and applied a 1D-CNN to spectral HSI classification. However, owing to its limited kernel size, a 1D-CNN can learn only local spectral dependencies. A few works treat the spectral data as a sequence [26], [27], [28], [29], [30]; naturally, the RNN becomes a candidate model given its strength in processing sequence data. Mou et al. [26] modeled the spectrum of a hyperspectral pixel as a 1D sequence for classification and showed that the GRU is a better choice for HSI classification than the long short-term memory (LSTM) cell. The spectrum was fed into an RNN composed of multiple GRUs: each band was expanded and delivered to the corresponding GRU, so the number of GRUs in the entire network equals the number of bands of the hyperspectral data. This approach achieved competitive performance and showed the huge potential of deep recurrent networks for hyperspectral data analysis. Nevertheless, it used only spectral information, and the entire network of hundreds of GRUs incurred a heavy computational cost.

Many researchers have also developed approaches that consider spatial information [31], [32], [33], [34], [35], [36], [37]. For instance, Chen et al. [35] proposed a 3D-CNN network that directly learns spectral-spatial features over both the spatial and spectral axes, but its computational complexity increases dramatically. In addition, combinations of CNN and RNN have been developed for hyperspectral image analysis. Xu et al. [36] proposed a unified network with a bands-grouping-based LSTM and an MSCNN as the spectral and spatial feature extractors, respectively. Mei et al. [37] proposed concatenating a spatial-attention CNN branch and a spectral-attention bidirectional RNN branch to learn joint features.

So far, the deep learning methods mentioned above have yielded good results. However, most of them combine complex models, which leads to heavy computational burdens, and their efficiency is easily overlooked. Accordingly, aiming at computational efficiency, we design a simple yet effective method for hyperspectral pixel-level classification.

Typically, as shown in the unfolded RNN structure (see the left part of Fig. 1), the number of internal units is tied to the number of timesteps, and the recurrent connections link the hidden units of consecutive timesteps. Inspired by the RNN and its variants [28], [36], [37], [38], the RNN component at each timestep (such as a GRU or LSTM unit) can take as input not only a single value but also a subsequence. We therefore design multiple comparison experiments that feed sub-sequences of different lengths per timestep. For the spectral vector of HSI data, we find that a single GRU, i.e., inputting the entire spectral vector as one timestep, can also make full use of the spectral information.
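These timestep configurations amount to simple reshapes of the spectral vector. The sketch below uses a hypothetical 200-band pixel; the band count and the grouping into 20 sub-sequences of 10 bands are illustrative only:

```python
import numpy as np

# A hyperspectral pixel: a spectrum of B reflectance values
# (B = 200 bands here, purely illustrative).
B = 200
spectrum = np.random.rand(B).astype(np.float32)

# Band-per-timestep: an RNN with B recurrent steps, each fed one band.
per_band = spectrum.reshape(B, 1)      # (timesteps=200, features=1)

# Sub-sequences: group bands so the RNN takes fewer, wider steps,
# e.g. 20 steps of 10 consecutive bands each.
sub_seq = spectrum.reshape(20, 10)     # (timesteps=20, features=10)

# Single timestep: the whole spectrum enters one GRU step at once,
# which is the configuration this paper argues for.
one_step = spectrum.reshape(1, B)      # (timesteps=1, features=200)

print(per_band.shape, sub_seq.shape, one_step.shape)
```

The three views carry identical data; only the number of recurrent steps, and hence the computation, changes.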

Unlike an RNN consisting of multiple GRUs, a single GRU does not carry the self-recurrent behavior of an RNN. In terms of its internal structure, a GRU is essentially a fully connected layer with a gating mechanism: the update gate and reset gate transform and select the inputs. For the lengthy spectral vector in HSI data, this structure is tiny yet effective at extracting discriminative spectral features. In addition, we use the other input of the GRU, its initial hidden state, to capture neighboring spatial information. The overall structure is shown in the right part of Fig. 1. The contributions of this work are summarized as follows:

  • We develop a tiny yet effective model for HSI spectral-spatial classification that consists of only a single GRU. Our model achieves competitive performance while fully exploiting spectral and spatial features.

  • We propose a tiny structure to extract spectral features. Treating the hyperspectral spectral vector as a 1D sequence, we utilize an RNN to extract spectral features. Instead of inputting each band as one timestep to an RNN formed of multiple GRUs, and considering the high correlations between the reflected values of neighboring bands and the integrated spectral profile, we input the whole spectrum as a single timestep into an RNN formed of a single GRU. This greatly reduces the computational burden of the entire network.

  • We design a novel way to fuse spatial information. Since the initial state of an RNN can be a trainable variable, we feed spatial-neighborhood features as the trainable initial state of the GRU. In this way, spectral and spatial information are computed and combined within a single GRU. The experimental results show that the proposed tiny network is indeed effective.
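As a hedged sketch of the idea, not the authors' exact implementation, a single GRU step in NumPy can take the full spectrum as its one-timestep input and a pooled spatial-neighborhood feature as the initial hidden state h0. The dimensions, random weights, and omission of bias terms are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h0, params):
    """One GRU step: x is the whole spectral vector (a single timestep);
    h0 is the initial hidden state carrying spatial-neighborhood features."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h0)              # update gate
    r = sigmoid(Wr @ x + Ur @ h0)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h0))  # candidate state
    return (1.0 - z) * h0 + z * h_tilde        # fused spectral-spatial feature

rng = np.random.default_rng(0)
bands, hidden = 103, 64                  # illustrative sizes only
spectrum = rng.standard_normal(bands)    # spectral input, one timestep
spatial = rng.standard_normal(hidden)    # pooled neighbor feature used as h0
# W* act on the input, U* on the hidden state (biases omitted for brevity).
params = [rng.standard_normal((hidden, bands)) * 0.1 if i % 2 == 0
          else rng.standard_normal((hidden, hidden)) * 0.1
          for i in range(6)]
h = gru_step(spectrum, spatial, params)
print(h.shape)  # (64,)
```

Because the spectrum enters as one timestep, the gates are evaluated once rather than once per band, which is the source of the efficiency gain the paper claims.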


Preliminaries

RNNs have received extensive attention for modeling sequence data. Unlike feed-forward neural networks, an RNN is called recurrent because of its recurrent hidden state, whose activation at each step depends on the previous computations. An RNN thus has a memory function: it can retain information about what has been computed so far.

The most commonly used RNN variants are the LSTM and GRU architectures, which are explicitly designed to mitigate vanishing gradients and to efficiently capture long-term dependencies.
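For reference, a standard GRU cell (in the common Cho et al. formulation; conventions for the roles of $z_t$ and $1-z_t$ vary across texts) computes, at step $t$ with input $x_t$ and previous hidden state $h_{t-1}$:

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\!\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

In the single-GRU setting of this paper, $t$ runs over exactly one step, $x_1$ is the whole spectrum, and $h_0$ carries the spatial information.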

Extracting spectral feature

The key advantage of HSIs lies in their spectral data, which contain rich object features, and numerous hyperspectral imagery studies focus on extracting information from the spectral domain. In the hyperspectral data cube, the spectral data consist of reflected values from hundreds of narrow, continuous spectral bands, forming one-dimensional ordered sequences.

For a hyperspectral pixel z, the kth spectral band is denoted as zk, and x(k) represents the input at the kth timestep of the RNN. The kth GRU cell in the RNN

Data description

We choose three publicly available HSI classification datasets to evaluate the performance of the proposed model: the Pavia Center, Pavia University, and Indian Pines datasets.

The Pavia Center dataset was gathered by the Reflective Optics System Imaging Spectrometer (ROSIS). This dataset includes 102 spectral bands after removing 13 noisy channels, covers 1096 × 715 pixels, and presents 9 classes
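To make the data layout concrete, the sketch below mimics the Pavia Center cube with random values at a reduced spatial size (the real scene is 1096 × 715 pixels over 102 bands). The pixel coordinates and the 3 × 3 mean pooling are illustrative choices, not the paper's stated preprocessing:

```python
import numpy as np

# Stand-in for the Pavia Center cube: the real scene is 1096 x 715 pixels
# with 102 spectral bands; a 64 x 64 random cube keeps this sketch light.
H, W, B = 64, 64, 102
cube = np.random.rand(H, W, B).astype(np.float32)

# One pixel's spectrum: the single-timestep spectral input.
spectrum = cube[30, 30]          # shape (102,)

# Its 3x3 spatial neighborhood, mean-pooled into one vector that could
# seed the GRU's initial hidden state (after projection to the hidden size).
patch = cube[29:32, 29:32]       # shape (3, 3, 102)
spatial = patch.reshape(-1, B).mean(axis=0)

print(spectrum.shape, spatial.shape)
```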

Conclusion

In this study, a tiny yet effective model based on a single GRU is proposed to extract spectral-spatial features for hyperspectral image classification. We exploit the property of the GRU that its initial state can be a trainable variable. Based on the similarity of neighboring pixels in the spatial domain, we learn spatial contextual features by supplying spatial-neighborhood information as the trainable initial state. Numerous inner spectral correlations in the continuous spectrum

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was funded by the National Natural Science Foundation of China under Grant Nos. 61805181, 61705170, 61903279 and 61605146.

Erting Pan received the B.S. degree in electrical engineering and its automation from the Northeast Normal University, Changchun, China, in 2018. She is currently pursuing the M.S. degree with the Electronic Information School, Wuhan University, Wuhan. Her current research interests include hyperspectral imagery and deep learning.

References (41)

  • B. Ayerdi et al., Spatially regularized semisupervised ensembles of extreme learning machines for hyperspectral image segmentation, Neurocomputing, 2015.
  • J.M. Bioucas-Dias et al., Hyperspectral remote sensing data analysis and future challenges, IEEE Geosci. Remote Sens. Mag., 2013.
  • L. Ma et al., Centroid and covariance alignment-based domain adaptation for unsupervised classification of remote sensing images, IEEE Trans. Geosci. Remote Sens., 2019.
  • L. Zhang et al., Deep learning for remote sensing data: a technical tutorial on the state of the art, IEEE Geosci. Remote Sens. Mag., 2016.
  • H. Lyu et al., Learning a transferable change rule from a recurrent neural network for land cover change detection, Remote Sens., 2016.
  • J. Ham et al., Investigation of the random forest framework for classification of hyperspectral data, IEEE Trans. Geosci. Remote Sens., 2005.
  • B. Ayerdi et al., Hyperspectral image analysis by spectral–spatial processing and anticipative hybrid extreme rotation forest classification, IEEE Trans. Geosci. Remote Sens., 2015.
  • F. Melgani et al., Classification of hyperspectral remote sensing images with support vector machines, IEEE Trans. Geosci. Remote Sens., 2004.
  • G. Camps-Valls et al., Kernel-based methods for hyperspectral image classification, IEEE Trans. Geosci. Remote Sens., 2005.
  • Y. Tarabalka et al., Multiple spectral–spatial classification approach for hyperspectral data, IEEE Trans. Geosci. Remote Sens., 2010.


Xiaoguang Mei received the B.S. degree in communication engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2007, the M.S. degree in communications and information systems from Huazhong Normal University, Wuhan, in 2011, and the Ph.D. degree in circuits and systems from HUST in 2016. From 2010 to 2012, he was a Software Engineer with the 722 Research Institute, China Shipbuilding Industry Corporation, Wuhan. From 2016 to 2019, he was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University, Wuhan. He is currently an Assistant Professor with the Electronic Information School, Wuhan University. His research interests include hyperspectral imagery, machine learning, and pattern recognition.

Quande Wang received the B.S. degree in physics and the M.S. degree in circuits and systems from Central China Normal University (CCNU), Wuhan, China, in 1995 and 2000, respectively, and the Ph.D. degree in control science and engineering from Wuhan University in 2004. From 2005 to 2007, he was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University, Wuhan. Between 2014 and 2015, he was a Visiting Scholar at the University of Texas at Austin, Austin, USA. He is currently an Associate Professor with the Electronic Information School, Wuhan University. His research interests include computer vision, machine learning, and pattern recognition.

Yong Ma graduated from the Department of Automatic Control, Beijing Institute of Technology, Beijing, China, in 1997. He received the Ph.D. degree from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2003. His general field of research is signals and systems. His current research projects include Lidar and infrared remote sensing, as well as infrared image processing, pattern recognition, and interface circuits to sensors and actuators. Between 2004 and 2006, he was a Lecturer at the University of the West of England, Bristol, U.K. Between 2006 and 2014, he was with the Wuhan National Laboratory for Optoelectronics, HUST, Wuhan, where he was a Professor of electronics. He is now a Professor with the Electronic Information School, Wuhan University.

Jiayi Ma received the B.S. degree in information and computing science and the Ph.D. degree in control science and engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2008 and 2014, respectively. From 2012 to 2013, he was an Exchange Student with the Department of Statistics, University of California at Los Angeles, Los Angeles, CA, USA. He was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University, from August 2014 to November 2015, and received accelerated promotions to Associate Professor and Full Professor in December 2015 and December 2018, respectively. He has authored or coauthored more than 120 refereed journal and conference papers, including IEEE TPAMI/TIP/TSP/TNNLS/TIE/TGRS/TCYB/TMM/TCSVT, IJCV, CVPR, ICCV, IJCAI, AAAI, ICRA, IROS, ACM MM, etc. His research interests include computer vision, machine learning, and pattern recognition. Dr. Ma was identified in the 2019 Highly Cited Researchers list from the Web of Science Group. He was a recipient of the Natural Science Award of Hubei Province (first class), the CAAI (Chinese Association for Artificial Intelligence) Excellent Doctoral Dissertation Award (a total of eight winners in China), and the CAA (Chinese Association of Automation) Excellent Doctoral Dissertation Award (a total of ten winners in China). He is an Editorial Board Member of Information Fusion and Neurocomputing, and a Guest Editor of Remote Sensing.
