Spectral-spatial classification for hyperspectral image based on a single GRU
Introduction
Modern hyperspectral sensors can capture data at high spectral resolution, with up to hundreds of bands, allowing very similar materials and objects to be distinguished. This rich spectral information offers great potential for classification [1], [2], [3], [4]. Hence, the analysis of hyperspectral imagery has attracted broad attention in remote sensing. Hyperspectral images (HSIs) contain abundant spectral and spatial information and have been widely applied in many fields such as agriculture, mining, environmental monitoring, and land-cover mapping [5], [6], [7], [8], [9].
HSI classification aims to assign each pixel vector to one of a discrete set of classes. Many traditional approaches have concentrated on processing spectral features. Some of them rely exclusively on distinguishing subtle spectral differences to determine class membership, such as random forests [10], [11], support vector machines (SVMs) [12], [13], and sparse representation models [14], [15], [16], [17]. However, these methods depend on hand-crafted features, which limits their ability to extract robust deep feature representations.
Unlike traditional classifiers, deep learning methods exploit high-level features and can acquire more complex structural representations [18], [19], [20], [21]. In particular, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have achieved great success in a variety of computer vision tasks [22], [23], [24]. Exploiting the CNN's local connectivity and weight sharing, Wu et al. [25] directly used the spectral feature of the original image data as an input vector and applied a 1D-CNN for spectral HSI classification. Due to kernel size limitations, however, a 1D-CNN can only learn local spectral dependencies. A few works treat the spectral data as a sequence [26], [27], [28], [29], [30]; naturally, the RNN becomes a candidate model thanks to its strength in processing sequence data. Mou et al. [26] modeled the spectrum of a hyperspectral pixel as a 1D sequence for classification and showed that the GRU is a better choice for HSI classification than the long short-term memory (LSTM) cell. The spectrum was fed into an RNN composed of multiple GRUs: each band was expanded and delivered to the corresponding GRU, so the number of GRUs in the network equals the number of bands of the hyperspectral data. This approach achieved competitive performance and revealed the great potential of deep recurrent networks for hyperspectral data analysis. Nevertheless, it used only spectral information, and the network of hundreds of GRUs incurs a heavy computational cost.
Many researchers have also developed approaches that consider spatial information [31], [32], [33], [34], [35], [36], [37]. For instance, Chen et al. [35] proposed a 3D-CNN network that directly learns spectral-spatial features over both the spatial and spectral axes, but its computational complexity increases dramatically. In addition, combinations of CNNs and RNNs have been developed for hyperspectral image analysis. Xu et al. [36] proposed a unified network with a band-grouping-based LSTM and an MSCNN as the spectral and spatial feature extractors. Mei et al. [37] proposed concatenating a spatial-attention CNN branch and a spectral-attention bidirectional RNN branch to learn joint features.
So far, the deep learning methods mentioned above have yielded good results. Most of them, however, combine complex models, which leads to heavy computational burdens; the efficiency of these models is easily overlooked. Accordingly, aiming at computational efficiency, we design a simple yet effective method for hyperspectral pixel-level classification.
Typically, as shown in the RNN unfolding structure (see the left part of Fig. 1), the number of internal units is tied to the number of timesteps, and the recurrent connection links the hidden units of consecutive timesteps. Inspired by the RNN and its variants [28], [36], [37], [38], the RNN component corresponding to each timestep (such as a GRU or LSTM unit) can accept not only a single value but also a sub-sequence as input. We therefore design multiple comparison experiments that feed sub-sequences of different lengths per timestep. For the spectral vector of HSI data, we find that a single GRU, i.e., inputting the entire spectral vector as one timestep, can also make full use of the spectral information.
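The timestep layouts compared in these experiments can be sketched with a simple reshape; the band count (102, matching the Pavia Center dataset) and the sub-sequence length of 6 are illustrative choices, not values fixed by the paper:

```python
import numpy as np

# Illustrative spectral vector for one hyperspectral pixel; 102 bands
# matches the Pavia Center dataset, values are random placeholders.
bands = 102
pixel = np.random.rand(bands).astype(np.float32)

def to_timesteps(spectrum, subseq_len):
    """Reshape a 1D spectrum into (timesteps, subseq_len) for an RNN."""
    assert spectrum.size % subseq_len == 0, "sub-sequence length must divide the band count"
    return spectrum.reshape(-1, subseq_len)

# Band-per-timestep layout as in Mou et al. [26]: 102 GRU steps of length 1.
per_band = to_timesteps(pixel, 1)      # shape (102, 1)

# Intermediate layout: 17 timesteps, each a sub-sequence of 6 bands.
grouped = to_timesteps(pixel, 6)       # shape (17, 6)

# Single-GRU layout studied here: the whole spectrum as one timestep.
single = to_timesteps(pixel, bands)    # shape (1, 102)
```

The single-timestep layout collapses the recurrence to one GRU evaluation per pixel, which is where the computational saving comes from.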
Unlike an RNN consisting of multiple GRUs, a single GRU does not exercise the self-recurrence of an RNN. Viewed from its internal structure, a GRU is essentially a fully connected layer with a gate mechanism: the update gate and reset gate transform and select the inputs. For the lengthy spectral vector in HSI data, this structure is tiny yet effective at extracting discriminative spectral features. In addition, we use the GRU's other input, the initial hidden state, to capture neighboring spatial information. The overall structure is shown in the right part of Fig. 1. The contributions of this work can be summarized as follows:
- We develop a tiny, effective model for HSI spectral-spatial classification, which consists of only a single GRU. Our model achieves competitive performance while fully exploiting spectral and spatial features.
- We propose a tiny structure to extract spectral features. Treating the hyperspectral spectral vector as a 1D sequence, we use an RNN to extract spectral features. Instead of inputting each band as one timestep to an RNN formed of multiple GRUs, and considering the high correlations between the reflected values of neighboring bands as well as the integrated spectral profile, we input the whole spectrum as one timestep into an RNN formed of a single GRU. This greatly reduces the computational burden of the entire network.
- We design a novel way to fuse spatial information. Since the initial state of an RNN can be a trainable variable, we feed spatial neighboring features as the initial state of the GRU for training. In this way, spectral and spatial information are computed and fused within a single GRU. Experimental results show that the proposed tiny network is indeed effective.
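The two ideas above can be sketched together in a minimal NumPy forward pass: one GRU step whose single-timestep input is the whole spectrum and whose initial state carries a spatial neighborhood feature. The weights, the hidden size of 64, the mean-of-window spatial feature, and the projection matrix `P` are all illustrative assumptions, not the paper's learned parameters (biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

bands, hidden = 102, 64

# Hypothetical random weights; in the actual model these are learned.
def init(shape):
    return rng.standard_normal(shape) * 0.1

Wz, Wr, Wh = init((hidden, bands)), init((hidden, bands)), init((hidden, bands))
Uz, Ur, Uh = init((hidden, hidden)), init((hidden, hidden)), init((hidden, hidden))

def gru_step(x, h):
    """One GRU step: gates transform and select the input and prior state."""
    z = sigmoid(Wz @ x + Uz @ h)          # update gate
    r = sigmoid(Wr @ x + Ur @ h)          # reset gate
    n = np.tanh(Wh @ x + Uh @ (r * h))    # candidate state
    return (1.0 - z) * n + z * h          # new hidden state

# The whole spectral vector of one pixel, fed as a single timestep.
spectrum = rng.random(bands)

# Spatial feature of the pixel's neighborhood (here, a placeholder for the
# mean spectrum of a window around it), projected to the hidden size by an
# illustrative matrix P; in the model this vector serves as the trainable
# initial state of the GRU.
neighbour_mean = rng.random(bands)
P = init((hidden, bands))
h0 = np.tanh(P @ neighbour_mean)

# One pass through the single GRU fuses spectral and spatial information.
feature = gru_step(spectrum, h0)          # shape (hidden,)
```

A classifier head (e.g. a softmax layer) would then map `feature` to class scores; the point of the sketch is that the entire spectral-spatial fusion costs one GRU evaluation.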
Section snippets
Preliminaries
RNNs have received extensive attention in modeling sequence data. Unlike feed-forward neural networks, an RNN is called recurrent because of its recurrent hidden state, whose activation at each step depends on previous computations. An RNN has a memory function: it can remember information about what has been computed so far.
The most commonly used types of RNN are the LSTM and GRU architectures, which are explicitly designed to deal with vanishing gradients and to efficiently capture long-term dependencies.
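For reference, the GRU's gate mechanism can be written in its common formulation (here \(\sigma\) is the logistic sigmoid, \(\odot\) the elementwise product, and the state update follows the standard convention):

```latex
\begin{aligned}
z_t &= \sigma\bigl(W_z x_t + U_z h_{t-1} + b_z\bigr) && \text{(update gate)}\\
r_t &= \sigma\bigl(W_r x_t + U_r h_{t-1} + b_r\bigr) && \text{(reset gate)}\\
\tilde{h}_t &= \tanh\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) && \text{(candidate state)}\\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```

The update gate \(z_t\) interpolates between keeping the previous state and adopting the candidate, which is what lets gradients flow over long spans.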
Extracting spectral feature
The key advantage of HSIs is their spectral data, which contain rich object features, and numerous hyperspectral imagery studies focus on extracting information from spectral data. In the hyperspectral data cube, the spectral data consist of reflected values from hundreds of narrow, contiguous spectral bands, forming one-dimensional ordered sequences.
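This cube layout can be made concrete with a small indexing example; the spatial size is a placeholder, with 102 bands again borrowed from Pavia Center:

```python
import numpy as np

# Illustrative hyperspectral data cube: rows x cols x bands.
cube = np.random.rand(10, 12, 102).astype(np.float32)

# The spectrum of the pixel at (row, col) is a 1D ordered sequence of
# reflected values over the contiguous bands.
row, col = 4, 7
spectrum = cube[row, col, :]   # one spectral vector, length 102
```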
For a hyperspectral pixel z, the kth spectral band is denoted as z_k, and x(k) represents the input of the kth timestep in the RNN. The kth GRU cell in the RNN
Data description
We choose three publicly available HSI classification datasets to evaluate the performance of the proposed model: the Pavia Center dataset, the Pavia University dataset, and the Indian Pines dataset.1
The Pavia Center dataset was gathered by the Reflective Optics System Imaging Spectrometer (ROSIS). This dataset includes 102 spectral bands (after removing 13 noisy channels) over 1096 × 715 pixels, and it presents 9 classes
Conclusion
In this study, a tiny, effective model based on a single GRU is proposed to extract spectral-spatial features for hyperspectral image classification. We exploit a useful property of the GRU: its initial state can be a trainable variable. Based on the similarity of neighboring pixels in the spatial domain, we learn spatial contextual features by feeding spatial neighbor information as the trainable initial state. Numerous inner spectral correlations in the continuous spectrum
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This research was funded by the National Natural Science Foundation of China under Grant Nos. 61805181, 61705170, 61903279 and 61605146.
Erting Pan received the B.S. degree in electrical engineering and its automation from the Northeast Normal University, Changchun, China, in 2018. She is currently pursuing the M.S. degree with the Electronic Information School, Wuhan University, Wuhan. Her current research interests include hyperspectral imagery and deep learning.
References (41)
- et al., Hyperspectral image denoising with superpixel segmentation and low-rank representation, Inf. Sci. (2017)
- et al., Robust GBM hyperspectral image unmixing with superpixel segmentation based low rank and sparse representation, Neurocomputing (2018)
- et al., Heathland conservation status mapping through integration of hyperspectral mixture analysis and decision tree classifiers, Remote Sens. Environ. (2012)
- et al., Airborne hyperspectral remote sensing to assess spatial distribution of water quality characteristics in large rivers: the Mississippi River and its tributaries in Minnesota, Remote Sens. Environ. (2013)
- et al., Mutually exclusive-KSVD: learning a discriminative dictionary for hyperspectral image classification, Neurocomputing (2018)
- et al., FusionGAN: a generative adversarial network for infrared and visible image fusion, Inf. Fusion (2019)
- et al., Multiple convolutional layers fusion framework for hyperspectral image classification, Neurocomputing (2019)
- et al., Convolutional neural networks for hyperspectral image classification, Neurocomputing (2017)
- et al., Spectral–spatial classification of hyperspectral imagery with 3D convolutional neural network, Remote Sens. (2017)
- et al., Hyperspectral image classification using spectral-spatial LSTMs, Neurocomputing (2019)
- Spatially regularized semisupervised ensembles of extreme learning machines for hyperspectral image segmentation, Neurocomputing
- Hyperspectral remote sensing data analysis and future challenges, IEEE Geosci. Remote Sens. Mag.
- Centroid and covariance alignment-based domain adaptation for unsupervised classification of remote sensing images, IEEE Trans. Geosci. Remote Sens.
- Deep learning for remote sensing data: a technical tutorial on the state of the art, IEEE Geosci. Remote Sens. Mag.
- Learning a transferable change rule from a recurrent neural network for land cover change detection, Remote Sens.
- Investigation of the random forest framework for classification of hyperspectral data, IEEE Trans. Geosci. Remote Sens.
- Hyperspectral image analysis by spectral–spatial processing and anticipative hybrid extreme rotation forest classification, IEEE Trans. Geosci. Remote Sens.
- Classification of hyperspectral remote sensing images with support vector machines, IEEE Trans. Geosci. Remote Sens.
- Kernel-based methods for hyperspectral image classification, IEEE Trans. Geosci. Remote Sens.
- Multiple spectral–spatial classification approach for hyperspectral data, IEEE Trans. Geosci. Remote Sens.
Xiaoguang Mei received the B.S. degree in communication engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2007, the M.S. degree in communications and information systems from Huazhong Normal University, Wuhan, in 2011, and the Ph.D. degree in circuits and systems from the HUST, in 2016. From 2010 to 2012, he was a Software Engineer with the 722 Research Institute, China Shipbuilding Industry Corporation, Wuhan. From 2016 to 2019, he was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University, Wuhan. He is currently an assistant professor with the Electronic Information School, Wuhan University. His research interests include hyperspectral imagery, machine learning, and pattern recognition.
Quande Wang received the B.S. degree in physics and the M.S. degree in circuits and systems from Central China Normal University (CCNU), Wuhan, China, in 1995 and 2000, respectively, and the Ph.D. degree in control science and engineering from Wuhan University, in 2004. From 2005 to 2007, he was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University, Wuhan. Between 2014 and 2015, he was a Visiting Scholar at the University of Texas at Austin, Austin, USA. He is currently an associate professor with the Electronic Information School, Wuhan University. His research interests include computer vision, machine learning, and pattern recognition.
Yong Ma graduated from the Department of Automatic Control, Beijing Institute of Technology, Beijing, China, in 1997. He received the Ph.D. degree from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2003. His general field of research is signals and systems. His current research projects include Lidar and infrared remote sensing, as well as infrared image processing, pattern recognition, and interface circuits for sensors and actuators. Between 2004 and 2006, he was a Lecturer at the University of the West of England, Bristol, U.K. Between 2006 and 2014, he was with the Wuhan National Laboratory for Optoelectronics, HUST, Wuhan, where he was a Professor of electronics. He is now a Professor with the Electronic Information School, Wuhan University.
Jiayi Ma received the B.S. degree in information and computing science and the Ph.D. degree in control science and engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2008 and 2014, respectively. From 2012 to 2013, he was an Exchange Student with the Department of Statistics, University of California at Los Angeles, Los Angeles, CA, USA. He was a Post-Doctoral Fellow with the Electronic Information School, Wuhan University from August 2014 to November 2015, and received accelerated promotions to Associate Professor and Full Professor in December 2015 and December 2018, respectively. He has authored or coauthored more than 120 refereed journal and conference papers, including IEEE TPAMI/TIP/TSP/TNNLS/TIE/TGRS/TCYB/TMM/TCSVT, IJCV, CVPR, ICCV, IJCAI, AAAI, ICRA, IROS, ACM MM, etc. His research interests include computer vision, machine learning, and pattern recognition. Dr. Ma was identified in the 2019 Highly Cited Researchers list from the Web of Science Group. He was a recipient of the Natural Science Award of Hubei Province (first class), the CAAI (Chinese Association for Artificial Intelligence) Excellent Doctoral Dissertation Award (a total of eight winners in China), and the CAA (Chinese Association of Automation) Excellent Doctoral Dissertation Award (a total of ten winners in China). He is an Editorial Board Member of Information Fusion and Neurocomputing, and a Guest Editor of Remote Sensing.