Elsevier

Neurocomputing

Volume 275, 31 January 2018, Pages 2512-2524

Collaborative learning for hyperspectral image classification

https://doi.org/10.1016/j.neucom.2017.11.035

Abstract

Recently, collaborative learning (CL) has been introduced to combine active learning (AL) with semi-supervised learning (SSL) and thereby address the problem of limited training samples. In this paper, we propose a novel CL framework for hyperspectral image classification, in which AL and SSL are collaboratively integrated using clustering (CLUC). CLUC attempts to obtain greater diversity and higher confidence in the additional training samples for both AL and SSL. Note that clustering methods, which have previously been used separately to enhance AL or SSL, are used in CLUC to integrate these two learning processes. First, all unlabeled samples are assigned to clusters. Based on the clustering result, a clustering-based query (CBQ) for both AL and SSL and a CBQ-based pseudo-labeling (CBQPL) for SSL are designed for CLUC. Second, the most uncertain and the secondary (second most) uncertain samples in each cluster are selected by CBQ for AL and SSL, respectively, to ensure their informativeness. Third, CBQPL assigns each selected secondary uncertain sample the same label as the most uncertain sample in the same cluster, which is manually labeled in AL. CBQPL thus makes the confidence of pseudo-labeling depend on the clustering results. We evaluate the performance of CLUC on three real hyperspectral images. The proposed method is tested under different numbers of labeled samples and compared with several approaches. The experimental results show that CLUC is superior in both classification maps and objective metrics when training samples are limited.

Introduction

A hyperspectral image (HSI) typically contains rich spectral-spatial information. HSI classification is one of the main challenges in the interpretation of remote sensing data and has been widely used in applications such as precision agriculture [1], target detection [2], urban planning [3], and land-cover identification [4], [5]. Given a set of observations (i.e., pixel vectors in a hyperspectral image), the goal of classification is to assign a unique label to each pixel vector so that it is well-defined by a given class [6]. Various supervised machine learning algorithms have been widely used in HSI classification, e.g., multinomial logistic regression (MLR) [7], support vector machines (SVMs) [8], artificial neural networks (ANN) [9], multiple feature learning [10], [11], and sparse representation [12], [13]. However, HSI classification often suffers from limited training samples, and manual labeling is time-consuming and expensive. Recently, several advanced machine learning techniques, namely active learning (AL) [14], semi-supervised learning (SSL) [15], and spectral-spatial classification [16], [17], have been widely developed to address this problem.

AL is an iterative process of human-machine interaction. It starts with a small training set and then iteratively selects new training samples from the unlabeled samples, which are subsequently labeled by a human expert. Research on AL focuses on the query strategy used to rank the most informative training samples for the current classifier [18], [19]. These strategies fall into three main types: committee-based [14], [20], [21], large margin-based [22], and posterior probability-based [23] query heuristics. In contrast to AL, SSL leverages the unlabeled data together with the initial training set to train the classifier. It attempts to exploit the structural information in the feature space to facilitate learning without additional human effort. SSL methods are grouped into the following categories: self-learning [24], [25], co-training [26], generative probabilistic models [27], semi-supervised SVMs [28], and graph-based SSL [29].

AL and SSL work through different mechanisms, but both aim to provide high classification accuracy while reducing the cost of human effort. Thus, it is reasonable to combine the AL and SSL learning processes. Recently, several methods combining AL and SSL have been developed and have demonstrated promising performance [24], [30], [31], [32], [33]. Very recently, AL and SSL have been integrated as a collaborative learning process, in which the selected unlabeled samples are collaboratively labeled by human experts and by machines/classifiers. In [34], collaborative active and semi-supervised learning (CASSL) acquires more confidently labeled samples by integrating AL and SSL. In [35], a random-walker-based verification strategy separates the unlabeled samples into low-confidence and high-confidence data sets, and manually labeled and pseudo-labeled samples are used together to train the final classifier.

These collaborative learning methods achieve promising classification performance with limited manually labeled samples. However, several aspects may still be improved. First, the precision of pseudo-labeling relies on the current classification results and on the strategy used to select the most confident samples. Second, in the SSL process, the samples that are selected and pseudo-labeled are the most confident ones, which provide largely redundant information to the classifier. Third, the manually assigned labels are applied only to the unlabeled samples selected in AL, whereas we wish to make fuller use of these human-provided labels.

We note that clustering has been extensively incorporated into AL or SSL separately to improve the performance of HSI classification [29], [31], [36], [37], [38], [39], [40], [41], [42]. Considering only image statistics, clustering algorithms automatically partition the image into homogeneous and coherent groups. In AL, the classifier is retrained with the new training set, and two or more samples are usually selected at each AL iteration. AL query strategies such as margin sampling (MS) [43], multiclass-level uncertainty (MCLU) [36], max entropy (ME) [15], breaking ties (BT) [44], and Kullback–Leibler divergence maximization (KL-Max) [45], [46] consider only the uncertainty of unlabeled samples, so the queried samples may be redundant with each other. Therefore, many studies have introduced clustering methods into AL to reduce this redundancy by seeking the greatest dissimilarity among the ranked samples [36], [37], [38], [39]. Unlike AL, SSL leverages the unlabeled data together with the initial labeled data to train the classifier, exploiting the structural information in feature space to facilitate learning without additional manual labeling. In [31], [40], clustering approaches are used to improve or construct the SSL learning process. In [29], [41], cluster/manifold regularization is included in graph-based SSL methods according to the assumption of labeling smoothness.
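As a rough illustration of two of the uncertainty heuristics named above, the following sketch scores class posteriors with breaking ties (the gap between the two largest posteriors) and max entropy. The pixel names and probability values are made up for illustration, and the functions are not taken from the authors' code.

```python
import math

def breaking_ties(probs):
    """Gap between the two largest class posteriors; a smaller gap means more uncertainty."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]

def entropy(probs):
    """Shannon entropy of the posterior (in nats); a larger value means more uncertainty."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical posteriors for three unlabeled pixels over three classes.
posteriors = {
    "pixel_a": [0.90, 0.05, 0.05],
    "pixel_b": [0.50, 0.40, 0.10],
    "pixel_c": [0.36, 0.34, 0.30],
}

# Rank from most to least uncertain under each heuristic.
bt_ranking = sorted(posteriors, key=lambda k: breaking_ties(posteriors[k]))
me_ranking = sorted(posteriors, key=lambda k: entropy(posteriors[k]), reverse=True)

print(bt_ranking)
print(me_ranking)
```

Both rankings depend only on each pixel's own posterior, which is why samples queried in the same batch can be near-duplicates of one another unless a diversity criterion such as clustering is added.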

In this paper, we propose a novel collaborative learning framework using clustering (CLUC), in which the clustering procedure acts as a bridge that integrates the AL and SSL learning processes. First, all the unlabeled pixels of an HSI are grouped into a set of clusters using a superpixel algorithm. Superpixel techniques have recently been developed extensively in computer vision [47], [48], [49], and superpixel methods have been combined with segmentation-based classifiers for HSI classification [50], [51], [52], [53], [54], [55]. Superpixel segmentation is essentially a clustering method that considers not only feature similarity but also spatial adjacency. After superpixel segmentation, an image is divided into non-overlapping regions/clusters. A clustering-based query (CBQ) heuristic is then designed by combining the breaking ties (BT) query strategy with the superpixel clustering result. Second, the most uncertain and the secondary (second most) uncertain samples in each superpixel are actively selected by the CBQ strategy for the AL and SSL learning processes, respectively, to ensure their informativeness for the classifier. Third, CBQ-based pseudo-labeling (CBQPL) assigns the secondary uncertain unlabeled samples (in SSL) the same label as the manually labeled sample (in AL) within the same superpixel. Both the manually labeled and the CBQPL-labeled samples are used to train the classifier at each iteration of CLUC.
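The selection and labeling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `cbq_select` and `cbqpl_label`, the toy posteriors, the superpixel ids, and the `oracle` callback standing in for the human annotator are all hypothetical, and the sketch queries every superpixel because this excerpt does not say how many are queried per iteration.

```python
from collections import defaultdict

def bt_gap(probs):
    """Breaking-ties score: gap between the two largest posteriors (smaller = more uncertain)."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]

def cbq_select(posteriors, cluster_of):
    """For each cluster, return (most uncertain id, secondary uncertain id or None)."""
    members = defaultdict(list)
    for sample_id, cluster_id in cluster_of.items():
        members[cluster_id].append(sample_id)
    picks = {}
    for cluster_id, sample_ids in members.items():
        ranked = sorted(sample_ids, key=lambda s: bt_gap(posteriors[s]))
        second = ranked[1] if len(ranked) > 1 else None
        picks[cluster_id] = (ranked[0], second)
    return picks

def cbqpl_label(picks, oracle):
    """Query the oracle for the most uncertain sample and copy its label to the secondary one."""
    manual, pseudo = {}, {}
    for most, second in picks.values():
        label = oracle(most)        # manual labeling (AL)
        manual[most] = label
        if second is not None:
            pseudo[second] = label  # pseudo-labeling within the same cluster (SSL)
    return manual, pseudo

# Toy data: five unlabeled pixels, two classes, two superpixels.
posteriors = {
    0: [0.55, 0.45],
    1: [0.80, 0.20],
    2: [0.60, 0.40],
    3: [0.30, 0.70],
    4: [0.48, 0.52],
}
cluster_of = {0: "sp0", 1: "sp0", 2: "sp0", 3: "sp1", 4: "sp1"}
true_labels = {0: "corn", 1: "corn", 2: "corn", 3: "grass", 4: "grass"}

picks = cbq_select(posteriors, cluster_of)
manual, pseudo = cbqpl_label(picks, oracle=lambda s: true_labels[s])
print(picks, manual, pseudo)
```

Note that the pseudo-label never consults the classifier's prediction for the secondary sample; it is correct exactly when the two samples in the superpixel share a class, which is why the quality of the clustering matters.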

An obvious difference between CLUC and existing CL frameworks is that it aims to find the most confident samples that have the same labels as the manually labeled samples in the CL learning process, instead of pseudo-labeling the unlabeled samples using the current classification result. This brings several advantages. First, the CBQ strategy selects the most uncertain samples for both AL and SSL, which can provide more information than the aforementioned CL frameworks. Second, the CBQPL strategy makes the accuracy of pseudo-labeling depend on the effectiveness of the clustering method rather than on the performance of the classifier. In this paper, the use of an effective superpixel algorithm ensures a reliable CBQPL labeling process. Additionally, because it is based on superpixel segmentation, the CBQ heuristic considers the diversity of the active sampling process in both the feature and spatial domains. Therefore, CBQ promotes more uncertainty among the queried samples in both the AL and SSL learning processes. Moreover, the CBQ strategy within each cluster is still a form of biased sampling, which partly promotes the accuracy of CBQPL. Multinomial logistic regression (MLR) is used to model the posterior probability distributions at each CLUC iteration. The logistic regression via variable splitting and augmented Lagrangian (LORSAL) [56] algorithm, followed by a multi-level logistic (MLL) [15] regularization, i.e., the LMLL method, is adopted as the basic classifier in this paper. The performance of CLUC is tested on three real hyperspectral images. Experimental results demonstrate that the proposed CLUC framework can achieve competitive classification performance with limited manually labeled training samples.
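For readers unfamiliar with MLR, the posterior it models is a softmax over per-class linear scores. The sketch below shows only that posterior form; the weights and feature vector are invented for illustration, and neither the LORSAL solver nor the MLL regularization is implemented here.

```python
import math

def mlr_posterior(weights, features):
    """Softmax over linear class scores: p(k | x) is proportional to exp(w_k . x)."""
    scores = [sum(w * x for w, x in zip(w_k, features)) for w_k in weights]
    shift = max(scores)  # subtract the maximum score for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weights for three classes and a two-dimensional feature vector.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
features = [2.0, 1.0]

probs = mlr_posterior(weights, features)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

Posteriors of this kind are what an uncertainty heuristic such as breaking ties would rank when choosing samples to query.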

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents a detailed description of the proposed CLUC method. The experimental results of CLUC are given in Section 4. Section 5 concludes this paper and suggests some future work.

Section snippets

Related work

This section briefly reviews existing research on the application of clustering to AL and SSL, and on the combination of AL and SSL for HSI classification. For more details on the analysis of AL and SSL (and a comparison between the two), readers may refer to [57], [58].

The proposed CLUC method

In this work, the proposed CLUC method integrates the AL and SSL learning processes using clustering, which attempts to obtain greater diversity and higher confidence in the additional training samples for both AL and SSL. Superpixel segmentation is used to group the unlabeled data into clusters. Based on the clustering results and the BT heuristic criterion, the CBQ heuristic is used to query the most informative samples for both AL and SSL, and the CBQPL strategy is used for SSL to find the most confident samples that

Experimental results

In this section, we evaluate the performance of the proposed CLUC method using three real hyperspectral image datasets: the AVIRIS Indian Pines image, the AVIRIS Salinas image, and the University of Pavia image acquired by the ROSIS sensor. Using the SPBT and CBQPL and LMLL [39] as active query heuristic and basic classifier, respectively, we term the proposed method as CLUC-SPBT-LMLL in this section. The experiment focuses on verifying the effectiveness of the proposed CLUC algorithm with

Conclusions

In this paper, we present a novel CLUC framework for hyperspectral image classification, which integrates AL and SSL collaboratively using clustering. The proposed CLUC method aims to find the most confident samples that have the same labels as the manually labeled samples in the collaborative learning process, instead of pseudo-labeling the unlabeled samples using the current classification result. Moreover, the CBQ query heuristic promotes more sampling diversity in AL and SSL query

Acknowledgments

The authors would like to thank Dr. M.-Y. Liu and Dr. J. Li for providing the source codes of superpixel and HSI classification, respectively, on their websites (http://www.mingyuliu.net/ and http://www.lx.it.pt/~jun/). This work was supported in part by the National Natural Science Foundation of China under Grants 61432014, U1605252, 61772402 and 61671339, in part by the National Key Research and Development Program of China under Grant 2016QY01W0200, in part by Key Industrial Innovation Chain

Chao Pan received the B.Sc. degree in software engineering and the M.Sc. degree in computer application technology from Xidian University, Xi'an, China, in 2009 and 2012, respectively. He is currently working toward the Ph.D. degree in intelligent information processing with the School of Electronic Engineering, Xidian University. His research interests include hyperspectral image analysis, image segmentation, and signal processing.

References (71)

  • Y. Tarabalka et al. Segmentation and classification of hyperspectral images using watershed transformation. Pattern Recognit. (2010)
  • W. Heldens et al. Can the future EnMAP mission contribute to urban applications? A literature survey. Remote Sens. (2011)
  • M.L. Clark et al. Species-level differences in hyperspectral metrics among tropical rainforest trees as determined by a tree-based classifier. Remote Sens. (2012)
  • J.M. Bioucas-Dias et al. Hyperspectral remote sensing data analysis and future challenges. IEEE Geosci. Remote Sens. Mag. (2013)
  • J. Li et al. Semisupervised hyperspectral image classification using soft sparse multinomial logistic regression. IEEE Geosci. Remote Sens. Lett. (2013)
  • F. Melgani et al. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. (2014)
  • Y. Chen et al. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. (2016)
  • J. Li et al. Multiple feature learning for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. (2015)
  • C. Zhao et al. Efficient multiple-feature learning-based hyperspectral image classification with limited training samples. IEEE Trans. Geosci. Remote Sens. (2016)
  • Y. Chen et al. Hyperspectral image classification using dictionary-based sparse representation. IEEE Trans. Geosci. Remote Sens. (2011)
  • E. Zhang et al. Weighted multifeature hyperspectral image classification via kernel joint sparse representation. Neurocomputing (2015)
  • D. Tuia et al. Active learning methods for remote sensing image classification. IEEE Trans. Geosci. Remote Sens. (2009)
  • J. Li et al. Semi-supervised hyperspectral image segmentation using multinomial logistic regression with active learning. IEEE Trans. Geosci. Remote Sens. (2010)
  • J. Li et al. Spectral-spatial hyperspectral image segmentation using subspace multinomial logistic regression and Markov random fields. IEEE Trans. Geosci. Remote Sens. (2012)
  • D. Tuia et al. A survey of active learning algorithms for supervised remote sensing image classification. IEEE J. Sel. Topics Signal Process. (2011)
  • M.M. Crawford et al. Active learning: any value for classification of remotely sensed data? Proc. IEEE (2013)
  • W. Di et al. Active learning via multi-view and local proximity co-regularization for hyperspectral image classification. IEEE J. Sel. Topics Signal Process. (2011)
  • W. Di et al. View generation for multiview maximum disagreement based active learning for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. (2012)
  • I. Dopido et al. Semisupervised self-learning for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. (2013)
  • X. Lu et al. Incorporating diversity into self-learning for synergetic classification of hyperspectral and panchromatic images. Remote Sens. (2016)
  • A. Blum et al. Combining labeled and unlabeled data with co-training.
  • Q. Jackson et al. An adaptive method for combined covariance estimation and classification. IEEE Trans. Geosci. Remote Sens. (2002)
  • M. Chi et al. Semisupervised classification of hyperspectral images by SVMs optimized in the primal. IEEE Trans. Geosci. Remote Sens. (2007)
  • L. Gomez-Chova et al. Semisupervised image classification with Laplacian support vector machines. IEEE Geosci. Remote Sens. Lett. (2008)
  • D. Tuia et al. Large scale semi-supervised image segmentation with active queries.

    Jie Li received the B.Sc. degree in electronic engineering, the M.Sc. degree in signal and information processing, and the Ph.D. degree in circuit and systems from Xidian University, Xi'an, China, in 1995, 1998, and 2004, respectively. She is currently a Professor with the School of Electronic Engineering, Xidian University. She is the author of around 50 technical articles in refereed journals and proceedings, including the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, and Information Sciences. Her research interests include image processing and machine learning.

    Ying Wang received the B.Sc., M.Sc., and Doctoral degrees in signal and information processing from Xidian University, Xi'an, China, in 2003, 2006, and 2010, respectively. She is currently an Associate Professor of signal and information processing with Xidian University. Her research interests include medical image analysis, pattern recognition, and computer-aided diagnosis.

    Xinbo Gao (M'02-SM'07) received the B.Eng., M.Sc., and Ph.D. degrees in signal and information processing from Xidian University, Xi'an, China, in 1994, 1997, and 1999, respectively. From 1997 to 1998, he was a Research Fellow at the Department of Computer Science, Shizuoka University, Shizuoka, Japan. From 2000 to 2001, he was a Post-doctoral Research Fellow at the Department of Information Engineering, the Chinese University of Hong Kong, Hong Kong. Since 2001, he has been at the School of Electronic Engineering, Xidian University. He is currently a Cheung Kong Professor of Ministry of Education, a Professor of Pattern Recognition and Intelligent System, and the Director of the State Key Laboratory of Integrated Services Networks, Xi'an, China. His current research interests include multimedia analysis, computer vision, pattern recognition, machine learning, and wireless communications. He has published six books and around 200 technical articles in refereed journals and proceedings. Prof. Gao is on the Editorial Boards of several journals, including Signal Processing (Elsevier) and Neurocomputing (Elsevier). He served as the General Chair/Co-Chair, Program Committee Chair/Co-Chair, or PC Member for around 30 major international conferences. He is a Fellow of the Institute of Engineering and Technology and a Fellow of the Chinese Institute of Electronics.