A batch-mode active learning framework by querying discriminative and representative samples for hyperspectral image classification☆
Introduction
Machine learning algorithms have become powerful tools for extracting information from data in fields such as data mining, pattern recognition, computer vision, and remote sensing [1], [2], [3], [4], [5], [6], [7], [54], and advances in remote sensing technology have made hyperspectral data with hundreds of narrow contiguous bands available. Hyperspectral image (HSI) processing with machine learning methods has been widely studied in the past decade [8], [9], [10], [11], [12], [13], [14]. HSI classification is one of the important tasks used to extract environmental information from remote sensing images and has been an active field in current HSI processing [15], [16], [17], [18], [19], [20]. To fully utilize the information in remote sensing images, many different machine learning algorithms have been developed to classify the data [21], [22], [23]. Supervised classification is the main technique, and it requires labeled samples for training the classifiers. Given a trained supervised classifier, remote sensing images can be classified automatically. However, supervised classifiers are highly dependent on the amount and quality of the training samples [24]. Collecting good-quality samples (e.g., informative and non-redundant) is therefore vital.
Manually selecting regions of interest in the HSI as training samples is a common approach, but this procedure is very expensive in most real-world applications. As HSIs have very high dimensionality, it is more difficult to design classifiers using only a few labeled data points than it is for multispectral images [11]. This paper focuses on HSI classification with a few labeled data points. Two popular machine learning approaches have been developed to solve this problem: semi-supervised learning and active learning (AL). Semi-supervised algorithms combine the unlabeled samples with the labeled samples to find a classifier with better boundaries [25], [26], [27]. An overview of semi-supervised classification techniques can be found in [12]. In contrast, AL assumes that an initial classifier trained on a small number of labeled samples exists, and it can provide better classification results with a small number of labeled samples. AL is an iterative process: in each iteration, the most informative unlabeled samples are chosen for manual labeling. In this way, the unnecessary and redundant labeling of non-informative samples is avoided, greatly reducing the labeling cost and time. Moreover, AL allows one to reduce the computational complexity of the training phase. Batch-mode AL, in which a batch of unlabeled samples is queried at each iteration, is expected to be more suitable for HSI classification, because it speeds up the sample selection and reduces the number of iterations [28].
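The iterative process described above can be illustrated with a minimal pool-based sketch. It is not the method proposed in this paper: the nearest-centroid classifier, the margin-based query rule, and the names `fit_centroids`, `margin`, `oracle`, and `active_learning` are all assumptions chosen to keep the example self-contained, and at least two classes are assumed to be present in the initial training set.

```python
import math

def fit_centroids(samples, labels):
    """Toy stand-in classifier (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def margin(centroids, x):
    """Gap between the two smallest centroid distances; a small gap means
    the sample is uncertain. Requires at least two classes."""
    d = sorted(math.dist(x, c) for c in centroids.values())
    return d[1] - d[0]

def active_learning(train_x, train_y, pool, oracle, batch_size, n_iters):
    """Batch-mode loop: retrain, query a batch, have the oracle label it."""
    train_x, train_y, pool = list(train_x), list(train_y), list(pool)
    for _ in range(n_iters):
        if not pool:
            break
        model = fit_centroids(train_x, train_y)
        # Query step: take the batch_size most uncertain pool samples.
        pool.sort(key=lambda x: margin(model, x))
        batch, pool = pool[:batch_size], pool[batch_size:]
        # Labeling step: the oracle (human supervisor) provides the labels.
        train_x += batch
        train_y += [oracle(x) for x in batch]
    return fit_centroids(train_x, train_y), train_x, train_y, pool
```

With `batch_size=5` and `n_iters=4`, only 20 pool samples are sent to the oracle, regardless of the pool size. Note that this naive top-k query ignores redundancy inside the batch, which is the issue the diversity criterion is meant to address.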
The ideal outcome for batch-mode AL is to select the most informative batch of samples with as little redundancy as possible, so that they provide the classifier with the information it is most uncertain about. At the same time, batch-mode AL can also increase the speed of the sample selection and reduce the number of iterations [29]. Querying the unlabeled samples involves two main phases: uncertainty and diversity [30], [31], [32]. In the first phase, the most informative samples are queried with the uncertainty criterion. Some of the queried samples may be very similar to each other, in which case querying just one of them is enough, so the redundancy among them needs to be removed. In the second phase, the diversity criterion is used to reduce the redundancy among the samples queried in the first phase. The uncertainty criterion has been studied extensively, and the conventional uncertainty criteria of batch-mode AL can be grouped into three categories: 1) query by committee, in which the uncertainty of an unlabeled sample is measured by the disagreement of several classifiers [33], [34], [35]; 2) the posterior probability based methods, where the posterior probability is used to measure the uncertainty of the candidates [36], [37]; and 3) the large margin heuristic based methods, where the uncertainty of the candidates is measured by the distance to the margin of the classifier, such as a support vector machine (SVM) [38], [39].
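To make the three families of uncertainty criteria concrete, the sketch below gives one common representative of each. These are illustrative choices and not necessarily the exact measures used in the cited works: vote entropy for query by committee, the gap between the two largest posteriors (often called breaking ties) for the posterior probability based methods, and the geometric distance to a linear decision boundary for the large margin heuristic. The function names are hypothetical.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Query by committee: entropy of the labels predicted by the committee.
    Higher entropy = more disagreement = more uncertain."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())

def breaking_ties(posteriors):
    """Posterior based: gap between the two largest class posteriors.
    Smaller gap = more uncertain."""
    top = sorted(posteriors, reverse=True)
    return top[0] - top[1]

def hyperplane_distance(w, b, x):
    """Large margin heuristic for a linear classifier w.x + b = 0:
    geometric distance of x to the boundary. Smaller = more uncertain."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))
```

Note that the three scores have different orientations: a sample is a good query candidate when its vote entropy is high, but when its posterior gap or its distance to the boundary is low.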
However, current research pays less attention to the diversity criterion. The diversity criteria are mainly clustering algorithms, such as k-means [40] and its kernel version [41], which depend on correct convergence and are usually influenced by the adequacy of the initialization [42]. Moreover, these algorithms have to be given the number of cluster centers beforehand. In addition, the data queried by such methods are not guaranteed to be i.i.d. samples from the original data distribution, as they are selectively sampled based on the AL criterion [43]. At the same time, these methods do not fully use the label information, and they split the uncertainty and diversity criteria into two separate steps. In fact, using either kind of criterion alone may not be sufficient to get the optimal results.
This paper proposes a new diversity criterion, extends the empirical risk minimization principle to the AL case, and presents a novel AL framework. The framework adopts the maximum mean discrepancy (MMD) to measure the distribution difference and derives an empirical upper bound for the AL risk. Minimizing this upper bound approximately minimizes the true risk under the original data distribution. The proposed framework queries the unlabeled samples using both discriminative and representative information within one optimized formulation. Our goal is to query a subset of unlabeled samples that are both discriminative and representative. The contributions of this manuscript can be summarized as follows:
- (1) In the proposed framework, the MMD is adopted, so that the queried samples are not only diverse but also preserve the distribution of the original data. This strategy can rapidly reduce the empirical risk on the training data.
- (2) The discriminative and representative information are combined in one optimization formulation, with a weight parameter controlling the trade-off between them, so the queried samples can contain both discriminative and representative information.
- (3) The proposed method is suitable for multi-class problems, and the number of queried samples is adaptive. Furthermore, only the most uncertain samples are selected in the preparation procedure, so the proposed method can be applied to large-scale data.
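As a rough illustration of how an MMD term and an uncertainty term can be traded off by a single weight, the sketch below computes a biased empirical estimate of the squared MMD with an RBF kernel and uses it in a greedy selection rule. This is only a stand-in under stated assumptions: the actual optimization problem of the proposed framework is formulated in Section 3 and is not reproduced here, the batch size is fixed rather than adaptive, and the names `rbf`, `mmd2`, `greedy_query`, `beta`, and `gamma` are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def mmd2(xs, ys, gamma=1.0):
    """Biased empirical squared MMD between sample sets xs and ys:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    def mean_k(us, vs):
        return sum(rbf(u, v, gamma) for u in us for v in vs) / (len(us) * len(vs))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

def greedy_query(pool, uncertainty, batch_size, beta=1.0, gamma=1.0):
    """Greedy stand-in (an assumption, not the paper's solver).
    uncertainty[i] is larger for more uncertain samples; beta weights the
    representativeness (MMD) term against the uncertainty term."""
    chosen = []
    remaining = list(range(len(pool)))
    for _ in range(min(batch_size, len(pool))):
        def cost(i):
            subset = [pool[j] for j in chosen + [i]]
            # Low MMD to the pool = representative; high uncertainty = discriminative.
            return beta * mmd2(subset, pool, gamma) - uncertainty[i]
        best = min(remaining, key=cost)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

With equal uncertainty scores, the MMD term alone drives the selection, so a pool made of two well-separated clusters yields one query from each cluster rather than two near-duplicates. Increasing `beta` favors representative samples, while decreasing it favors uncertain ones.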
The remainder of this paper is organized as follows. Section 2 presents the recent research into batch-mode AL in remote sensing image classification. Section 3 formulates the proposed batch-mode AL framework. Section 4 describes the experiments with two benchmark hyperspectral datasets, the Indian Pines and Washington DC datasets, and presents the experimental results in comparison with the other state-of-the-art batch-mode AL methods. Finally, Section 5 summarizes the paper.
Section snippets
The framework of conventional batch-mode active learning
The conventional AL methods can be modeled as a quintuple (F, Q, D, T, U) [44], where F is a supervised classifier trained on the labeled training set T, Q is the query function used to select the most informative unlabeled samples from a pool U of unlabeled samples, and D is a supervisor that can correctly label a batch of the most informative samples queried by Q. AL is an iterative process, in which the supervisor D labels the most informative samples queried by the query function Q. For the
The proposed batch-mode active learning framework
In this paper, we combine the discriminative and representative information into one optimization formulation as the diversity criterion, and select margin sampling (MS) and multiclass-level uncertainty (MCLU) as the uncertainty criteria. In the proposed method, the number of queried samples is adaptive. Meanwhile, the queried samples have the same distribution as the original data and are independent and identically distributed (i.i.d.).
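For reference, MS and MCLU are commonly defined on the per-class decision values of a one-against-all SVM. The sketch below follows those common definitions and adds a simple pre-selection of the most uncertain candidates. The function names and the parameter `h` are hypothetical, and the decision values are assumed to be supplied by an already trained classifier.

```python
def ms_score(decision_values):
    """Margin sampling: smallest absolute distance to any of the
    one-against-all hyperplanes. Smaller = more uncertain."""
    return min(abs(v) for v in decision_values)

def mclu_score(decision_values):
    """Multiclass-level uncertainty: gap between the largest and the
    second-largest decision value. Smaller = more uncertain."""
    top = sorted(decision_values, reverse=True)
    return top[0] - top[1]

def preselect(pool_decision_values, score, h):
    """Indices of the h most uncertain pool samples under the given score."""
    ranked = sorted(range(len(pool_decision_values)),
                    key=lambda i: score(pool_decision_values[i]))
    return ranked[:h]
```

Restricting the later, more expensive selection step to the `h` pre-selected candidates keeps its cost independent of the full pool size.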
In the conventional batch-mode AL methods, the query function is
Experiments and analysis
We used two benchmark HSI datasets in the experiments [51] and compared the results of the proposed method with those of other state-of-the-art methods. We then analyzed the proposed method according to the experimental results.
Conclusion
In this paper, we generalize the empirical risk minimization principle to the active learning setting and propose a novel active learning framework. By effectively combining the representative term and the discriminative term, we query the samples which are expected to rapidly reduce the empirical risk while preserving the original source distribution. This enables our method to achieve consistently good performance during the whole active learning process. The superior performance of
Zengmao Wang received the B.S. degree in surveying and mapping engineering from Central South University, Changsha, China, in 2013, and is currently pursuing the M.S. degree at the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing (LIESMARS). His research interests include hyperspectral image processing and machine learning.
References (54)
- et al. Target detection based on a dynamic subspace. Pattern Recognit. (2014)
- et al. Recent advances in techniques for hyperspectral image processing. Remote Sens. Environ. (2009)
- et al. Column-generation kernel nonlocal joint collaborative representation for hyperspectral image classification. ISPRS J. Photogramm. Remote Sens. (2014)
- et al. An efficient semi-supervised classification approach for hyperspectral imagery. ISPRS J. Photogramm. Remote Sens. (2014)
- et al. Semi-supervised multiview embedding for hyperspectral data classification. Neurocomputing (2014)
- et al. Active learning with adaptive regularization. Pattern Recognit. (2011)
- et al. Ensemble manifold regularized sparse low-rank approximation for multiview feature embedding. Pattern Recognit. (2015)
- et al. Multiresolution imaging. IEEE Trans. Cybern. (2014)
- et al. Alternatively constrained dictionary learning for image superresolution. IEEE Trans. Cybern. (2014)
- et al. Discriminant locally linear embedding with high-order tensor data. IEEE Trans. Syst. Man Cybern. B Cybern. (2008)
- Manifold regularized multitask learning for semi-supervised multilabel image classification. IEEE Trans. Image Process.
- Signal processing for hyperspectral image exploitation. IEEE Signal Process. Mag.
- Remote sensing image subpixel mapping based on adaptive differential evolution. IEEE Trans. Syst. Man Cybern. B Cybern.
- A discriminative manifold learning based dimension reduction method for hyperspectral classification. Int. J. Fuzzy Syst.
- Signal Theory Methods in Multispectral Remote Sensing.
- Learning with Labeled and Unlabeled Data.
- Spectral-spatial classification of hyperspectral data using loopy belief propagation and active learning. IEEE Trans. Geosci. Remote Sens.
- Hyperspectral image classification through bilayer graph-based learning. IEEE Trans. Image Process.
- SVM active learning approach for image classification using spatial information. IEEE Geosci. Remote Sens.
- Semi-supervised discriminative locally enhanced alignment for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens.
- Spectral-spatial classification of hyperspectral data based on a stochastic minimum spanning forest approach. IEEE Trans. Image Process.
- Anomaly detection and reconstruction from random projections. IEEE Trans. Image Process.
- Generalized composite kernel framework for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens.
- A batch-mode active learning algorithm using region-partitioning diversity for SVM classifier. IEEE J. Sel. Top. Appl. Earth Obs.
- A novel transductive SVM for semisupervised classification of remote-sensing images. IEEE Trans. Geosci. Remote Sens.
Bo Du (M’10–SM’15) received the B.S. and Ph.D. degrees in photogrammetry and remote sensing from the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China, in 2005 and 2010, respectively. He is currently an associate professor with the School of Computer, Wuhan University, Wuhan, China. He has published more than 40 research papers in the IEEE Transactions on Geoscience and Remote Sensing (TGRS), IEEE Transactions on Image Processing (TIP), IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS), IEEE Geoscience and Remote Sensing Letters (GRSL), and other journals. His major research interests include pattern recognition, hyperspectral image processing, and signal processing. He is currently a senior member of the IEEE. He received the best reviewer award from the IEEE GRSS for his service to JSTARS in 2011 and the ACM rising star award for his academic progress in 2015. He was the Session Chair for the 4th IEEE GRSS Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS). He also serves as a reviewer for 20 Science Citation Index (SCI) journals, including IEEE TGRS, TIP, JSTARS, and GRSL.
Lefei Zhang (S'11-M'14) received the B.S. and Ph.D. degrees from Wuhan University, Wuhan, China, in 2008 and 2013, respectively. From August 2013 to July 2015, he was with the School of Computer, Wuhan University, as a Postdoctoral Researcher, and he was a Visiting Scholar with the CAD & CG Lab, Zhejiang University in 2015. He is currently a lecturer with the School of Computer, Wuhan University, and also a Hong Kong Scholar with the Department of Computing, Hong Kong Polytechnic University, Hong Kong. His research interests include pattern recognition, image processing, and remote sensing. Dr. Zhang is a reviewer of more than twenty international journals, including the IEEE TIP, TNNLS, and TGRS.
Liangpei Zhang (M'06–SM'08) received the B.S. degree in physics from Hunan Normal University, Changsha, China, in 1982, the M.S. degree in optics from the Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China, in 1988, and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1998. He is currently the head of the Remote Sensing Division, State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing (LIESMARS), Wuhan University. He is also a "Chang-Jiang Scholar" chair professor appointed by the Ministry of Education of China. He is currently a principal scientist for the China State Key Basic Research Project (2011–2016) appointed by the Ministry of National Science and Technology of China to lead the remote sensing program in China. He has more than 450 research papers and five books. He is the holder of 15 patents. His research interests include hyperspectral remote sensing, high-resolution remote sensing, image processing, and artificial intelligence. Dr. Zhang is the founding chair of the IEEE Geoscience and Remote Sensing Society (GRSS) Wuhan Chapter. He received the best reviewer awards from the IEEE GRSS for his service to the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS) in 2012 and IEEE Geoscience and Remote Sensing Letters (GRSL) in 2014. He was the General Chair for the 4th IEEE GRSS Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS) and the guest editor of JSTARS. His research teams won the top three prizes of the IEEE GRSS 2014 Data Fusion Contest, and his students have been selected as the winners or finalists of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS) student paper contest in recent years. Dr. Zhang is a Fellow of the Institution of Engineering and Technology (IET), executive member (board of governor) of the China National Committee of the International Geosphere–Biosphere Programme, executive member of the China Society of Image and Graphics, etc. He was a recipient of the 2010 best paper Boeing award and the 2013 best paper ERDAS award from the American Society of Photogrammetry and Remote Sensing (ASPRS). He regularly serves as a Co-chair of the series SPIE conferences on multispectral image processing and pattern recognition, conference on Asia remote sensing, and many other conferences. He edits several conference proceedings, issues, and geoinformatics symposiums. He also serves as an associate editor of the International Journal of Ambient Computing and Intelligence, International Journal of Image and Graphics, International Journal of Digital Multimedia Broadcasting, Journal of Geo-spatial Information Science, and Journal of Remote Sensing, and the guest editor of the Journal of Applied Remote Sensing and Journal of Sensors. Dr. Zhang is currently serving as an associate editor of the IEEE Transactions on Geoscience and Remote Sensing.
☆ This work was supported in part by the National Basic Research Program of China (973 Program) under Grant 2012CB719905, the National Natural Science Foundation of China under Grants 61471274, 61401317 and 41431175, and the Natural Science Foundation of Hubei Province under Grant 2014CFB193.