Semi-supervised learning with connectivity-driven convolutional neural networks
Introduction
Predicting and clustering samples are crucial tasks in several application domains. Such processes can be performed in different ways and adopted effectively when the dataset is entirely labeled. Unfortunately, as the number of samples increases, the classifier may lack stability due to the limited amount of labeled samples, since most of them are labeled manually, which is a time-consuming and error-prone task. Therefore, a question arises naturally: can one improve the performance of a classifier using both labeled and unlabeled samples? That is the primary concern driving techniques based on the semi-supervised learning framework.
Such approaches make use of a labeled set extended with unlabeled samples, the latter usually available in much larger quantity. They either perform classification using supervised methods for label propagation [4], [8], [25], [36] or take into account the spatial distribution of the entire training set, i.e., both labeled and unlabeled samples, for learning purposes [1], [19]. These methods include self-training, generative probabilistic models, co-training, graph-based models, and semi-supervised Support Vector Machines, among others.
Deep learning has attracted considerable attention in the past years [6]; it corresponds to a relatively broad class of machine learning techniques that employ complex neural architectures to perform classification. Such approaches encode non-linear information through several hierarchical layers, thus modeling problems at different levels of abstraction. However, as argued by several researchers [7], [16], labeled or unlabeled data alone may not be sufficient to provide good performance. Therefore, the additional use of unlabeled data can be beneficial to improve classification performance.
Several works have recently proposed methods that use unlabeled samples to improve learning in deep neural networks. In a nutshell, the network is pre-trained using unlabeled samples for later adjustment using labeled data [16], [33]. Lee [23] presented a semi-supervised learning method for deep neural networks that learns from unlabeled and labeled samples simultaneously, where the former are pre-labeled using softmax outputs. Another approach to pseudo-labeling was discussed by Wu and Prasad [42], where semi-supervised learning was used for the classification of hyperspectral images, with pseudo-labeled samples training a deep recurrent convolutional network. Their experimental results demonstrated that the approach outperforms recent supervised and semi-supervised learning methods for hyperspectral image classification.
Wu et al. [41] proposed a weakly semi-supervised deep learning approach for the annotation of multi-label images using Convolutional Neural Networks (CNNs), where the idea was to use images that were weakly labeled or even unlabeled to train a deep neural network. A weighted pairwise ranking loss was employed to cope with the weakly labeled images, while a triplet similarity loss was applied to harness unlabeled images. Gao et al. [14] also presented a semi-supervised algorithm using CNNs, but in the context of active learning, which is used to find the most representative unlabeled samples, together with a new regularization term in the loss function. Weston et al. [40] concentrated on the idea of combining an embedding-based regularizer with a supervised learner to perform semi-supervised learning, as used by techniques such as LapSVM [5].
However, we have observed that only recently have works considered the semi-supervised framework together with deep learning techniques. The approach proposed in this paper tries to fill this gap by using semi-supervised learning to enhance the performance of Convolutional Neural Networks. Specifically, we show that our recent approach based on the Optimum-Path Forest (OPF) classifier [27], [30], [31] can outperform several other works in the literature, since it considers the optimum connectivity between labeled and unlabeled samples.
In a nutshell, the proposed approach works as follows: first, all available training samples (labeled and unlabeled) are pseudo-labeled using OPF, and the entire training set is then used to train a Convolutional Neural Network. The semi-supervised learning approach connects unlabeled and labeled samples as nodes of a minimum-spanning tree and partitions the tree into an optimum-path forest rooted at the labeled nodes. The adjacency relation is defined as the set of arcs of a Minimum-Spanning Tree (MST) of the complete graph whose nodes are the labeled and unlabeled samples. We then simplify the choice of the forest roots to be all labeled samples, so the classifier is created from a single execution of the OPF algorithm on the topology of the MST. Therefore, labeled nodes compete with each other, and the pseudo-label assigned to each unlabeled sample comes from its most closely connected labeled node. Finally, the network is fine-tuned using only the training samples whose true labels are known, i.e., those that were labeled from the beginning. Notice that we fine-tune only the fully-connected layers, since we assume the labeled samples are limited in quantity; therefore, the pre-trained layers and the weights learned from the entire training set are kept unchanged.
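The pseudo-labeling step described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes Euclidean arc weights and the OPF f_max connectivity function (the cost of a path is its maximum arc weight), and the function name `opf_semi_mst` is ours. Each unlabeled node receives the label of the labeled root that reaches it through the cheapest path restricted to the MST topology.

```python
import heapq
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def opf_semi_mst(X, y):
    """Propagate pseudo-labels from labeled samples (y >= 0) to
    unlabeled ones (y == -1) over a minimum-spanning tree, following
    the f_max connectivity: each sample receives the label of the
    labeled node whose path to it has the smallest maximum arc weight."""
    n = X.shape[0]
    W = cdist(X, X)                            # complete graph, Euclidean arcs
    mst = minimum_spanning_tree(W).toarray()   # adjacency restricted to the MST
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if mst[i, j] > 0:                  # store MST edges undirected
                adj[i].append((j, mst[i, j]))
                adj[j].append((i, mst[i, j]))
    cost = np.full(n, np.inf)
    label = y.copy()
    heap = []
    for i in range(n):
        if y[i] >= 0:                          # labeled samples are the roots
            cost[i] = 0.0
            heapq.heappush(heap, (0.0, i))
    while heap:                                # multi-source best-first search
        c, u = heapq.heappop(heap)
        if c > cost[u]:
            continue
        for v, w in adj[u]:
            new_cost = max(c, w)               # f_max path cost
            if new_cost < cost[v]:
                cost[v] = new_cost
                label[v] = label[u]            # conquered: inherit root's label
                heapq.heappush(heap, (new_cost, v))
    return label
```

On a toy set with two labeled seeds and two unlabeled points, each unlabeled point inherits the label of its nearest cluster, since the cross-cluster MST arc has a much larger weight than the intra-cluster ones.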
Therefore, the main contribution of this paper is a semi-supervised learning approach that improves the effectiveness of Convolutional Neural Networks and jointly considers the two main paradigms of semi-supervised learning: the proposed method takes into account the spatial distribution of labeled and unlabeled samples and also propagates pseudo-labels to the unlabeled samples. We show that the proposed approach can outperform some state-of-the-art semi-supervised learning algorithms.
The remainder of this paper is organized as follows. Section 2 discusses the Optimum-Path Forest methodology for semi-supervised learning, and Section 3 presents the proposed approach to improve the performance of Convolutional Neural Networks. The experimental results are presented in Sections 4 and 5, and Section 6 states conclusions and future works.
Optimum-path forest
For a given training set with labeled and unlabeled sample subsets, one can devise unsupervised classifiers from the latter [35], supervised classifiers from the former [27], [30], [31], and semi-supervised classifiers from both [1], [2], [3]. In all approaches, one or multiple executions of the Optimum-Path Forest algorithm [12] can be carried out for different choices of weighted graphs (Fig. 1a), defined by their sets of nodes and edges, and connectivity functions f.
The proposed method
Deep neural networks have a considerable number of parameters to be learned, which requires a large number of labeled samples. Application examples range from speech processing, automatic vowel classification, biometrics, well drilling monitoring, and medical image segmentation to object tracking [10], [15], [18], [26], [28], [29], [32]. Most of these applications face a limited amount of labeled samples, reflecting how expensive and time-consuming the labeling task is.
Experiments
In this section, we present the datasets and methodology and discuss the experimental results.
Results
First, we present the classification results on the three datasets using only the labeled samples for CNN training (named SUPMNIST, SUPCIFAR10, and SUPCOR). Notice that the very same architecture was used in all experiments, i.e., for both supervised and semi-supervised learning. The primary goal is to compare the performance of the proposed work by applying the semi-supervised methodology for pseudo-label propagation. Therefore, we expect to obtain a significant improvement compared to
Conclusion
In this work, we showed how one can improve supervised learning for deep architectures using a semi-supervised methodology. Our method makes use of the unlabeled data with pseudo-labels propagated by the OPFSEMImst semi-supervised method. Thus, we trained a CNN(1) network with all labeled and unlabeled samples (the latter with pseudo-labels) and further applied fine-tuning on a new CNN(2) network, using only the labeled samples and the weights learned from CNN(1).
Experimental results showed that the
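The two-stage training scheme summarized in the conclusion, pre-train on all pseudo-labeled samples, then fine-tune only the classification head on the truly labeled subset, can be illustrated with a minimal NumPy stand-in. This is not a CNN and not the paper's architecture; it is a one-hidden-layer network where freezing the hidden layer plays the role of keeping the pre-trained convolutional weights fixed. All names and hyperparameters are ours.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, W1, W2, lr=0.5, epochs=300, freeze_hidden=False):
    """Gradient descent on a one-hidden-layer network. When freeze_hidden
    is True, only the output layer W2 is updated, mimicking fine-tuning
    restricted to the fully-connected head."""
    Y = np.eye(W2.shape[1])[y]                 # one-hot targets
    for _ in range(epochs):
        H = np.tanh(X @ W1)                    # "pre-trained" feature layer
        P = softmax(H @ W2)                    # classification head
        G = (P - Y) / len(X)                   # cross-entropy output gradient
        if not freeze_hidden:                  # stage 1: update all layers
            W1 -= lr * X.T @ ((G @ W2.T) * (1 - H ** 2))
        W2 -= lr * H.T @ G                     # head is always updated
    return W1, W2

def predict(X, W1, W2):
    return np.argmax(np.tanh(X @ W1) @ W2, axis=1)

# Toy data: two well-separated clusters; every sample has a pseudo-label,
# but only four samples are treated as truly labeled.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
y_pseudo = np.array([0] * 20 + [1] * 20)
W1 = rng.normal(0, 0.5, (2, 8))
W2 = np.zeros((8, 2))

# Stage 1 (CNN(1) analogue): train on ALL samples with pseudo-labels.
W1, W2 = train(X, y_pseudo, W1, W2)

# Stage 2 (CNN(2) analogue): fine-tune ONLY the head on the labeled subset.
labeled = np.array([0, 1, 20, 21])
W1, W2 = train(X[labeled], y_pseudo[labeled], W1, W2, freeze_hidden=True)

acc = (predict(X, W1, W2) == y_pseudo).mean()
```

Freezing the feature layer during stage 2 matters when the labeled subset is tiny: updating all weights on four samples would risk overfitting, while the frozen features retain what was learned from the full (pseudo-labeled) training set.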
Declaration of Competing Interest
None.
Acknowledgments
The authors are grateful to Corumbá Concessões S.A. (ANEEL PD-2262-1602/2016), CNPq grants 427968/2018-6 and 307066/2017-7, as well as FAPESP grants 2013/07375-0, 2014/12236-1, 2015/25739-4, and 2016/19403-6. This material is based upon work supported in part by funds provided by Intel AI Academy program under Fundunesp Grant No.2597.2017.
References (43)
- et al., Multi-label semi-supervised classification through optimum-path forest, Inf. Sci. (2018)
- et al., Improving semi-supervised learning through optimum connectivity, Pattern Recognit. (2016)
- et al., A novel active semisupervised convolutional neural network algorithm for SAR image recognition, Comput. Intell. Neurosci. (2017)
- et al., Petroleum well drilling monitoring through cutting image analysis and artificial intelligence techniques, Eng. Appl. Artif. Intell. (2011)
- et al., Spoken emotion recognition through optimum-path forest classification using glottal features, Comput. Speech Lang. (2010)
- et al., Efficient supervised optimum-path forest classification for large datasets, Pattern Recognit. (2012)
- et al., Optimum-path forest based on k-connectivity: theory and applications, Pattern Recognit. Lett. (2017)
- et al., Semi-supervised pattern classification using optimum-path forest, 2014 27th SIBGRAPI Conference on Graphics, Patterns and Images (2014)
- et al., Semi-supervised clustering by seeding, Proceedings of the Nineteenth International Conference on Machine Learning (2002)
- et al., Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res. (2006)