Prior class dissimilarity based linear neighborhood propagation
Introduction
In recent years, learning from both labeled and unlabeled samples, known as semi-supervised learning (SSL), has emerged as a thriving direction in machine learning research. A detailed survey of the related literature is presented in [1].
As a major family of semi-supervised learning methods, graph-based methods have attracted increasing research attention and have been widely applied in many areas, such as text categorization [2], image retrieval [3], and image annotation [4], [5].
Encouraging results have been reported when the samples have a clear intrinsic structure and the test data are well sampled. Nevertheless, as shown in the following sections of this paper, these algorithms are less effective when confronted with class overlap and imbalanced data distributions, which can make the choice of neighbors unreasonable and destroy label smoothness when the graph is constructed.
In this paper, we exploit prior class information in the framework of graph-based semi-supervised learning and propose a novel method named Class Dissimilarity based Linear Neighborhood Propagation (CD-LNP). Unlike the traditional graph-based semi-supervised learning schemes mentioned above, CD-LNP utilizes the class labels of the input data to guide the learning process. As a result, the interclass dissimilarity is made strictly larger than the intraclass dissimilarity, which is a desirable property for classification.
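The idea of using class labels to enlarge interclass dissimilarity can be illustrated with a minimal sketch. This is not the CD-LNP construction detailed in Section 3; the function name and the inflation factor `alpha` are hypothetical, chosen only to show how a label-aware distance can push different-class pairs apart and pull same-class pairs together.

```python
import numpy as np

def class_aware_distances(X, y, alpha=2.0):
    """Hypothetical sketch of a label-aware dissimilarity.

    X: (n, d) feature matrix; y: (n,) class labels, -1 for unlabeled.
    Pairs of labeled points from different classes have their Euclidean
    distance inflated by alpha; same-class pairs are shrunk by alpha.
    Pairs involving an unlabeled point keep the plain Euclidean distance.
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))           # plain Euclidean distances
    labeled = y >= 0
    both = labeled[:, None] & labeled[None, :]
    same = (y[:, None] == y[None, :]) & both   # same class, both labeled
    cross = (y[:, None] != y[None, :]) & both  # different classes, both labeled
    D[cross] *= alpha                          # push interclass pairs apart
    D[same] /= alpha                           # pull intraclass pairs together
    return D
```

With such a dissimilarity, a kNN graph built on top of it is less likely to connect points across class boundaries in overlapping regions.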
The rest of this paper is organized as follows. In Section 2, we briefly introduce traditional graph-based semi-supervised learning schemes and analyze their limitations. The proposed CD-LNP strategy is detailed in Section 3. In Section 4, experiments are reported. Finally, in Section 5, conclusions are drawn and several issues for future work are indicated.
Section snippets
Related works
Graph-based schemes are typical approaches to semi-supervised learning [6], such as FAS (Frequent Approximate Subgraph) in [7] and DLP (Dynamic Label Propagation) in [8]. In these methods, the labeled and unlabeled sample points are first organized as the nodes of a graph, in which the edge directly connecting two nodes has a weight proportional to the proximity of the two sample points. Then, labels are "propagated" along the weighted edges from labeled nodes to unlabeled ones, in order to get
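The propagation step described above can be sketched as a generic iterative label-propagation loop. This is a minimal illustration in the spirit of these methods (clamped labeled nodes, diffusion along a row-normalized weight matrix), not the exact algorithm of any of the cited papers.

```python
import numpy as np

def propagate_labels(W, y, n_iter=200):
    """Generic label propagation on a weighted graph.

    W: (n, n) symmetric nonnegative edge weights (every node has an edge).
    y: (n,) labels in {0..c-1}, with -1 marking unlabeled nodes.
    Returns predicted labels for all nodes.
    """
    n = len(y)
    c = y.max() + 1
    F = np.zeros((n, c))                      # soft label matrix
    labeled = y >= 0
    F[labeled, y[labeled]] = 1.0              # clamp labeled nodes
    P = W / W.sum(axis=1, keepdims=True)      # row-normalized transitions
    for _ in range(n_iter):
        F = P @ F                             # diffuse labels along edges
        F[labeled] = 0.0
        F[labeled, y[labeled]] = 1.0          # re-clamp after each step
    return F.argmax(axis=1)
```

On a graph whose edges respect the class structure, labels spread from the few labeled nodes to their connected components.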
The algorithm
Graph-based semi-supervised learning starts by constructing a graph from the training data. These algorithms often resort to the kNN method when specifying the edge weights: each vertex selects its k nearest neighbor vertices in Euclidean distance. Therefore, selecting precise neighbors is of great importance. However, in real applications, there always exist data regions with class overlap and imbalanced distributions. These may cause an unreasonable choice of neighbors and destroy label
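The kNN graph construction mentioned above can be sketched as follows. This is the standard construction; the Gaussian kernel used for the edge weights is an assumed (though common) choice, not necessarily the one used by any particular method in this paper.

```python
import numpy as np

def knn_graph(X, k=3, sigma=1.0):
    """Build a symmetric k-nearest-neighbor graph.

    X: (n, d) feature matrix. Each vertex is connected to its k nearest
    neighbors in Euclidean distance; edge weights decay with distance
    via a Gaussian kernel with bandwidth sigma.
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))          # pairwise Euclidean distances
    n = len(X)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(D[i])
        neighbors = order[1:k + 1]            # skip the point itself
        W[i, neighbors] = np.exp(-D[i, neighbors] ** 2 / (2 * sigma ** 2))
    return np.maximum(W, W.T)                 # symmetrize the graph
```

As the text notes, in overlapping or imbalanced regions the k nearest neighbors in plain Euclidean distance may belong to a different class, which is precisely the failure mode CD-LNP is designed to address.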
Experiments
In this section, we present a set of experiments in which CD-LNP is used for semi-supervised classification. To evaluate the performance of the proposed method, we compare CD-LNP with four other popular graph-based semi-supervised learning methods: Mincut, LGC, GRF and LNP.
Conclusion
In this paper, we presented a novel graph-based semi-supervised classification approach, called Class Dissimilarity based Linear Neighborhood Propagation. It is novel in its graph structure construction and weight estimation. The approach can be cast into the second-order intrinsic Gaussian Markov random field framework, and is equivalent to solving a biharmonic equation with Dirichlet boundary conditions. Experimental results demonstrate the effectiveness of the proposed method.
We can conclude
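The biharmonic connection stated in the conclusion can be written out as follows. This is a sketch of the standard LNP-style formulation (symbols follow common LNP notation rather than this paper's): each label is reconstructed from its neighbors' labels, which yields a squared, second-order graph operator.

```latex
% LNP-style reconstruction objective, with W the neighborhood weights:
\min_{f}\; \sum_{i}\Big(f_i - \sum_{j \in N(i)} w_{ij} f_j\Big)^{2}
  \;=\; f^{\top}(I - W)^{\top}(I - W)\, f,
\qquad \text{s.t. } f_i = y_i \text{ for labeled } i.
% The operator (I - W)^{\top}(I - W) is a squared (second-order) graph
% operator, the discrete analogue of the biharmonic operator \Delta^{2};
% the constrained minimizer therefore satisfies a discrete biharmonic
% equation whose Dirichlet boundary conditions are the labeled nodes.
```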
Acknowledgement
Both authors would like to acknowledge the support of the Key Laboratory of Electronic Restriction and the National Natural Science Foundation of China (Nos. 61179036 and 60872113).
References (15)
- et al., Label propagation through sparse neighborhood and its applications, Neurocomputing (2012)
- X. Zhu, Semi-supervised Learning Literature Survey (Technical Report 1530), Computer Sciences, University of...
- et al., Manifold adaptive experimental design for text categorization, IEEE Trans. Knowl. Data Eng. (2012)
- et al., Hypergraph-based image retrieval for graph-based representation, Pattern Recogn. (2012)
- et al., Web and personal image annotation by mining label correlation with relaxed visual graph embedding, IEEE Trans. Image Process. (2012)
- et al., Graph-based semi-supervised learning with multi-modality propagation for large-scale image datasets, J. Vis. Commun. Image R. (2013)
- X.J. Zhu, A.B. Goldberg, T. Khot, Some new directions in graph-based semi-supervised learning, in: IEEE International...
Cited by (17)
- Graph-based semi-supervised learning: A review
  2020, Neurocomputing
- Adaptive non-negative projective semi-supervised learning for inductive classification
  2018, Neural Networks
  Citation Excerpt: The transductive learning methods aim to estimate the unknown labels of inside unlabeled data, but they cannot predict the unknown labels of outside unlabeled data. Several representative transductive LP learning algorithms consist of SSL using Gaussian Fields and Harmonic Functions (GFHF) (Zhu, Ghahramani, & Lafferty, 2003), Learning with Local and Global Consistency (LLGC) (Zhou, Bousquet, Lal, Weston, & Scholkopf, 2004), Linear Neighborhood Propagation (LNP) (Wang & Zhang, 2006), Special Label Propagation (SLP) (Nie, Xiang, & Liu, 2010), Projective Label Propagation (ProjLP) (Zhang, Jiang, & Li, 2015), Class Dissimilarity based LNP (CD-LNP) (Zhang, Wang, & Li, 2015), Robust Linear Neighborhood Propagation (R-LNP) (Jia, Zhang, & Jiang, 2016), and Sparse Neighborhood Propagation (SparseNP) (Zhang et al., 2015), etc. It is worth noting that several researchers have also incorporated the idea of semi-supervised label propagation learning into the Non-Negative Matrix Factorization (NMF) (Lee, 2001) and the Projective NMF (PNMF) frameworks (Yang & Oja, 2010), termed Semi-Supervised NMF (SSNMF) (Lee, Yoo, & Choi, 2010) and Semi-Supervised PNMF (Semi-PNMF) (Zhang, Guan, Jia, Qiu, & Luo, 2015).
- Discriminative clustering on manifold for adaptive transductive classification
  2017, Neural Networks
  Citation Excerpt: That is, we mainly evaluate our algorithm by quantitative evaluation of image classification and visual observation of image segmentation. Note that the classification performance of our model is mainly compared with several related label propagation models, including SLP (Nie, Xiang et al., 2010), LNP (Wang & Zhang, 2008), LLGC (Zhou et al., 2004), LapLDA (Tang et al., 2006), GFHF (Zhu et al., 2003) and CD-LNP (Zhang et al., 2015). For fair comparison, all experiments are repeated 20 times and the averaged results are illustrated for each method to avoid the bias.
- Projective label propagation by label embedding: A deep label prediction framework for representation and classification
  2017, Knowledge-Based Systems
  Citation Excerpt: Next, we will briefly review the several popular transductive LP criteria and out-of-sample extensions, which are related to our formulations. Several researchers have also proposed two-stage approaches based on LP, i.e., using an independent follow-up step and employing the outputted soft labels to construct the soft scatter matrices for semi-supervised discriminant analysis based image classification and retrieval, e.g., [4,8,28,29]. Thus, a dimension reduction based projection is delivered for embedding new data.
- Robust and sparse label propagation for graph-based semi-supervised classification
  2022, Applied Intelligence
- Robust triple-matrix-recovery-based auto-weighted label propagation for classification
  2020, IEEE Transactions on Neural Networks and Learning Systems