Abstract
Semi-supervised classification is an active research topic in pattern recognition and machine learning. However, in the presence of heavy noise and outliers, the unlabeled training data can be very challenging or even misleading for a semi-supervised classifier. In this paper, we propose a novel structure regularized self-paced learning method for semi-supervised classification problems, which efficiently learns from partially labeled training data sequentially, from simple samples to complex ones. The proposed formulation consists of three components: a cost function defined by a mixture of losses, a functional complexity regularizer, and a self-paced regularizer. The corresponding optimization algorithm involves three iterative steps: classifier updating, sample importance calculation, and pseudo-labeling. In the proposed method, the cost function used for classifier updating and sample importance calculation is defined as a combination of a label fitting loss and a manifold smoothness loss. The importance of the pseudo-labeled and unlabeled samples is then calculated adaptively from this cost. Unlabeled samples with high importance values are pseudo-labeled with their current predictions. In this way, labels are efficiently propagated from the labeled samples to the unlabeled ones in a robust, self-paced manner. Experimental results on several benchmark data sets demonstrate the effectiveness of the proposed method.
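The three iterative steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the linear least-squares classifier, the kNN graph Laplacian, the hard 0/1 importance weights, the fitting loss used for unlabeled samples, and every name and parameter (`self_paced_ssl`, `gamma_a`, `gamma_i`, `lam`, `growth`, `k`) are assumptions chosen only to make the loop concrete for binary labels in {-1, +1}.

```python
import numpy as np

def self_paced_ssl(X, y, labeled, n_iter=5, gamma_a=1e-2, gamma_i=1e-1,
                   lam=0.5, growth=1.3, k=5):
    """Hypothetical sketch; y is ignored wherever labeled is False."""
    n, d = X.shape
    # Assumed graph construction: symmetric kNN graph and its Laplacian.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(sq, axis=1)[:, 1:k + 1]      # skip self at column 0
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(1)) - W

    Xb = np.hstack([X, np.ones((n, 1))])          # append a bias column
    target = np.where(labeled, y, 0.0).astype(float)
    has_label = labeled.copy()                    # labeled or pseudo-labeled
    v = labeled.astype(float)                     # sample importance weights

    for _ in range(n_iter):
        # Step 1: classifier update (weighted label fitting + complexity
        # regularizer + manifold smoothness term).
        V = np.diag(v * has_label)
        A = Xb.T @ V @ Xb + gamma_a * np.eye(d + 1) + gamma_i * Xb.T @ L @ Xb
        w = np.linalg.solve(A, Xb.T @ V @ target)
        f = Xb @ w

        # Step 2: sample importance from the mixed cost. For samples with no
        # label yet, the fit is measured against sign(f) (an assumption).
        ref = np.where(has_label, target, np.sign(f))
        fit = (f - ref) ** 2
        smooth = (W * (f[:, None] - f[None, :]) ** 2).sum(1) / W.sum(1)
        loss = fit + gamma_i * smooth
        v = np.where(labeled, 1.0, (loss < lam).astype(float))

        # Step 3: pseudo-label important unlabeled samples with predictions.
        newly = (~has_label) & (v == 1)
        target[newly] = np.sign(f[newly])
        has_label |= newly

        lam *= growth                             # admit harder samples later

    return w, np.sign(Xb @ w)
```

Growing the threshold `lam` each round is what makes the procedure self-paced in this sketch: easy, low-cost samples enter training first, and harder ones are admitted only after the classifier has stabilized on the easy ones.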
Acknowledgements
This work was supported in part by the National Natural Science Foundation (NNSF) of China [Grant Numbers: 61503263, 61772373, 61772374], in part by the Zhejiang Provincial Natural Science Foundation [Grant Numbers: LY15F030011, LY17F030004], and in part by the Project of Science and Technology Plans of Wenzhou City [Grant Number: G20160002].
About this article
Cite this article
Gu, N., Fan, P., Fan, M. et al. Structure regularized self-paced learning for robust semi-supervised pattern classification. Neural Comput & Applic 31, 6559–6574 (2019). https://doi.org/10.1007/s00521-018-3478-1
DOI: https://doi.org/10.1007/s00521-018-3478-1