Abstract
Semi-supervised kernel learning methods have received much attention in the past few years. Traditional semi-supervised non-parametric kernel learning (NPKL) methods usually formulate the learning task as a semi-definite programming (SDP) problem, which is very time-consuming to solve. Although some fast semi-supervised NPKL methods have been proposed recently, they still scale poorly. Furthermore, many semi-supervised NPKL methods are developed under the manifold assumption. Such an assumption may be invalid for high-dimensional and sparse data, which severely degrades the performance of learning algorithms. In this paper, we propose a more efficient semi-supervised NPKL method that effectively learns a low-rank kernel matrix from must-link and cannot-link constraints. Specifically, by virtue of nonlinear embedding functions based on the extreme learning machine (ELM), the proposed method can cope with data points that lack a clear manifold structure in a low-dimensional space. The method is formulated as a trace ratio optimization problem, combined with dimensionality reduction in the ELM feature space, which aims to find optimal low-rank kernel matrices. This optimization problem can be solved much more efficiently than with SDP solvers. Extensive experiments validate the superior performance of the proposed method compared with state-of-the-art semi-supervised kernel learning methods.
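The trace ratio formulation mentioned in the abstract is what lets the method avoid an SDP solver. A minimal sketch of the standard iterative eigendecomposition scheme for trace ratio maximization (following the "trace ratio problem revisited" approach the authors cite) is shown below; the matrices `A` and `B` here are illustrative stand-ins, not the paper's actual scatter terms:

```python
import numpy as np

def trace_ratio(A, B, d, n_iter=100, tol=1e-10):
    """Maximize tr(W^T A W) / tr(W^T B W) over orthonormal n x d
    matrices W by the Newton-like iteration of Jia, Nie and Zhang
    (2009): alternately fix the ratio lam and take the top-d
    eigenvectors of A - lam * B.  A is symmetric; B is symmetric
    positive definite so the denominator stays positive."""
    lam = 0.0
    W = None
    for _ in range(n_iter):
        # tr(W^T (A - lam*B) W) is maximized by the top-d eigenvectors.
        _, vecs = np.linalg.eigh(A - lam * B)
        W = vecs[:, -d:]
        new_lam = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return W, lam

# Illustrative symmetric A and positive-definite B.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M + M.T
B = M @ M.T + 6 * np.eye(6)
W, lam = trace_ratio(A, B, d=2)
```

Each iteration costs one eigendecomposition, which is far cheaper than solving an SDP of comparable size; the iteration is known to converge to the global optimum of the trace ratio.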
Abbreviations
- \( \mathbb{R}^{d} \): The input d-dimensional Euclidean space
- \( n \): The number of training data points
- \( c \): The number of classes that the samples belong to
- \( \varvec{X} \): \( \varvec{X} = \left[ \varvec{x}_{1}, \ldots, \varvec{x}_{n} \right] \in \mathbb{R}^{d \times n} \) is the training data matrix
- \( \varvec{Y} \): \( \varvec{Y} = \left[ \varvec{y}_{1}, \ldots, \varvec{y}_{n} \right]^{T} \in \mathbb{B}^{n \times c} \) is the 0–1 class assignment matrix. \( \varvec{y}_{i} \in \mathbb{B}^{c \times 1} \) is the label vector of \( \varvec{x}_{i} \), and all components of \( \varvec{y}_{i} \) are \( 0 \)s except one being \( 1 \)
- \( \varvec{\phi} \): \( \varvec{\phi}\left( \varvec{x} \right) = \left( \varvec{\psi}_{1}(\varvec{x}_{1}), \ldots, \varvec{\psi}_{n}(\varvec{x}_{n}) \right) \) is the data transformed into the kernel space
- \( k\left( \varvec{x}, \varvec{y} \right) \): Kernel function of variables \( \varvec{x} \) and \( \varvec{y} \)
- \( \varvec{K}' \): Kernel matrix \( \varvec{K}' = \left[ k'\left( \varvec{x}_{i}, \varvec{x}_{j} \right) \right]_{n \times n} \) for nonlinear embedding
- \( \varvec{e}_{i} \): The \( i \)th column of the \( n \times n \) identity matrix
- \( \text{tr}(\varvec{A}) \): The trace of the matrix \( \varvec{A} \), that is, the sum of its diagonal elements
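The kernel-matrix notation above can be illustrated with a small sketch. The Gaussian (RBF) kernel used here is just one admissible choice of \( k \), not the kernel the paper learns, and the bandwidth `gamma` is an assumed parameter:

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=0.5):
    """Build K'[i, j] = exp(-gamma * ||x_i - x_j||^2) from the columns
    of a d x n data matrix X (gamma is an illustrative bandwidth)."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))        # clip tiny negative round-off

X = np.random.randn(5, 10)       # d = 5 features, n = 10 points, as columns
K = rbf_kernel_matrix(X)
```

For this kernel every diagonal entry is 1, so \( \text{tr}(\varvec{K}') = n \), and the resulting \( n \times n \) matrix is symmetric positive semi-definite.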
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61403394) and the Fundamental Research Funds for the Central Universities (No. 2014QNA46).
Cite this article
Liu, M., Liu, B., Zhang, C. et al. Semi-supervised low rank kernel learning algorithm via extreme learning machine. Int. J. Mach. Learn. & Cyber. 8, 1039–1052 (2017). https://doi.org/10.1007/s13042-016-0592-1