
A Novel Manifold Regularized Online Semi-supervised Learning Model


Abstract

In the process of human learning, training samples often arrive successively. Many human learning tasks therefore exhibit online and semi-supervised characteristics: the observations arrive in sequence, and the corresponding labels are presented only sporadically. In this paper, we propose a novel manifold regularized model in a reproducing kernel Hilbert space (RKHS) to solve online semi-supervised learning (OS2L) problems. The proposed algorithm, named Model-Based Online Manifold Regularization (MOMR), is derived by solving a constrained optimization problem. Unlike the stochastic gradient algorithm used to solve the online version of the primal problem of the Laplacian support vector machine (LapSVM), the proposed algorithm obtains an exact solution iteratively by solving its Lagrange dual problem. To improve computational efficiency, a fast algorithm is presented that introduces an approximate technique to compute the derivative of the manifold term in the proposed model. Furthermore, several buffering strategies are introduced to improve the scalability of the proposed algorithms, and theoretical results show their reliability. Finally, the proposed algorithms are experimentally shown to perform comparably to the standard batch manifold regularization algorithm.



Acknowledgment

This work is partly supported by NSFC grants 61375005, U1613213, 61210009, 61627808, 61603389, 61602483, MOST grants 2015BAK35B00, 2015BAK35B01, Guangdong Science and Technology Department grant 2016B090910001, and BNSF grant 4174107.

Author information


Corresponding author

Correspondence to Zhiyong Liu.

Ethics declarations

Conflict of Interest

We declare that we have no conflict of interest.

Human and Animal Rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Appendix

In this Appendix, we give the derivation of Eq. 13.

For simplicity, we define D and W as

$$D_{ij} = \begin{cases} w_{i,t+1} & \text{if } 0 < i = j < t+1 \\ \sum_{i=1}^{t} w_{i,t+1} & \text{if } i = j = t+1 \\ 0 & \text{otherwise} \end{cases}$$
(23)
$$W_{ij} = \begin{cases} w_{ij} & \text{if } 0 < i < t+1,\ j = t+1 \\ w_{ij} & \text{if } i = t+1,\ 0 < j < t+1 \\ 0 & \text{otherwise} \end{cases}$$
(24)
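
For concreteness, here is a minimal NumPy sketch of how \(D\), \(W\), and the Laplacian \(L = D - W\) used below can be assembled from Eqs. 23 and 24. The Gaussian edge weight and the bandwidth `sigma` are illustrative assumptions, since this excerpt does not fix the choice of the similarity \(w_{i,t+1}\).

```python
import numpy as np

def rbf_weight(xi, xj, sigma=1.0):
    # Illustrative Gaussian edge weight; the paper's choice of w_{i,t+1}
    # is not shown in this excerpt, so this is an assumption.
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

def build_laplacian(X, x_new, sigma=1.0):
    # Assemble D (Eq. 23) and W (Eq. 24) for the star graph joining the
    # new point x_new to the t buffered points in X (shape (t, d)),
    # and return L = D - W.
    t = X.shape[0]
    w = np.array([rbf_weight(X[i], x_new, sigma) for i in range(t)])  # w_{i,t+1}
    D = np.diag(np.append(w, w.sum()))  # degrees; D_{t+1,t+1} = sum_i w_{i,t+1}
    W = np.zeros((t + 1, t + 1))
    W[:t, t] = w   # edges (i, t+1), i = 1..t
    W[t, :t] = w   # symmetric entries (t+1, j)
    return D - W
```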

Substituting (10), (23), and (24) into (12) and letting \(L = D - W\), we have

$$\begin{aligned} L(\alpha,\xi_{t+1},\gamma_{t+1},\beta_{t+1}) =\; & \frac{1}{2}\alpha^{T}(K+\lambda_{1}K+\lambda_{2}KLK)\alpha \\ & - \gamma_{t+1}\left(y_{t+1}\alpha^{T}J - 1 + \xi_{t+1}\right) \\ & - \alpha^{T}K\tilde{\alpha}^{t} - \beta_{t+1}\xi_{t+1} + C\xi_{t+1} + c_{0}, \end{aligned}$$
(25)

where \(\alpha = [\alpha_{1},\ldots,\alpha_{t+1}]^{T}\), \(\tilde{\alpha}^{t} = [\alpha^{t}_{1},\ldots,\alpha^{t}_{t},0]^{T}\), \(K\) is the \((t+1)\times(t+1)\) Gram matrix with \(K_{ij} = K(x_{i},x_{j})\), \(J = Ke\), \(e = [0,\ldots,0,1]^{T}\) is a \((t+1)\)-dimensional vector, and \(c_{0}\) is a constant.
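
Continuing the sketch above, the Gram matrix \(K\) and the vector \(J = Ke\) can be formed as follows; `kernel` is a placeholder for whatever positive-definite kernel the model uses (an assumption here).

```python
def gram_matrix(X_all, kernel):
    # (t+1) x (t+1) Gram matrix with K_ij = K(x_i, x_j).
    n = X_all.shape[0]
    return np.array([[kernel(X_all[i], X_all[j]) for j in range(n)]
                     for i in range(n)])

def j_vector(K):
    # e = [0, ..., 0, 1]^T selects the newest point, so J = K e is
    # simply the last column of K.
    return K[:, -1].copy()
```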

Note that \(L(\alpha,\xi_{t+1},\gamma_{t+1},\beta_{t+1})\) attains its minimum with respect to \(\alpha\) and \(\xi_{t+1}\) if and only if the following conditions are satisfied:

$$\nabla_{\alpha}L(\alpha,\xi_{t+1},\gamma_{t+1},\beta_{t+1}) = 0,$$
(26)
$$\nabla_{\xi_{t+1}}L(\alpha,\xi_{t+1},\gamma_{t+1},\beta_{t+1}) = 0.$$
(27)

Therefore, we have

$$\frac{\partial L}{\partial \xi_{t+1}} = -\gamma_{t+1}-\beta_{t+1}+C = 0 \quad\Longrightarrow\quad 0\leq\gamma_{t+1}\leq C.$$
(28)

Since the multipliers satisfy \(\gamma_{t+1}\geq 0\) and \(\beta_{t+1}\geq 0\), this identity yields the box constraint \(0\leq\gamma_{t+1}\leq C\). Using it to eliminate \(\xi_{t+1}\) and \(\beta_{t+1}\), we formulate a reduced Lagrangian:

$$\begin{aligned} L^{R}(\alpha,\gamma_{t+1}) =\; & \frac{1}{2}\alpha^{T}(K+\lambda_{1}K+\lambda_{2}KLK)\alpha \\ & - \gamma_{t+1}(y_{t+1}\alpha^{T}J - 1) - \alpha^{T}K\tilde{\alpha}^{t} + c_{0}. \end{aligned}$$
(29)

Taking the derivative of Eq. 29 with respect to \(\alpha\), we have:

$$\frac{\partial L^{R}}{\partial \alpha} = (K+\lambda_{1}K+\lambda_{2}KLK)\alpha - K\tilde{\alpha}^{t} - Jy_{t+1}\gamma_{t+1}.$$
(30)

Setting \(\partial L^{R}/\partial \alpha = 0\), we have:

$$\alpha = (K+\lambda_{1}K+\lambda_{2}KLK)^{-1}\left(K\tilde{\alpha}^{t}+Jy_{t+1}\gamma_{t+1}\right).$$
(31)
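
A small sketch of Eq. 31 under the same assumptions: in practice one would solve the linear system \(A\alpha = K\tilde{\alpha}^{t} + Jy_{t+1}\gamma_{t+1}\) rather than form the inverse explicitly (assuming \(A\) is nonsingular, e.g., after adding a small ridge).

```python
def solve_alpha(K, L, alpha_tilde, J, y_new, gamma, lam1, lam2):
    # Eq. 31 with A = K + lam1*K + lam2*(K L K); alpha_tilde is the
    # padded previous solution [alpha^t_1, ..., alpha^t_t, 0]^T.
    A = K + lam1 * K + lam2 * (K @ L @ K)
    rhs = K @ alpha_tilde + y_new * gamma * J
    return np.linalg.solve(A, rhs)
```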

Substituting (31) back into the reduced Lagrangian (29), we get:

$$\begin{aligned} \max_{\gamma_{t+1}} \quad & -\frac{1}{2}\left(K\tilde{\alpha}^{t}+Jy_{t+1}\gamma_{t+1}\right)^{T}A^{-1}\left(K\tilde{\alpha}^{t}+Jy_{t+1}\gamma_{t+1}\right) + \gamma_{t+1} \\ \text{s.t.} \quad & 0\leq\gamma_{t+1}\leq C, \end{aligned}$$
(32)

where \(A = K+\lambda_{1}K+\lambda_{2}KLK\).

Let \(\overline{\gamma}_{t+1}\) be the stationary point of the objective function of Eq. 32. Then

$$\overline{\gamma}_{t+1} = \frac{1-y_{t+1}J^{T}A^{-1}K\tilde{\alpha}^{t}}{J^{T}A^{-1}J}.$$
(33)

Let \(\gamma_{t+1}^{*}\) denote the optimal solution of Eq. 32. Since the objective function of (32) is quadratic, its maximum over the interval \([0, C]\) is attained at \(0\), at \(C\), or at \(\overline{\gamma}_{t+1}\). Hence

$$\gamma_{t+1}^{*} = \begin{cases} 0 & \text{if } \overline{\gamma}_{t+1}\leq 0 \\ C & \text{if } \overline{\gamma}_{t+1}\geq C \\ \overline{\gamma}_{t+1} & \text{otherwise.} \end{cases}$$
(34)
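
Eqs. 33 and 34 amount to computing the unconstrained stationary point and clipping it to the box \([0, C]\); a sketch under the assumptions above:

```python
def optimal_gamma(A, K, alpha_tilde, J, y_new, C):
    # Stationary point of the dual objective (Eq. 33); A is symmetric,
    # so J^T A^{-1} v can be computed with linear solves.
    Ainv_J = np.linalg.solve(A, J)
    Ainv_Ka = np.linalg.solve(A, K @ alpha_tilde)
    gamma_bar = (1.0 - y_new * (J @ Ainv_Ka)) / (J @ Ainv_J)
    # Clip to [0, C] (Eq. 34).
    return float(np.clip(gamma_bar, 0.0, C))
```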

Furthermore, if \(\delta_{t+1} = 0\), the solution of the proposed model can be obtained by a similar process. Thus, the classifier obtained at time \(t+1\) is:

$$f_{t+1}(x) = \sum_{i=1}^{t+1}\alpha_{i}^{t+1}K(x_{i},x), \qquad h_{t+1} = \operatorname{sign}\left(f_{t+1}(x)\right),$$
(35)

where

$$\alpha^{t+1} = A^{-1}\left(K\tilde{\alpha}^{t}+\delta_{t+1}y_{t+1}\gamma_{t+1}^{*}J\right).$$
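
Putting the pieces together, one online update step might look as follows. This is an illustrative assembly of Eqs. 23–24, 31, and 33–35 under the assumptions stated above, not the authors' reference implementation; \(\delta_{t+1} = 1\) when the label \(y_{t+1}\) is observed and \(0\) otherwise.

```python
def online_update(X, x_new, alpha_prev, kernel, lam1, lam2, C,
                  y_new=0.0, labeled=False, sigma=1.0):
    # One illustrative MOMR-style update at time t+1; alpha_prev holds
    # the t coefficients from the previous round.
    X_all = np.vstack([X, x_new])
    alpha_tilde = np.append(alpha_prev, 0.0)      # tilde{alpha}^t, length t+1
    L = build_laplacian(X, x_new, sigma)          # Eqs. 23-24
    K = gram_matrix(X_all, kernel)
    J = j_vector(K)                               # J = K e
    A = K + lam1 * K + lam2 * (K @ L @ K)
    delta = 1.0 if labeled else 0.0               # delta_{t+1}
    gamma = optimal_gamma(A, K, alpha_tilde, J, y_new, C) if labeled else 0.0
    alpha = np.linalg.solve(A, K @ alpha_tilde + delta * y_new * gamma * J)
    # Eq. 35: f_{t+1}(x) = sum_i alpha_i K(x_i, x), h_{t+1} = sign(f_{t+1}).
    def predict(x):
        return np.sign(sum(a * kernel(xi, x) for a, xi in zip(alpha, X_all)))
    return alpha, predict
```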


About this article


Cite this article

Ding, S., Xi, X., Liu, Z. et al. A Novel Manifold Regularized Online Semi-supervised Learning Model. Cogn Comput 10, 49–61 (2018). https://doi.org/10.1007/s12559-017-9489-x
