
Online regularized learning with pairwise loss functions


Abstract

Recently, there has been considerable work on analyzing learning algorithms with pairwise loss functions in the batch setting. There is relatively little theoretical work on their online counterparts, despite their popularity in practice due to their scalability to big data. In this paper, we consider online learning algorithms with pairwise loss functions based on regularization schemes in reproducing kernel Hilbert spaces. In particular, we establish the convergence of the last iterate of the online algorithm under a very weak assumption on the step sizes and derive satisfactory convergence rates for polynomially decaying step sizes. Our technique uses Rademacher complexities to handle function classes associated with pairwise loss functions. Since pairwise learning involves pairs of examples, which are no longer i.i.d., standard techniques do not directly apply to such pairwise learning algorithms. Hence, our results are a non-trivial extension of those for univariate loss functions to the pairwise setting.
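
To make the setting concrete, below is a minimal sketch of an online regularized pairwise learning update in a reproducing kernel Hilbert space. It is illustrative only and not the algorithm analyzed in the paper: the Gaussian kernel, the pairwise hinge loss, the choice to pair each new example with all previous ones, and the step-size schedule eta_t = eta_1 * t^(-theta) are assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(x, xp, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2)) -- an assumed kernel choice."""
    diff = x - xp
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def online_pairwise_rkhs(X, y, lam=0.01, eta1=0.5, theta=0.5, sigma=1.0):
    """Hypothetical online regularized pairwise learning sketch.

    Returns coefficients alpha so that f(x) = sum_i alpha[i] * K(X[i], x).
    """
    T = len(X)
    alpha = np.zeros(T)                                   # f_1 = 0
    for t in range(1, T):                                  # step t sees the new example (X[t], y[t])
        eta_t = eta1 * t ** (-theta)                       # polynomially decaying step size
        f = lambda x: sum(alpha[i] * gaussian_kernel(X[i], x, sigma) for i in range(t))
        ft_new = f(X[t])                                   # f_t evaluated at the new point
        grad = np.zeros(T)                                 # expansion coefficients of the loss (sub)gradient
        for j in range(t):                                 # pair the new example with past ones
            margin = (y[t] - y[j]) * (ft_new - f(X[j]))
            if margin < 1.0:                               # pairwise hinge loss is active
                grad[t] -= (y[t] - y[j]) / t
                grad[j] += (y[t] - y[j]) / t
        # f_{t+1} = f_t - eta_t * (gradient of averaged pairwise loss + lam * f_t)
        alpha = alpha - eta_t * (grad + lam * alpha)
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    y = np.sign(X[:, 0])                                   # toy labels for illustration
    alpha = online_pairwise_rkhs(X, y)
    print("number of nonzero coefficients:", np.count_nonzero(alpha))
```

The sketch relies on a representer-type expansion f_t(x) = sum_i alpha[i] K(x_i, x), so each stochastic gradient step on the regularized pairwise objective only updates the coefficient vector alpha; the last iterate alpha after T steps corresponds to the last-iterate output whose convergence is the subject of the paper.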



Author information

Corresponding author: Yiming Ying.

Additional information

Communicated by: Karsten Urban


About this article


Cite this article

Guo, ZC., Ying, Y. & Zhou, DX. Online regularized learning with pairwise loss functions. Adv Comput Math 43, 127–150 (2017). https://doi.org/10.1007/s10444-016-9479-7

