Regularized learning in Banach spaces as an optimization problem: representer theorems

Journal of Global Optimization

Abstract

We view regularized learning of a function in a Banach space from its finite samples as an optimization problem. Within the framework of reproducing kernel Banach spaces, we prove the representer theorem for the minimizer of regularized learning schemes with a general loss function and a nondecreasing regularizer. When the loss function and the regularizer are differentiable, a characterization equation for the minimizer is also established.
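As a rough illustration of what a representer theorem provides, the sketch below works through the classical Hilbert-space special case (regularized least squares with a Gaussian kernel), not the Banach-space setting of the article. In that special case the minimizer of (1/n) Σ (f(x_i) − y_i)² + λ‖f‖² is a finite kernel expansion f = Σ c_j K(·, x_j), so the infinite-dimensional problem reduces to a linear system for the coefficients c. The function names, the kernel width, the sample data, and the value of λ are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(a, b, width=0.1):
    """Kernel matrix K[i, j] = exp(-(a_i - b_j)^2 / (2 * width^2)) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * width**2))

def fit_coefficients(x, y, lam):
    """Solve (K + n * lam * I) c = y, the stationarity condition of the objective below."""
    n = len(x)
    K = gaussian_kernel(x, x)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def objective(c, x, y, lam):
    """(1/n) * ||K c - y||^2 + lam * c^T K c for f = sum_j c_j K(., x_j)."""
    K = gaussian_kernel(x, x)
    residual = K @ c - y
    return residual @ residual / len(x) + lam * (c @ K @ c)

def predict(c, x_train, x_new):
    """Evaluate the kernel expansion f(t) = sum_j c_j K(t, x_j) at new points."""
    return gaussian_kernel(x_new, x_train) @ c

# Hypothetical sample data: noisy observations of a sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(8)
lam = 0.1

c = fit_coefficients(x, y, lam)
f_mid = predict(c, x, np.array([0.5]))
```

The point of the example is only that the learned function is determined by as many coefficients as there are samples; a non-quadratic loss or a Banach-space norm would generally require a nonlinear solver in place of `np.linalg.solve`.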

Author information

Corresponding author

Correspondence to Jun Zhang.

Additional information

This work was supported in part by the National Science Foundation under grant 0631541 (PI: Jun Zhang).

About this article

Cite this article

Zhang, H., Zhang, J. Regularized learning in Banach spaces as an optimization problem: representer theorems. J Glob Optim 54, 235–250 (2012). https://doi.org/10.1007/s10898-010-9575-z

  • DOI: https://doi.org/10.1007/s10898-010-9575-z
