
Accuracy of suboptimal solutions to kernel principal component analysis

Computational Optimization and Applications

Abstract

For Principal Component Analysis in Reproducing Kernel Hilbert Spaces (KPCA), optimization over sets containing only linear combinations of all n-tuples of kernel functions is investigated, where n is a positive integer smaller than the number of data points. Upper bounds are derived on the accuracy with which such restricted expansions approximate the optimal solution, which is achievable without any restriction on the number of kernel functions. The rates at which the upper bounds decrease as the number n of kernel functions grows are given by the sum of two terms, one proportional to n^{-1/2} and the other to n^{-1}, and depend on the maximum eigenvalue of the Gram matrix of the kernel with respect to the data. Both primal and dual formulations of KPCA are considered. The estimates provide insight into the effectiveness of sparse KPCA techniques, which aim to reduce the computational cost of expansions in terms of kernel units.
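The quantity the bounds control is the gap between the optimal KPCA value, attained by expansions over all data points, and the best value achievable by an expansion over only n kernel functions. The following minimal numerical sketch (not the authors' construction) illustrates that gap for the first principal direction: the optimum equals the maximum eigenvalue of the Gram matrix, while the restricted problem over n kernel units reduces to a generalized eigenproblem of size n. The Gaussian kernel, the random choice of the n kernel functions, and all sizes are illustrative assumptions.

```python
# Sketch: full KPCA on m data points vs. a suboptimal solution restricted to
# linear combinations of n < m kernel functions (a random subset of the data).
import numpy as np

def gaussian_gram(X, Y, width=1.0):
    """Gram matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 * width^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * width ** 2))

rng = np.random.default_rng(0)
m, n = 200, 20                          # data points, kernel units (n < m)
X = rng.standard_normal((m, 3))

K = gaussian_gram(X, X)                 # full m x m Gram matrix
lam_max = np.linalg.eigvalsh(K)[-1]     # maximum eigenvalue of the Gram matrix
# For f = sum_i a_i k(x_i, .), the first-component KPCA criterion
# sum_i <f, k(x_i, .)>^2 / ||f||^2 equals a^T K^2 a / a^T K a, whose maximum
# over all m-term expansions is exactly lam_max.
opt = lam_max

# Restricted problem: expansions f = sum_j c_j k(x_{i_j}, .) over n randomly
# chosen kernel functions. The criterion becomes c^T A c / c^T K_nn c with
# A = K_mn^T K_mn; reduce it to a symmetric eigenproblem via K_nn^{-1/2}.
idx = rng.choice(m, size=n, replace=False)
K_mn = K[:, idx]                        # m x n cross-Gram block
K_nn = K[np.ix_(idx, idx)]              # n x n sub-Gram block
A = K_mn.T @ K_mn
w, V = np.linalg.eigh(K_nn)
R = V @ np.diag(np.clip(w, 1e-12, None) ** -0.5) @ V.T   # ~ K_nn^{-1/2}
subopt = np.linalg.eigvalsh(R @ A @ R)[-1]

print(f"lambda_max = {opt:.4f}, n-term value = {subopt:.4f}, "
      f"gap = {opt - subopt:.3e}")
```

Rerunning the sketch with larger n should shrink the gap, consistent with rates of the form n^{-1/2} + n^{-1}; the random subset used here is only one of the selection schemes considered in sparse KPCA.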



Author information

Correspondence to Marcello Sanguineti.


About this article

Cite this article

Gnecco, G., Sanguineti, M. Accuracy of suboptimal solutions to kernel principal component analysis. Comput Optim Appl 42, 265–287 (2009). https://doi.org/10.1007/s10589-007-9108-y

