Reducing Kernel Matrix Diagonal Dominance Using Semi-definite Programming

Kandola, Jaz; Graepel, Thore; Shawe-Taylor, John

doi:10.1007/978-3-540-45167-9_22

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

Kernel-based learning methods revolve around the notion of a kernel or Gram matrix between data points. These square, symmetric, positive semi-definite matrices can informally be regarded as encoding pairwise similarity between all of the objects in a data set. In this paper we propose an algorithm for manipulating the diagonal entries of a kernel matrix using semi-definite programming. Kernel matrix diagonal dominance reduction attempts to deal with the problem of learning with almost orthogonal features, a phenomenon commonplace in kernel matrices derived from string kernels or from Gaussian kernels with a small width parameter. We show how this task can be formulated as a semi-definite programming optimization problem that can be solved with readily available optimizers. On the theoretical side, we use Rademacher-based bounds to give an alternative motivation for the 1-norm SVM, one that arises from kernel diagonal reduction. We assess the performance of the algorithm on standard data sets, with encouraging results in terms of approximation and prediction.
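
To make the idea of diagonal dominance concrete, the following is a minimal sketch in plain Python; it is not the algorithm proposed in the paper. It builds a Gaussian Gram matrix with a small width, so that the diagonal dwarfs the off-diagonal entries, and then subtracts the same value t from every diagonal entry, choosing t by bisection as the largest shift that keeps the matrix positive definite. A uniform shift of this kind is only one simple way of lowering the diagonal while preserving positive semi-definiteness; a semi-definite program could treat each diagonal entry separately. All function names, the toy data, the width 0.3 and the dominance measure are assumptions made for illustration.

```python
import math

def gaussian_gram(points, width):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-sq / (2.0 * width ** 2))
    return K

def is_positive_definite(M, tol=1e-12):
    """Try a Cholesky factorisation; it succeeds only if M is positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def shift_diagonal(K, t):
    """Return K - t*I, leaving every off-diagonal entry unchanged."""
    n = len(K)
    return [[K[i][j] - (t if i == j else 0.0) for j in range(n)] for i in range(n)]

def largest_uniform_shift(K, iters=60):
    """Bisection for the largest t such that K - t*I stays positive definite."""
    lo, hi = 0.0, min(K[i][i] for i in range(len(K)))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_positive_definite(shift_diagonal(K, mid)):
            lo = mid
        else:
            hi = mid
    return lo

def dominance_ratio(K):
    """Illustrative measure: mean diagonal entry over mean absolute off-diagonal entry."""
    n = len(K)
    diag = sum(K[i][i] for i in range(n)) / n
    off = sum(abs(K[i][j]) for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag / off

# Toy data: four points on a line, with a width small relative to their spacing.
points = [(0.0,), (1.0,), (2.0,), (3.0,)]
K = gaussian_gram(points, width=0.3)
t = largest_uniform_shift(K)
reduced = shift_diagonal(K, t)

print("uniform diagonal shift t:", t)
print("dominance ratio before:", dominance_ratio(K))
print("dominance ratio after:", dominance_ratio(reduced))
```

The off-diagonal similarities are untouched by the shift, so the reduced matrix carries the same pairwise information with a far smaller diagonal.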

Author information

Authors and Affiliations

Dept. Computer Science, Royal Holloway, University of London, UK
Jaz Kandola & John Shawe-Taylor
Microsoft Research, Cambridge, UK
Thore Graepel

Editor information

Editors and Affiliations

MPI for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany
Bernhard Schölkopf
University of California, Santa Cruz
Manfred K. Warmuth

About this paper

Cite this paper

Kandola, J., Graepel, T., Shawe-Taylor, J. (2003). Reducing Kernel Matrix Diagonal Dominance Using Semi-definite Programming. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_22

DOI: https://doi.org/10.1007/978-3-540-45167-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9
eBook Packages: Springer Book Archive
