DOI: 10.1145/1281192.1281283
Article

Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming

Published: 12 August 2007

Abstract

The kernel function plays a central role in kernel methods. In this paper, we consider the automated learning of the kernel matrix over a convex combination of pre-specified kernel matrices in Regularized Kernel Discriminant Analysis (RKDA), which performs linear discriminant analysis in the feature space via the kernel trick. Previous studies have shown that this kernel learning problem can be formulated as a semidefinite program (SDP), which is, however, computationally expensive, even with the recent advances in interior point methods. Based on the equivalence relationship between RKDA and least squares problems in the binary-class case, we propose a Quadratically Constrained Quadratic Programming (QCQP) formulation for the kernel learning problem, which can be solved more efficiently than SDP. While most existing work on kernel learning deals with binary-class problems only, we show that our QCQP formulation can be extended naturally to the multi-class case. Experimental results on both binary-class and multi-class benchmark data sets show the efficacy of the proposed QCQP formulations.
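
To make the setting in the abstract concrete, the following is a minimal sketch of its two ingredients: a convex combination of pre-specified kernel matrices, and a regularized least squares classifier in the feature space for the binary-class case. It is not the QCQP formulation proposed in the paper. The combination weights `theta` are fixed by hand rather than learned, and the bias-augmented linear system is a standard LS-SVM-style stand-in for the least squares view of RKDA. All function names, the RBF kernel choice, the `gamma` values, and the regularization value `lam` are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * (X @ Y.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def combine_kernels(kernels, theta):
    """Return sum_i theta_i * K_i, with theta constrained to the simplex."""
    theta = np.asarray(theta, dtype=float)
    if len(theta) != len(kernels):
        raise ValueError("need exactly one weight per kernel matrix")
    if np.any(theta < 0) or not np.isclose(theta.sum(), 1.0):
        raise ValueError("theta must be non-negative and sum to 1")
    combined = np.zeros_like(kernels[0], dtype=float)
    for weight, K in zip(theta, kernels):
        combined += weight * K
    return combined


def fit_regularized_least_squares(K, y, lam):
    """Solve [[K + lam*I, 1], [1^T, 0]] [alpha; b] = [y; 0] for (alpha, b).

    Assumption: this LS-SVM-style system illustrates the least squares
    view only; it is not taken from the paper.
    """
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K + lam * np.eye(n)
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    rhs = np.append(np.asarray(y, dtype=float), 0.0)
    solution = np.linalg.solve(A, rhs)
    return solution[:n], solution[n]


def decision_values(K_eval, alpha, b):
    """Decision values f(x) = sum_j alpha_j k(x, x_j) + b."""
    return K_eval @ alpha + b


# Toy binary-class data: two Gaussian blobs with labels -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(20, 2)),
    rng.normal(2.0, 1.0, size=(20, 2)),
])
y = np.concatenate([-np.ones(20), np.ones(20)])

# Two pre-specified kernels, combined with hand-picked (not learned) weights.
kernels = [rbf_kernel(X, X, gamma) for gamma in (0.1, 1.0)]
K = combine_kernels(kernels, [0.5, 0.5])

alpha, b = fit_regularized_least_squares(K, y, lam=0.1)
accuracy = np.mean(np.sign(decision_values(K, alpha, b)) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Because each base kernel matrix is positive semidefinite and the weights are non-negative, the combined matrix is positive semidefinite as well, which is why optimizing over the weights stays a convex problem. A kernel learning method such as the one the abstract describes would choose `theta` by optimization instead of fixing it.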

    Published In

    KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2007
    1080 pages
    ISBN:9781595936097
    DOI:10.1145/1281192
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. convex optimization
    2. kernel discriminant analysis
    3. kernel learning
    4. model selection
    5. quadratically constrained quadratic programming

    Acceptance Rates

    KDD '07 Paper Acceptance Rate 111 of 573 submissions, 19%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
