A Primal Framework for Indefinite Kernel Learning

Neural Processing Letters

Abstract

Kernel methods have been widely applied in machine learning to solve complex nonlinear problems. Kernel selection is one of the key issues in kernel methods, since it is vital for improving generalization performance. Traditionally, kernels are restricted to be positive definite, which limits their applicability. In many real applications, such as gene identification and object recognition, indefinite kernels frequently emerge and can achieve better performance. However, compared with positive definite kernels, indefinite kernels are more complicated because the resulting optimization problems are non-convex, which makes most existing kernel algorithms inapplicable. Some indefinite kernel methods have been proposed based on the dual of the support vector machine (SVM); they mostly focus on transforming the non-convex optimization into a convex one by using positive definite kernels to approximate indefinite ones. In fact, a duality gap usually exists in SVM with indefinite kernels, and therefore these algorithms do not actually solve the indefinite kernel problems themselves. In this paper, we present a novel framework for indefinite kernel learning derived directly from the primal of SVM, which establishes several new models not only for a single indefinite kernel but also for multiple indefinite kernel scenarios. Several algorithms are developed to handle the non-convex optimization problems in these models. We further provide a constructive approach for kernel selection in the algorithms by using the theory of similarity functions. Experiments on real-world datasets demonstrate the superiority of our models.
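
To make the non-convexity concrete, the following is a minimal sketch and is not taken from the paper. It builds the Gram matrix of a tanh (sigmoid) kernel on a toy dataset, shows that it has a negative eigenvalue, and then applies spectrum clipping, one common way of approximating an indefinite kernel matrix by a positive semidefinite one. The kernel choice, its parameters, the toy data, and the function names `sigmoid_kernel` and `clip_spectrum` are all illustrative assumptions.

```python
import numpy as np


def sigmoid_kernel(X, Y, scale=1.0, offset=-1.0):
    """Hypothetical example kernel tanh(scale * <x, y> + offset).

    With a negative offset, k(0, 0) = tanh(offset) < 0, so the Gram
    matrix cannot be positive semidefinite when the origin is a sample.
    """
    return np.tanh(scale * X @ Y.T + offset)


def clip_spectrum(K):
    """Zero out the negative eigenvalues of a symmetric matrix K.

    This gives the nearest positive semidefinite matrix in Frobenius norm.
    """
    eigvals, eigvecs = np.linalg.eigh(K)
    return (eigvecs * np.maximum(eigvals, 0.0)) @ eigvecs.T


# Toy data (assumption): four points in the plane, including the origin.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.5, 1.5]])

K = sigmoid_kernel(X, X)
eigvals = np.linalg.eigvalsh(K)

# The quadratic form beta^T K beta, which acts as the regularizer in a
# kernelized primal objective, can take negative values, so it is not convex.
beta = np.array([1.0, 0.0, 0.0, 0.0])
quad = beta @ K @ beta

# The clipped matrix is positive semidefinite but is a different matrix.
K_psd = clip_spectrum(K)

print("smallest eigenvalue of K:", eigvals.min())
print("beta^T K beta:", quad)
print("smallest eigenvalue after clipping:", np.linalg.eigvalsh(K_psd).min())
print("K unchanged by clipping:", np.allclose(K, K_psd))
```

Because the clipped matrix differs from the original Gram matrix, a model trained on it solves a convex surrogate rather than the original indefinite problem, which is the limitation of approximation-based methods that the abstract points out.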


Notes

  1. Here “stabilize” means finding a stationary point in a reproducing kernel Kreĭn space (RKKS).

Author information

Corresponding author

Correspondence to Hui Xue.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61375057, 61300165 and 61403193), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20131298) and the National Key Research and Development Program of China (Grant Nos. 2016YFC1306700 and 2016YFC1306704). It was also supported by the Collaborative Innovation Center of Wireless Communications Technology.

About this article

Cite this article

Xue, H., Wang, L., Chen, S. et al. A Primal Framework for Indefinite Kernel Learning. Neural Process Lett 50, 165–188 (2019). https://doi.org/10.1007/s11063-019-10019-7

  • DOI: https://doi.org/10.1007/s11063-019-10019-7
