An unsupervised approach to learn the kernel functions: from global influence to local similarity

Abbasnejad, M. Ehsan; Ramachandram, Dhanesh; Mandava, Rajeswari

doi:10.1007/s00521-010-0411-7

An unsupervised approach to learn the kernel functions: from global influence to local similarity

Original Article
Published: 20 June 2010

Volume 20, pages 703–715, (2011)
Cite this article

Neural Computing and Applications Aims and scope Submit manuscript

M. Ehsan Abbasnejad¹,
Dhanesh Ramachandram¹ &
Rajeswari Mandava¹

232 Accesses
2 Citations
Explore all metrics

Abstract

Recently there has been a steep growth in the development of kernel-based learning algorithms. The intrinsic problem in such algorithms is the selection of the optimal kernel for the learning task of interest. In this paper, we propose an unsupervised approach to learn a linear combination of kernel functions, such that the resulting kernel best serves the objectives of the learning task. This is achieved through measuring the influence of each point on the structure of the dataset. This measure is calculated by constructing a weighted graph on which a random walk is performed. The measure of influence in the feature space is probabilistically related to the input space that yields an optimization problem to be solved. The optimization problem is formulated in two different convex settings, namely linear and semidefinite programming, dependent on the type of kernel combination considered. The contributions of this paper are twofold: first, a novel unsupervised approach to learn the kernel function, and second, a method to infer the local similarity represented by the kernel function by measuring the global influence of each point toward the structure of the dataset. The proposed approach focuses on the kernel selection which is independent of the kernel-based learning algorithm. The empirical evaluation of the proposed approach with various datasets shows the effectiveness of the algorithm in practice.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Manifold regularized multiple kernel learning with Hellinger distance

Article 16 March 2018

Kernel learning and optimization with Hilbert–Schmidt independence criterion

Article 11 April 2017

Kernel Learning with Hilbert-Schmidt Independence Criterion

Notes

Available at: http://www.people.csail.mit.edu/jrennie/20Newsgroups/.

References

Amari S, Wu S (1999) Improving support vector machine classifiers by modifying kernal functions. Neural Netw 12(6):783–789. doi:10.1016/S0893-6080(99)00032-5
Bach FR, Lanckriet GRG, Jordan MI (2004) Multiple kernel learning, conic duality, and the smo algorithm. In: ICML ’04: proceedings of the twenty-first international conference on Machine learning. ACM, New York, NY, USA, p 6. doi:10.1145/1015330.1015424
Ben-Hur A, Horn D, Siegelmann HT, Vapnik V (2002) Support vector clustering. J Mach Learn Res 2:125–137
Article MATH Google Scholar
Burges CJ (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2:121–167
Article Google Scholar
Burges CJC (1999) Geometry and invariance in kernel based methods. pp 89–116
Chang CC, Lin CJ (2001) LIBSVM: a library for support vector machines. Software available at: http://www.csie.ntu.edu.tw/cjlin/libsvm
Coifman RR, Lafon S (2006) Diffusion maps. Appl Comput Harmon Anal 21(1):5–30. doi:10.1016/j.acha.2006.04.006. http://www.dx.doi.org/10.1016/j.acha.2006.04.006
Cristianini N, Shawe-taylor J, Elissee A, Kandola J (2002) On kernel-target alignment. In: Advances in neural information processing systems 14. MIT Press, pp 367–373
Duan K, Keerthi SS, Poo AN (2003) Evaluation of simple performance measures for tuning svm hyperparameters. Neurocomputing 51:41–59. doi:10.1016/S0925-2312(02)00601-X. http://www.sciencedirect.com/science/article/B6V10-4625PVR-1/2/cc81e4581eca9413c3784c75c1ba0ee1
Google Scholar
Graf A, Smola A, Borer S (2003) Classification in a normalized feature space using support vector machines. Neural Netw, IEEE Trans 14(3):597–605. doi:10.1109/TNN.2003.811708
Article Google Scholar
Grant M, Boyd S (2008) Graph implementations for nonsmooth convex programs. http://www.dx.doi.org/10.1007/978-1-84800-155-8_7
Grant M, Boyd S (2009) Cvx: Matlab software for disciplined convex programming web page and software. http://www.stanford.edu/boyd/cvx
Guo Y, Gao J, Kwan P (2008) Twin kernel embedding. Pattern Anal Mach Intell, IEEE Trans 30(8):1490–1495. doi:10.1109/TPAMI.2008.74
Article Google Scholar
Herbster M, Pontil M, Wainer L (2005) Online learning over graphs. In: ICML ’05: proceedings of the 22nd international conference on Machine learning. ACM, New York, NY, USA, pp 305–312. http://www.doi.acm.org/10.1145/1102351.1102390
Kandola J, Shawe-Taylor J, Cristianini N (2002) On the extensions of kernel alignment. NC-TR-2002-120. http://www.eprints.ecs.soton.ac.uk/9745
Kandola J, Shawe-Taylor J, Cristianini N (2002) Optimizing kernel alignment over combinations of kernel. http://www.eprints.ecs.soton.ac.uk/9746/
Kim SJ, Zymnis A, Magnani A, Koh K, Boyd S (2008) Learning the kernel via convex optimization. pp 1997–2000. doi:10.1109/ICASSP.2008.4518030
Kondor RI, Lafferty J (2002) Diffusion kernels on graphs and other discrete structures. In: Proceedings of the ICML, pp 315–322
Kulis B, Sustik M, Dhillon I (2006) Learning low-rank kernel matrices. In: ICML ’06: Proceedings of the 23rd international conference on Machine learning. ACM, New York, NY, USA, pp 505–512. http://www.doi.acm.org/10.1145/1143844.1143908
Vandenberghe L, Vandenberghe SB, Boyd S (1996) Semidefinite programming. SIAM Rev 38(1):49–95. http://www.jstor.org/stable/2132974
Google Scholar
Lafon S, Lee A (2006) Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. Pattern Anal Mach Intell, IEEE Trans 28(9):1393–1403. doi:10.1109/TPAMI.2006.184
Article Google Scholar
Lanckriet GRG, Cristianini N, Bartlett P, Ghaoui LE, Jordan MI (2004) Learning the kernel matrix with semidefinite programming. J Mach Learn Res 5:27–72
Google Scholar
Marie Szafranski YG, Rakotomamonjy A (2008) Composite kernel learning. In: ICML
Meila M, Shi J (2001) A random walks view of spectral segmentation
Micchelli CA, Pontil M (2005) Learning the kernel function via regularization. J Mach Learn Res 6:1099–1125
MathSciNet Google Scholar
Mika S, Ratsch G, Weston J, Scholkopf B, Mullers K (1999) Fisher discriminant analysis with kernels, pp 41–48. doi:10.1109/NNSP.1999.788121
Muller KR, Mika S, Ratsch G, Tsuda K, Scholkopf B (2001) An introduction to kernel-based learning algorithms. Neural Netw, IEEE Trans 12(2):181–201. doi:10.1109/72.914517
Article Google Scholar
Nadler B, Lafon S, Coifman R, Kevrekidis I (2007) Diffusion maps-a probabilistic interpretation for spectral embedding and clustering algorithms. http://www.dx.doi.org/10.1007/978-3-540-73750-6_10
Rakotomamonjy A, Bach F, Canu S, Grandvalet Y (2007) More efficiency in multiple kernel learning. In: ICML ’07: Proceedings of the 24th international conference on Machine learning. ACM, New York, NY, USA, pp 775–782. http://www.doi.acm.org/10.1145/1273496.1273594
Scholkopf B, Smola A, Muller KR (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comp 10(5):1299–1319. http://neco.mitpress.org/cgi/content/abstract/10/5/1299
Google Scholar
Shaw B, Jebara T (2009) Structure preserving embedding. In: ICML ’09: Proceedings of the 26th annual international conference on machine learning. ACM, New York, NY, USA, pp 937–944.doi:10.1145/1553374.1553494
Smola A, Hofmann T, Scholkopf B (2007) Kernel methods in machine learning. Ann Stat 36:1171–1220. http://eprints.pascal-network.org/archive/00003984
Google Scholar
Society FRKCAM (1997) Spectral graph theory. Regional conference series in mathmatics. American Mathematical Society, Providence, RI
Terence Sim SB, Bsat M (2001) The cmu pose, illumination, and expression (pie) database of human faces. Technical Reports CMU-RI-TR-01-02, Robotics Institute, Pittsburgh, PA
Vapnik V (1999) An overview of statistical learning theory. Neural Netw, IEEE Trans 10(5):988–999. doi:10.1109/72.788640
Article Google Scholar
Weinberger KQ, Packer BD, Saul LK (2005) Nonlinear dimensionality reduction by semidefinite programming and kernel matrix factorization. In: Proceedings of the tenth international workshop on artificial intelligence and statistics, pp 381–388
Weinberger KQ, Saul LK (2006) An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In: Unfolding, Proceedings of the 21st national conference on artificial intelligence. AAAI
Weinberger KQ, Sha F, Saul LK (2004) Learning a kernel matrix for nonlinear dimensionality reduction. In: ICML ’04: Proceedings of the twenty-first international conference on Machine learning. ACM, New York, NY, USA, p 106. doi:10.1145/1015330.1015345
Xiong H, Swamy M, Ahmad M (2005) Optimizing the kernel in the empirical feature space. Neural Netw, IEEE Trans 16(2):460–474. doi:10.1109/TNN.2004.841784
Article Google Scholar
Yang MH, Ahuj N, Kriegman D (2000) Face recognition using kernel eigenfaces. In: Image Processing, 2000. Proceedings. 2000 International Conference on, vol 1, pp 37–40. doi:10.1109/ICIP.2000.900886
Zhang X, Lee WS (2006) Hyperparameter learning for graph based semi-supervised learning algorithms. In: NIPS. http://books.nips.cc/papers/files/nips19/NIPS2006_0148.pdf
Zhengdong Lu, Dhillon PJI (2009) Geometry-aware metric learning. In: ICML ’09: Proceedings of the 26nd international conference on Machine learning. ACM

Download references

Acknowledgments

This research has been made possible through the Science Fund Grant “Delineation and 3D Visualization of Tumor and Risk Structures” (DVTRS), No: 1001/PKOMP/817001 by the Ministry of Science, Technology and Innovation of Malaysia.

Author information

Authors and Affiliations

Computer Vision Research Group, School of Computer Sciences, Universiti Sains Malaysia, 11800, Penang, Malaysia
M. Ehsan Abbasnejad, Dhanesh Ramachandram & Rajeswari Mandava

Authors

M. Ehsan Abbasnejad
View author publications
You can also search for this author in PubMed Google Scholar
Dhanesh Ramachandram
View author publications
You can also search for this author in PubMed Google Scholar
Rajeswari Mandava
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dhanesh Ramachandram.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Abbasnejad, M.E., Ramachandram, D. & Mandava, R. An unsupervised approach to learn the kernel functions: from global influence to local similarity. Neural Comput & Applic 20, 703–715 (2011). https://doi.org/10.1007/s00521-010-0411-7

Download citation

Received: 01 December 2009
Accepted: 05 June 2010
Published: 20 June 2010
Issue Date: July 2011
DOI: https://doi.org/10.1007/s00521-010-0411-7

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An unsupervised approach to learn the kernel functions: from global influence to local similarity

Abstract

Access this article

Similar content being viewed by others

Manifold regularized multiple kernel learning with Hellinger distance

Kernel learning and optimization with Hilbert–Schmidt independence criterion

Kernel Learning with Hilbert-Schmidt Independence Criterion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

An unsupervised approach to learn the kernel functions: from global influence to local similarity

Abstract

Access this article

Similar content being viewed by others

Manifold regularized multiple kernel learning with Hellinger distance

Kernel learning and optimization with Hilbert–Schmidt independence criterion

Kernel Learning with Hilbert-Schmidt Independence Criterion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation