Abstract
Measures of statistical independence between random variables have been applied successfully in many learning tasks, such as independent component analysis, feature selection, and clustering. This success rests on the fact that many learning tasks can be cast as problems of dependence maximization (or minimization). Motivated by this, we introduce a unifying view of kernel learning with the Hilbert-Schmidt independence criterion (HSIC), a kernel method for measuring statistical dependence between random variables. The key idea is that a good kernel should maximize the statistical dependence, as measured by the HSIC, between the kernel and the class labels. As a special case of kernel learning, we also propose an effective Gaussian kernel optimization method for classification that maximizes the HSIC over spherical (isotropic) Gaussian kernels. The proposed approach is demonstrated on several popular UCI machine learning benchmarks.
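To make the idea concrete, here is a minimal sketch (not the authors' exact algorithm) of Gaussian bandwidth selection by dependence maximization: each candidate bandwidth is scored by the biased empirical HSIC between its kernel matrix and a simple label kernel (1 if two labels agree, 0 otherwise), and the bandwidth with the largest score is chosen. The function names and the candidate grid below are illustrative assumptions.

```python
import numpy as np

def hsic(K, L):
    """Biased empirical HSIC between two kernel matrices:
    trace(K H L H) / (n - 1)^2, where H = I - (1/n) 11^T centers the kernels."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def gaussian_kernel(X, sigma):
    """Spherical Gaussian (RBF) kernel matrix with bandwidth sigma."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def select_sigma(X, y, sigmas):
    """Pick the bandwidth whose Gaussian kernel maximizes HSIC with the
    label kernel L[i, j] = 1 if y_i == y_j else 0."""
    L = (y[:, None] == y[None, :]).astype(float)
    scores = [hsic(gaussian_kernel(X, s), L) for s in sigmas]
    return sigmas[int(np.argmax(scores))], scores

# Usage on toy two-class data: two Gaussian blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(4.0, 1.0, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
best, scores = select_sigma(X, y, [0.1, 1.0, 10.0])
```

Because both kernels are centered before the trace is taken, each HSIC score is nonnegative, so the argmax is well defined; in practice the candidate grid would be chosen relative to the scale of the pairwise distances.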
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (No. 61562003) and the Natural Science Foundation of Jiangxi Province of China (Nos. 20151BAB207029 and 20161BAB202070).
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
Cite this paper
Wang, T., Li, W., He, X. (2016). Kernel Learning with Hilbert-Schmidt Independence Criterion. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 662. Springer, Singapore. https://doi.org/10.1007/978-981-10-3002-4_58
Print ISBN: 978-981-10-3001-7
Online ISBN: 978-981-10-3002-4