
From Black-Box to White-Box: Interpretable Learning with Kernel Machines

  • Conference paper
  • In: Machine Learning and Data Mining in Pattern Recognition (MLDM 2018)

Abstract

We present a novel approach to interpretable learning with kernel machines. Kernel machines have been applied successfully to many real-world learning tasks, yet they are widely perceived as difficult for humans to interpret because of their inherently black-box nature. This restricts their use in domains where model interpretability is essential. In this paper, we propose to construct interpretable kernel machines. Specifically, we design a new kernel function based on random Fourier features (RFF) for scalability, and we develop a two-phase learning procedure: in the first phase, we explicitly map pairwise features into the high-dimensional space induced by the designed kernel and learn a dense linear model; in the second phase, we extract an interpretable data representation from the first phase and learn a sparse linear model. Finally, we evaluate our approach on benchmark datasets and demonstrate its interpretability through visualization.
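The abstract describes the method only at a high level; the sketch below illustrates how such a two-phase procedure could be wired together. It is our own illustration, not the authors' implementation: we assume an RBF kernel approximated by random Fourier features, ridge regression as the dense phase-1 model, and the lasso as the sparse phase-2 model, and all function and variable names (rff_map, dense, sparse, and so on) are hypothetical.

import numpy as np
from itertools import combinations
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)

def rff_map(X, n_components=50, gamma=1.0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2) (Rahimi & Recht, NIPS 2007)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_components))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_components)
    return np.sqrt(2.0 / n_components) * np.cos(X @ W + b)

# Toy data: y depends on x0, x1 and their interaction; x2 is pure noise.
X = rng.normal(size=(500, 3))
y = (np.sin(X[:, 0]) + X[:, 1] ** 2 + X[:, 0] * X[:, 1]
     + 0.1 * rng.normal(size=500))

# Phase 1: explicitly map every pair of input features through the
# RFF-based kernel and fit one dense linear model on the concatenation.
pairs = list(combinations(range(X.shape[1]), 2))
blocks = [rff_map(X[:, list(p)]) for p in pairs]
dense = Ridge(alpha=1.0).fit(np.hstack(blocks), y)

# Phase 2: treat each pair's contribution to the dense model as one
# interpretable feature, then fit a sparse linear model over them; the
# nonzero weights identify the pairwise interactions worth inspecting.
coef_blocks = np.split(dense.coef_, len(pairs))
F = np.column_stack([Z @ c for Z, c in zip(blocks, coef_blocks)])
sparse = Lasso(alpha=0.01).fit(F, y)
for p, w in zip(pairs, sparse.coef_):
    print(f"feature pair {p}: weight {w:+.3f}")

In this sketch, each column of F is the phase-1 model's output restricted to a single feature pair, so the lasso weights directly rank pairwise interactions; a representation of this kind is what the abstract says is extracted and visualized.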

Author information

Correspondence to Hao Zhang.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Zhang, H., Nakadai, S., Fukumizu, K. (2018). From Black-Box to White-Box: Interpretable Learning with Kernel Machines. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science, vol. 10934. Springer, Cham. https://doi.org/10.1007/978-3-319-96136-1_18

  • DOI: https://doi.org/10.1007/978-3-319-96136-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-96135-4

  • Online ISBN: 978-3-319-96136-1

  • eBook Packages: Computer Science, Computer Science (R0)
