Abstract
We introduce a new class of kernels between distributions. These induce a kernel on the input space between data points by associating to each datum a generative model fit to the data point individually. The kernel is then computed by integrating the product of the two generative models corresponding to two data points. This kernel permits discriminative estimation via, for instance, support vector machines, while exploiting the properties, assumptions, and invariances inherent in the choice of generative model. It satisfies Mercer’s condition and can be computed in closed form for a large class of models, including exponential family models, mixtures, hidden Markov models and Bayesian networks. For other models the kernel can be approximated by sampling methods. Experiments are shown for multinomial models in text classification and for hidden Markov models for protein sequence classification.
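For intuition, the kernel described in the abstract is a probability product kernel: for two multinomial distributions $p$ and $q$ fit to two documents, it reduces to $K(p,q)=\sum_i (p_i\, q_i)^{\rho}$, where $\rho = 1/2$ gives the Bhattacharyya kernel and $\rho = 1$ the expected likelihood kernel. A minimal sketch for the multinomial case (function names are illustrative, not from the paper):

```python
import numpy as np

def probability_product_kernel(p, q, rho=0.5):
    """Probability product kernel between two multinomials:
    K(p, q) = sum_i (p_i * q_i)**rho.
    rho = 0.5 -> Bhattacharyya kernel (K = 1 iff p == q);
    rho = 1.0 -> expected likelihood kernel."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum((p * q) ** rho))

def multinomial_mle(counts):
    """Fit a multinomial to a single datum (e.g. a word-count
    vector for one document) by maximum likelihood."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

# Two toy "documents" as word-count vectors over a shared vocabulary.
p = multinomial_mle([3, 1, 0, 2])
q = multinomial_mle([2, 2, 1, 1])
k_bhatt = probability_product_kernel(p, q, rho=0.5)
k_el = probability_product_kernel(p, q, rho=1.0)
```

Because the kernel is an inner product between (powers of) the two densities, it satisfies Mercer's condition, and the resulting Gram matrix can be passed directly to an SVM. For models without a closed-form integral, the same quantity can be approximated by Monte Carlo sampling from the two fitted models.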
© 2003 Springer-Verlag Berlin Heidelberg
Jebara, T., Kondor, R. (2003). Bhattacharyya and Expected Likelihood Kernels. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science(), vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9