Bayesian Networks and Inner Product Spaces

Conference paper
Learning Theory (COLT 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3120)

Abstract

In connection with two-label classification tasks over the Boolean domain, we consider the possibility of combining the key advantages of Bayesian networks and of kernel-based learning systems. This leads us to the basic question of whether the class of decision functions induced by a given Bayesian network can be represented within a low-dimensional inner product space. For Bayesian networks with an explicitly given (full or reduced) parameter collection, we show that the “natural” inner product space has the smallest possible dimension up to a factor of 2 (even up to an additive term of 1 in many cases). For a slight modification of the so-called logistic autoregressive Bayesian network with n nodes, we show that every sufficiently expressive inner product space has dimension at least 2^(n/4). The main technical contribution of our work consists in uncovering combinatorial and algebraic structures within Bayesian networks such that known techniques for proving lower bounds on the dimension of inner product spaces can be brought into play.
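The central object here is a representation of a network's decision function as the sign of an inner product (a "linear arrangement"). A minimal sketch of this idea for the simplest case, a naive-Bayes-style network over Boolean attributes, is below; it is an illustration of the general notion, not code from the paper. For such a network the log-odds is linear in the inputs, so its decision function is sign(⟨w, x⟩ + b) in an inner product space of dimension n + 1.

```python
import numpy as np

# Toy illustration (hypothetical parameters, not from the paper):
# a naive-Bayes network over n Boolean attributes with class prior
# P(y=1) = prior1 and conditional probabilities p1[i] = P(x_i=1 | y=1),
# p0[i] = P(x_i=1 | y=0).

def naive_bayes_log_odds(x, prior1, p1, p0):
    """log P(y=1 | x) - log P(y=0 | x) computed directly from the network."""
    log_odds = np.log(prior1 / (1 - prior1))
    for xi, a, b in zip(x, p1, p0):
        pa = a if xi else 1 - a   # P(x_i | y = 1)
        pb = b if xi else 1 - b   # P(x_i | y = 0)
        log_odds += np.log(pa / pb)
    return log_odds

def linear_arrangement(prior1, p1, p0):
    """Explicit weights (w, b) so that the log-odds equals w @ x + b."""
    w = np.log(p1 / p0) - np.log((1 - p1) / (1 - p0))
    b = np.log(prior1 / (1 - prior1)) + np.sum(np.log((1 - p1) / (1 - p0)))
    return w, b

# Check that the network's decision function is a halfspace in dimension n + 1.
rng = np.random.default_rng(0)
n = 4
p1 = rng.uniform(0.1, 0.9, n)
p0 = rng.uniform(0.1, 0.9, n)
prior1 = 0.6
w, b = linear_arrangement(prior1, p1, p0)
for _ in range(20):
    x = rng.integers(0, 2, n)
    assert np.isclose(naive_bayes_log_odds(x, prior1, p1, p0), w @ x + b)
```

For richer network structures the picture changes: per the abstract, already a slight modification of the logistic autoregressive network forces every sufficiently expressive inner product space to have dimension at least 2^(n/4).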

This work has been supported in part by the Deutsche Forschungsgemeinschaft Grant SI 498/7-1.




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nakamura, A., Schmitt, M., Schmitt, N., Simon, H.U. (2004). Bayesian Networks and Inner Product Spaces. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_36

  • DOI: https://doi.org/10.1007/978-3-540-27819-1_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22282-8

  • Online ISBN: 978-3-540-27819-1
