Abstract
We review recent results about the maximal values of the Kullback-Leibler information divergence from statistical models defined by neural networks, including naïve Bayes models, restricted Boltzmann machines, deep belief networks, and various classes of exponential families. We illustrate approaches to compute the maximal divergence from a given model starting from simple sub- or super-models. We give a new result for deep and narrow belief networks with finite-valued units.
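As a small illustration of the quantity discussed here, the sketch below computes the Kullback-Leibler divergence from a distribution on binary vectors to the independence model, one of the simplest exponential families. For this model the closest point is the product of the marginals, so the divergence equals the multi-information. The function name `multi_information` and the example distribution are ours, chosen for illustration. The benchmark value (n - 1) log 2 for n binary units is the maximum as we recall it from the work of Ay and Knauf on maximizing multi-information, and should be read as an assumption rather than a statement taken from this chapter.

```python
import math


def multi_information(p, n):
    """KL divergence (in nats) from p to the independence model of n binary units.

    p maps binary tuples of length n to probabilities. The divergence is
    minimized over the independence model by the product of the marginals of p.
    """
    # Marginal probability that unit i takes the value 1.
    marg = [sum(q for x, q in p.items() if x[i] == 1) for i in range(n)]
    div = 0.0
    for x, q in p.items():
        if q == 0.0:
            continue  # 0 * log 0 is taken to be 0
        prod = 1.0
        for i in range(n):
            prod *= marg[i] if x[i] == 1 else 1.0 - marg[i]
        div += q * math.log(q / prod)
    return div


n = 3
# Uniform distribution on the two constant vectors 000 and 111.
p = {(0,) * n: 0.5, (1,) * n: 0.5}
d = multi_information(p, n)
print(d, (n - 1) * math.log(2))
```

In this example the divergence coincides with (n - 1) log 2, while any product distribution, such as the uniform distribution on all binary vectors, has divergence zero.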
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Montúfar, G., Rauh, J., Ay, N. (2013). Maximal Information Divergence from Statistical Models Defined by Neural Networks. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2013. Lecture Notes in Computer Science, vol 8085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40020-9_85
DOI: https://doi.org/10.1007/978-3-642-40020-9_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40019-3
Online ISBN: 978-3-642-40020-9
eBook Packages: Computer Science (R0)