Abstract
Deciding the optimal size of a learning machine is a central issue in statistical learning theory, which is why theoretical criteria such as the BIC have been developed. However, such criteria cannot be applied to singular machines, and many practical learning machines, e.g. mixture models, hidden Markov models, and Bayesian networks, are singular. Recently, we proposed the Singular Information Criterion (SingIC), which allows us to select the optimal size of a singular machine. The SingIC is based on an analysis of the learning coefficient, so the machines to which it can be applied are still limited. In this paper, we propose an extension of this criterion that makes it applicable to many singular machines, and we evaluate its efficiency on Gaussian mixtures. The results offer an effective strategy for selecting the optimal size.
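For context, the classical BIC mentioned above scores each candidate model by its penalized maximum log-likelihood, −2 log L + k log n, and selects the size that minimizes this score; the paper's point is that this penalty is miscalibrated for singular machines. The following sketch illustrates only the standard BIC baseline for choosing the number of Gaussian mixture components, using hypothetical log-likelihood values (the data, the values, and the candidate sizes are illustrative assumptions, not results from the paper).

```python
import math

def bic(log_likelihood, num_params, n):
    """Schwarz's BIC: -2 log L + k log n (smaller is better)."""
    return -2.0 * log_likelihood + num_params * math.log(n)

# Hypothetical fitted log-likelihoods for 1-D Gaussian mixtures with
# 1, 2, and 3 components on n = 100 samples (illustrative numbers only).
# A K-component 1-D mixture has k = 3K - 1 free parameters:
# K means, K variances, and K - 1 mixing weights.
n = 100
candidates = {1: (-210.0, 2), 2: (-195.0, 5), 3: (-193.0, 8)}

scores = {K: bic(ll, k, n) for K, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # the component count minimizing BIC
```

Here the richer 3-component fit raises the likelihood slightly, but the k log n penalty outweighs the gain, so BIC picks 2 components. For singular models the effective penalty is governed by the learning coefficient rather than k/2, which is the gap the SingIC addresses.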
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Yamazaki, K., Nagata, K., Watanabe, S., Müller, KR. (2006). A Model Selection Method Based on Bound of Learning Coefficient. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840930_38
Print ISBN: 978-3-540-38871-5
Online ISBN: 978-3-540-38873-9