Statistical analysis of regularization constant — From Bayes, MDL and NIC points of view

  • Conference paper in: Biological and Artificial Computation: From Neuroscience to Technology (IWANN 1997)
  • Section: Formal Tools and Computational Models of Neurons and Neural Net Architectures
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1240)

Abstract

To avoid overfitting in neural learning, a regularization term is added to the loss function to be minimized; such a term arises naturally from the Bayesian standpoint. The present paper studies how to determine the regularization constant from the points of view of the empirical Bayes approach, the minimum description length (MDL) approach, and the network information criterion (NIC) approach. An asymptotic statistical analysis is given to elucidate their differences. These approaches are tightly connected with methods of model selection, and the analysis shows the superiority of the NIC.
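To make the Bayesian derivation concrete: the standard correspondence is that a regularizer is the negative log of a prior. With prior p(θ) ∝ exp(−λΩ(θ)), the negative log-posterior is exactly the regularized loss, so the regularization constant λ plays the role of a prior hyperparameter. The NIC penalty shown second is the form given by Murata, Yoshizawa and Amari (ref. 5); the symbols ℓ, Ω, λ, G, Q are notation chosen here for illustration, not necessarily the paper's.

    % Regularized loss as a negative log-posterior (MAP estimation),
    % with prior p(\theta) \propto \exp(-\lambda \Omega(\theta)):
    \[
      -\log p(\theta \mid D)
        = -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta)
          + \lambda\,\Omega(\theta) + \text{const}.
    \]
    % Network information criterion (Murata et al., 1994): the empirical
    % risk of the trained network \hat\theta plus an effective-number-of-
    % parameters penalty,
    \[
      \mathrm{NIC}
        = \frac{1}{n}\sum_{i=1}^{n} \ell(x_i, y_i; \hat\theta)
          + \frac{1}{n}\,\operatorname{tr}\!\bigl(G\,Q^{-1}\bigr),
      \qquad
      Q = \mathbb{E}\bigl[\nabla^2 \ell\bigr],\;
      G = \mathbb{E}\bigl[\nabla \ell\,\nabla \ell^{\top}\bigr].
    \]

For the empirical Bayes route, the regularization constant can be chosen by maximizing the marginal likelihood (evidence), as in MacKay's Bayesian interpolation (ref. 3). The sketch below does this for the one case where the evidence has a closed form, a linear-Gaussian model with a weight-decay prior; the function names and the grid search are illustrative and not from the paper, whose analysis is asymptotic and more general.

    # Empirical-Bayes choice of the regularization constant lambda for
    # ridge regression:  y = X w + eps,  eps ~ N(0, 1/beta),
    # prior w ~ N(0, (lambda * beta)^{-1} I), so the MAP estimate
    # minimizes ||y - X w||^2 + lambda ||w||^2.
    import numpy as np

    def log_evidence(X, y, lam, beta):
        """Log marginal likelihood log p(y | X, lam, beta)."""
        n, d = X.shape
        A = lam * beta * np.eye(d) + beta * X.T @ X   # posterior precision
        m = beta * np.linalg.solve(A, X.T @ y)        # posterior mean
        fit = beta * np.sum((y - X @ m) ** 2) + lam * beta * np.sum(m ** 2)
        _, logdet = np.linalg.slogdet(A)
        return 0.5 * (d * np.log(lam * beta) + n * np.log(beta)
                      - fit - logdet - n * np.log(2 * np.pi))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = X @ rng.normal(size=5) + 0.3 * rng.normal(size=100)

    grid = np.logspace(-4, 2, 61)
    evidences = [log_evidence(X, y, lam, beta=1 / 0.09) for lam in grid]
    print(f"evidence-maximizing lambda: {grid[np.argmax(evidences)]:.4g}")

The MDL approach leads to a closely related quantity, since the stochastic complexity of the data (ref. 8) is, up to sign, the logarithm of this same evidence integral over the parameters.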


References

  1. H. Akaike (1974) A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19, 716–723.

  2. G. te Brake, J.N. Kok and P.M.B. Vitányi (1995) Model Selection for Neural Networks: Comparing MDL and NIC, NeuroCOLT Technical Report NC-TR-95-021.

  3. D.J.C. MacKay (1992) Bayesian interpolation, Neural Computation, 4, 415–447.

  4. J.E. Moody (1992) The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems, in NIPS 4, pp. 847–854.

  5. N. Murata, S. Yoshizawa and S. Amari (1994) Network information criterion — determining the number of hidden units for an artificial neural network model, IEEE Transactions on Neural Networks, 5, 865–872.

  6. T. Poggio and F. Girosi (1990) Regularization algorithms for learning that are equivalent to multilayer networks, Science, 247, 978–982.

  7. B.D. Ripley (1996) Pattern Recognition and Neural Networks, Cambridge University Press.

  8. J. Rissanen (1989) Stochastic Complexity in Statistical Inquiry, Singapore: World Scientific Publishing Co.

Author information

Shun-ichi Amari and Noboru Murata

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amari, Si., Murata, N. (1997). Statistical analysis of regularization constant — From Bayes, MDL and NIC points of view. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032486

  • DOI: https://doi.org/10.1007/BFb0032486

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63047-0

  • Online ISBN: 978-3-540-69074-0

  • eBook Packages: Springer Book Archive
