Generalization Error and Training Error at Singularities of Multilayer Perceptrons

  • Conference paper

In: Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence (IWANN 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2084)

Abstract

The neuromanifold, or parameter space, of multilayer perceptrons includes complex singularities at which the Fisher information matrix degenerates. The parameters are unidentifiable at singularities, and this causes serious difficulties in learning, known as plateaus in the cost function. The natural or adaptive natural gradient method is proposed for overcoming this difficulty. It is important to study the relation between the generalization error and the training error at the singularities, because the generalization error is estimated in terms of the training error. Using a simple model, the generalization error is studied in terms of the Gaussian random field, both for the maximum likelihood estimator (MLE) and for the Bayesian predictive distribution estimator. This elucidates the strange behaviors of learning dynamics around singularities.
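To make the degeneracy concrete, the following is a minimal numerical sketch, not code from the paper: it assumes a two-hidden-unit tanh perceptron with unit-variance Gaussian output noise, estimates its Fisher information matrix by a sample average, shows the smallest eigenvalue collapsing at the singularity where the two hidden units coincide, and takes one damped natural-gradient step. The teacher network, learning rate, and damping constant are illustrative choices.

```python
# Sketch: Fisher degeneracy at a singularity of f(x) = v1*tanh(w1*x) + v2*tanh(w2*x),
# and a damped natural-gradient step. Illustrative assumptions throughout.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5000)          # inputs drawn i.i.d. from N(0, 1)

def jacobian(theta):
    """Rows are df/dw1, df/dw2, df/dv1, df/dv2 evaluated at each input."""
    w1, w2, v1, v2 = theta
    return np.stack([
        v1 * (1.0 - np.tanh(w1 * x) ** 2) * x,
        v2 * (1.0 - np.tanh(w2 * x) ** 2) * x,
        np.tanh(w1 * x),
        np.tanh(w2 * x),
    ])

def fisher(theta):
    # With unit-variance Gaussian output noise, the Fisher matrix is
    # F = E[(df/dtheta)(df/dtheta)^T], estimated here by a sample average.
    J = jacobian(theta)
    return J @ J.T / x.size

regular  = np.array([1.0, 1.5, 0.5, 0.5])  # two distinct hidden units
singular = np.array([1.0, 1.0, 0.5, 0.5])  # coinciding units: w1 = w2, v1 = v2
for name, theta in (("regular", regular), ("singular", singular)):
    lam = np.linalg.eigvalsh(fisher(theta))
    print(f"{name:8s} point: smallest Fisher eigenvalue = {lam[0]:.2e}")

def natural_gradient_step(theta, lr=0.1, damping=1e-4):
    # One damped natural-gradient step on the squared error against an
    # illustrative teacher network f*(x) = 0.8 * tanh(1.2 x).
    target = 0.8 * np.tanh(1.2 * x)
    w1, w2, v1, v2 = theta
    f = v1 * np.tanh(w1 * x) + v2 * np.tanh(w2 * x)
    grad = jacobian(theta) @ (f - target) / x.size  # gradient of half-MSE
    F = fisher(theta) + damping * np.eye(4)         # damping keeps F invertible
    return theta - lr * np.linalg.solve(F, grad)

theta = natural_gradient_step(regular)
print("after one natural-gradient step:", theta)
```

At the regular point the smallest Fisher eigenvalue is of order one, while at the singular point it drops to numerical zero. The damping term is what keeps the natural-gradient step well defined near the singular set; regularizing or adaptively approximating the inverse Fisher matrix is the practical motivation behind the adaptive natural gradient methods cited in the references [1, 5, 12].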

References

  1. Amari, S.: Natural gradient works efficiently in learning, Neural Computation, 10, 251–276, 1998.

  2. Amari, S. and Murata, N.: Statistical theory of learning curves under entropic loss criterion, Neural Computation, 5, 140–153, 1993.

  3. Amari, S. and Nagaoka, H.: Information Geometry, AMS and Oxford University Press, 2000.

  4. Amari, S. and Ozeki, T.: Differential and algebraic geometry of multilayer perceptrons, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E84-A, 31–38, 2001.

  5. Amari, S., Park, H., and Fukumizu, K.: Adaptive method of realizing natural gradient learning for multilayer perceptrons, Neural Computation, 12, 1399–1409, 2000.

  6. Dacunha-Castelle, D. and Gassiat, E.: Testing in locally conic models, and application to mixture models, ESAIM: Probability and Statistics, 1, 285–317, 1997.

  7. Fukumizu, K.: Statistical analysis of unidentifiable models and its application to multilayer neural networks, Memo at Post-Conference of the Bernoulli-RIKEN BSI 2000 Symposium on Neural Networks and Learning, October 2000.

  8. Fukumizu, K.: Likelihood ratio of unidentifiable models and multilayer neural networks, Research Memorandum, 780, Inst. of Statistical Mathematics, 2001.

  9. Hagiwara, K., Kuno, K. and Usui, S.: On the problem in model selection of neural network regression in overrealizable scenario, Proceedings of the International Joint Conference on Neural Networks, 2000.

  10. Hartigan, J. A.: A failure of likelihood asymptotics for normal mixtures, Proceedings of Berkeley Conference in Honor of J. Neyman and J. Kiefer, 2, 807–810, 1985.

  11. Kitahara, M., Hayasaka, T., Toda, N. and Usui, S.: On the probability distribution of estimators of regression model using 3-layered neural networks (in Japanese), Workshop on Information-Based Induction Sciences (IBIS 2000), 21–26, July 2000.

  12. Park, H., Amari, S. and Fukumizu, K.: Adaptive natural gradient learning algorithms for various stochastic models, Neural Networks, 13, 755–764, 2000.

  13. Rattray, M., Saad, D. and Amari, S.: Natural gradient descent for on-line learning, Physical Review Letters, 81, 5461–5464, 1998.

  14. Watanabe, S.: Algebraic analysis for non-identifiable learning machines, Neural Computation, to appear.

  15. Watanabe, S.: Training and generalization errors of learning machines with algebraic singularities (in Japanese), The Trans. of IEICE A, J84-A, 99–108, 2001.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amari, Si., Ozeki, T., Park, H. (2001). Generalization Error and Training Error at Singularities of Multilayer Perceptrons. In: Mira, J., Prieto, A. (eds) Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence. IWANN 2001. Lecture Notes in Computer Science, vol 2084. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45720-8_37

  • DOI: https://doi.org/10.1007/3-540-45720-8_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42235-8

  • Online ISBN: 978-3-540-45720-6

  • eBook Packages: Springer Book Archive
