Abstract
Complexity-penalization strategies are one way to choose the most appropriate network size, addressing the trade-off between overfitted and underfitted models. In this paper we propose a new penalty term, derived from the behaviour of candidate models under noisy conditions, that appears to be much more robust against catastrophic overfitting errors than standard techniques. The strategy is applied to several regression problems using polynomial functions, univariate autoregressive models and RBF neural networks. A simulation study at the end of the paper shows that the proposed criterion is highly competitive with state-of-the-art criteria.
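The paper's noise-based penalty itself is not reproduced in this preview, so as background here is a minimal sketch, in Python, of the classical complexity-penalization recipe the proposed criterion is compared against: fit each candidate model, score it by goodness-of-fit plus a penalty that grows with the number of free parameters (Akaike's AIC and Schwarz's BIC are the standard choices), and select the model that minimises the penalized score. The toy data and function names below are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residual sum of squares."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.sum(residuals ** 2))

def aic(rss, n, k):
    """Akaike's information criterion for a Gaussian-noise regression
    with k free parameters fitted to n points (additive constants dropped)."""
    return n * np.log(rss / n) + 2 * k

def bic(rss, n, k):
    """Schwarz's Bayesian information criterion (additive constants dropped)."""
    return n * np.log(rss / n) + k * np.log(n)

# Toy data: cubic signal plus Gaussian noise (illustrative, not from the paper).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.5 * x ** 3 - x + rng.normal(scale=0.1, size=x.size)

# Score each candidate degree with a penalized criterion and keep the minimiser.
n = x.size
scores = {}
for degree in range(1, 10):
    _, rss = fit_polynomial(x, y, degree)
    k = degree + 1  # number of polynomial coefficients
    scores[degree] = aic(rss, n, k)

best = min(scores, key=scores.get)
print(f"selected degree: {best}")
```

Swapping `aic` for `bic` (or for any other penalized criterion, including the noise-behaviour penalty the paper proposes) changes only the scoring line; the selection loop over candidate complexities is the part all such criteria share.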
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Junquera, J., Galindo Riaño, P., Guerrero Vázquez, E., Yáñez Escolano, A. (2001). A Penalization Criterion Based on Noise Behaviour for Model Selection. In: Mira, J., Prieto, A. (eds) Bio-Inspired Applications of Connectionism. IWANN 2001. Lecture Notes in Computer Science, vol 2085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45723-2_18
DOI: https://doi.org/10.1007/3-540-45723-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42237-2
Online ISBN: 978-3-540-45723-7