Abstract
The three-layer feed-forward neural network (3-LFFNN) is widely used for nonlinear regression. It is well known that its hidden layer can be regarded as performing feature extraction and dimension reduction, and that regression performance depends on choosing the feature dimension, or equivalently the number of hidden units, appropriately. Many publications address how to determine the hidden unit number for a desired generalization error, but few comparative studies have been made of the different approaches proposed, especially of the typical model selection criteria applied to this task. This paper targets that aim. Using both simulated data and several real-world data sets, we compare the regression performance obtained when the number of hidden units is determined by several typical model selection criteria: Akaike's Information Criterion (AIC), the consistent Akaike's Information Criterion (CAIC), Schwarz's Bayesian Information Criterion (BIC), which coincides with Rissanen's Minimum Description Length (MDL) criterion, the well-known cross-validation (CV) technique, and the Bayesian Ying-Yang harmony criterion for small sample sizes (BYY-S). Experiments on small sample sizes show that BIC and CV clearly outperform AIC and CAIC. Moreover, BIC may outperform CV on certain data sets, while CV may outperform BIC on others. Interestingly, BYY-S generally outperforms both BIC and CV.
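For concreteness, the penalized-likelihood criteria compared above all trade off the maximized log-likelihood ln L against the number of free parameters k for sample size n: AIC = 2k - 2 ln L, CAIC = k(ln n + 1) - 2 ln L, and BIC (equivalently MDL) = k ln n - 2 ln L, each to be minimized; CV instead estimates generalization error directly on held-out data. The following Python sketch is not the authors' implementation: it illustrates such a selection loop under an assumed Gaussian noise model, counting k for a single-output network with d inputs and h hidden units, and borrowing scikit-learn's MLPRegressor as a generic stand-in 3-LFFNN trainer.

# Hedged sketch: choosing the number of hidden units of a 3-LFFNN
# regressor by AIC / CAIC / BIC, assuming Gaussian noise.
import numpy as np
from sklearn.neural_network import MLPRegressor

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, using the ML variance estimate."""
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)  # ML estimate of the noise variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def select_hidden_units(X, y, candidates=range(1, 16)):
    n, d = X.shape
    scores = {"AIC": {}, "CAIC": {}, "BIC": {}}
    for h in candidates:
        net = MLPRegressor(hidden_layer_sizes=(h,), activation="tanh",
                           solver="lbfgs", max_iter=2000, random_state=0)
        net.fit(X, y)
        logL = gaussian_log_likelihood(y - net.predict(X))
        # free parameters: input-to-hidden weights and biases, h*(d+1);
        # hidden-to-output weights and bias, h+1; plus the noise variance
        k = h * (d + 2) + 1 + 1
        scores["AIC"][h] = 2 * k - 2 * logL
        scores["CAIC"][h] = k * (np.log(n) + 1) - 2 * logL
        scores["BIC"][h] = k * np.log(n) - 2 * logL
    # each criterion selects the hidden-unit number that minimizes it
    return {name: min(s, key=s.get) for name, s in scores.items()}

# toy usage on synthetic regression data
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(100)
print(select_hidden_units(X, y))

A k-fold CV score could be added to the same loop, e.g. via sklearn.model_selection.cross_val_score, to compare the likelihood-penalty criteria against held-out error as the paper does.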
References
Arai, M.: Mapping abilities of three-layer neural networks. In: Proc. IJCNN 1989, vol. 1, pp. 419–423 (1989)
Hayasaka, T., Hagiwara, K., Toda, N., Usui, S.: Determination of the number of hidden units from a statistical viewpoint. In: Proc. ICONIP 1999, vol. 1, pp. 240–245 (1999)
Rumelhart, D.E., et al.: Learning internal representations by error propagation. Parallel Distributed Processing, vol. 1. MIT Press, Cambridge (1986)
Neal, R.M.: Bayesian learning for neural networks. Springer, New York (1996)
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Automatic Control 19, 716–723 (1974)
Bozdogan, H.: Model selection and Akaike’s information criterion (AIC): the general theory and its analytical extensions. Psychometrika 52(3), 345–370 (1987)
Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6, 461–464 (1978)
Rissanen, J.: Modeling by shortest data description. Automatica 14, 465–471 (1978)
Moody, J.E.: The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems. In: Moody, J.E., et al. (eds.) Advances in NIPS, vol. 4, pp. 847–854. MIT Press, Cambridge (1992)
Stone, M.: Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B 36, 111–147 (1974)
MacKay, D.J.C.: A practical Bayesian framework for backpropagation networks. Neural Computation 4(3), 448–472 (1992)
te Brake, G., Kok, J.N., Vitányi, P.M.B.: Model selection for neural networks: Comparing MDL and NIC. In: Proc. European Symposium on Artificial Neural Networks, Brussels, Belgium, pp. 31–36 (1994)
Xu, L., Klasa, S., Yuille, A.L.: Recent advances on techniques of static feedforward networks with supervised learning. International Journal of Neural Systems 3(3), 253–290 (1992)
Xu, L.: BYY learning, regularized implementation, and model selection on modular networks with one hidden layer of binary units. Neurocomputing 51, 277–301 (2003)
Xu, L.: Advances on BYY harmony learning: information theoretic perspective, generalized projection geometry, and independent factor auto-determination. IEEE Trans. Neural Networks 15(5), 885–902 (2004)
Xu, L.: Trends on regularization and model selection in statistical learning: A perspective from Bayesian Ying Yang learning. In: Duch, W., Mandziuk, J., Zurada, J.M. (eds.) Challenges to Computational Intelligence. Studies in Computational Intelligence. Springer, Heidelberg (in press, 2006)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Shi, L., Xu, L. (2006). Comparative Investigation on Dimension Reduction and Regression in Three Layer Feed-Forward Neural Network. In: Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840817_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38625-4
Online ISBN: 978-3-540-38627-8