Closed determination of the number of neurons in the hidden layer of a multi-layered perceptron network

Kuri-Morales, Angel

doi:10.1007/s00500-016-2416-3

Focus
Published: 02 November 2016

Volume 21, pages 597–609, (2017)

Soft Computing

Abstract

Multi-layered perceptron networks (MLPs) have been proven to be universal approximators. However, to take advantage of this theoretical result, we must determine the smallest number of units in the hidden layer, H. Two basic, theoretically established requirements are that an adequate activation function be selected and that a proper training algorithm be applied. We must also guarantee that (a) the training data comply with the demands of the universal approximation theorem (UAT) and (b) the amount of information present in the training data be determined. We discuss how to preprocess the data in order to meet these demands. Once this is done, a closed formula to determine H may be applied. Knowing H implies that any unknown function associated with the training data may, in practice, be arbitrarily approximated by an MLP. We take advantage of previous work in which a complexity regularization approach tried to minimize the RMS training error. In that work, an algebraic expression for H was sought by sequential trial and error. In contrast, here we find a closed formula \(H=f(m_{O}, N)\), where \(m_{O}\) is the number of units in the input layer and N is the effective size of the training data. The algebraic expression we derive stems from statistically determined lower bounds of H in a range of interest of the \((m_{O}, N)\) pairs. The resulting sequence of 4250 triples \((H, m_{O}, N)\) is replaced by a single 12-term bivariate polynomial. To determine its 12 coefficients and the degrees of the 12 associated terms, a genetic algorithm was applied. The validity of the resulting formula is tested by determining the architecture of twelve MLPs for as many problems and verifying that the RMS error is minimal when the formula is used to determine H.
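To make the shape of such a formula concrete, the following is a minimal sketch of how a bivariate polynomial in \(m_{O}\) and N could be evaluated to obtain H. The function name `hidden_units`, the `(coefficient, degree of m_O, degree of N)` term layout, the rounding up to a whole unit, and the toy coefficients are all assumptions made for illustration; the 12 fitted coefficients and degrees are not given in the abstract and are not reproduced here.

```python
import math
from typing import Sequence, Tuple

# One polynomial term: (coefficient, degree of m_O, degree of N).
# This layout is an illustrative assumption, not the paper's notation.
Term = Tuple[float, int, int]


def hidden_units(terms: Sequence[Term], m_o: int, n: float) -> int:
    """Evaluate H = sum_k c_k * m_o**i_k * n**j_k, rounded up to a whole unit."""
    value = sum(c * (m_o ** i) * (n ** j) for c, i, j in terms)
    # Assumption: a network needs at least one hidden unit, and a
    # fractional result is rounded up rather than down.
    return max(1, math.ceil(value))


# Toy placeholder terms, NOT the coefficients published in the paper.
toy_terms = [(2.0, 0, 0), (0.5, 1, 0), (0.25, 0, 1)]

if __name__ == "__main__":
    # 2.0 + 0.5*4 + 0.25*8 for m_O = 4 inputs and effective size N = 8.
    print(hidden_units(toy_terms, m_o=4, n=8))
```

With the paper's actual 12 terms substituted for `toy_terms`, the same function would return H directly from a given \((m_{O}, N)\) pair, which is the sense in which the formula is closed: no trial-and-error training runs are needed to size the hidden layer.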

Notes

In many cases the pdfs are approximately symmetrical, in which case 90% of the observations will lie to the right of \(C_{min}\). Otherwise, \(P(C>C_{min})>0.8\).
https://archive.ics.uci.edu/ml/datasets/Census+Income.
https://archive.ics.uci.edu/ml/datasets/Hepatitis.
https://archive.ics.uci.edu/ml/datasets/Wine.
https://archive.ics.uci.edu/ml/datasets/Yeast.
https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset.
https://archive.ics.uci.edu/ml/datasets/Tennis+Major+Tournament+Match+Statistics.
https://archive.ics.uci.edu/ml/datasets/Computer+Hardware.
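
The 0.8 and 0.9 figures in the first note are what Chebyshev's inequality would give if \(C_{min}\) lay \(\sqrt{5}\) standard deviations below the mean. The note does not say how \(C_{min}\) is defined, so the choice \(k=\sqrt{5}\) and the symbols \(\mu\) and \(\sigma\) (mean and standard deviation of C) are assumptions used only to illustrate the arithmetic:

\[ P(|C-\mu| \ge k\sigma) \le \frac{1}{k^{2}} = 0.2 \quad (k=\sqrt{5}) \;\Longrightarrow\; P(C > \mu - k\sigma) \ge 0.8 \]

For a symmetric pdf the two tails share that 0.2 equally, so

\[ P(C \le \mu - k\sigma) \le 0.1 \;\Longrightarrow\; P(C > \mu - k\sigma) \ge 0.9 \]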

Acknowledgments

The author acknowledges the support of the Asociación Mexicana de Cultura, A.C.

Author information

Authors and Affiliations

Instituto Tecnológico Autónomo de México, Río Hondo No. 1, 01000, Mexico, D.F., Mexico
Angel Kuri-Morales

Corresponding author

Correspondence to Angel Kuri-Morales.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by the author.

Additional information

Communicated by H. Ponce.

About this article

Cite this article

Kuri-Morales, A. Closed determination of the number of neurons in the hidden layer of a multi-layered perceptron network. Soft Comput 21, 597–609 (2017). https://doi.org/10.1007/s00500-016-2416-3

Issue Date: February 2017
