
Estimating the Size of Neural Networks from the Number of Available Training Data

  • Conference paper

Part of the book: Artificial Neural Networks – ICANN 2007
Book series: Lecture Notes in Computer Science (LNTCS, volume 4668)


Abstract

Estimating a priori the size of a neural network needed to achieve high classification accuracy is a hard problem. Existing studies provide theoretical upper bounds on network size that are unrealistic to implement. This work presents a computational study that estimates the size of a neural network using the amount of available training data as the estimation parameter. We also show that network size is problem-dependent and that the number of available training samples alone suffices to determine the size of the network required for a high classification rate. For our experiments we use a threshold neural network that combines the perceptron algorithm with simulated annealing, and we test our results on datasets from the UCI Machine Learning Repository. Based on our experimental results, we propose a formula for estimating the number of perceptrons that must be trained to achieve high classification accuracy.
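The paper's estimation formula and the exact training procedure are not reproduced here. The following is a minimal, illustrative Python sketch of the kind of threshold unit the abstract describes: a single perceptron whose updates are accepted or rejected by a simulated-annealing rule with a logarithmic cooling schedule. The function name, the cooling constant, the label convention, and the stopping criterion are assumptions for illustration, not the authors' method.

    import numpy as np

    def train_sa_perceptron(X, y, epochs=200, c=2.0, rng=None):
        """Train a single threshold unit with perceptron-style updates,
        accepted or rejected by a simulated-annealing criterion.

        X : (n_samples, n_features) array; y : labels in {-1, +1}.
        The logarithmic cooling schedule T_k = c / ln(k + 2) and the
        constant c are illustrative assumptions, not the paper's settings.
        """
        rng = np.random.default_rng() if rng is None else rng
        n, d = X.shape
        w = np.zeros(d + 1)                      # weights plus bias term
        Xb = np.hstack([X, np.ones((n, 1))])     # append bias column

        def errors(weights):
            # number of misclassified training samples
            return int(np.sum(np.sign(Xb @ weights) != y))

        best_w, best_err = w.copy(), errors(w)
        for k in range(epochs * n):
            T = c / np.log(k + 2)                # logarithmic cooling
            i = rng.integers(n)                  # pick a random sample
            if np.sign(Xb[i] @ w) == y[i]:
                continue                         # already correct, skip
            w_new = w + y[i] * Xb[i]             # perceptron update step
            delta = errors(w_new) - errors(w)    # change in training error
            # accept improvements always, worsenings with SA probability
            if delta <= 0 or rng.random() < np.exp(-delta / T):
                w = w_new
                if errors(w) < best_err:
                    best_w, best_err = w.copy(), errors(w)
        return best_w, best_err

In the spirit of the abstract, several such units would be trained on a UCI dataset and the number of units needed for a given accuracy related to the number of available training samples.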




Author information

Lappas, G.

Editor information

Joaquim Marques de Sá, Luís A. Alexandre, Włodzisław Duch, Danilo Mandic


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lappas, G. (2007). Estimating the Size of Neural Networks from the Number of Available Training Data. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4668. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74690-4_8


  • DOI: https://doi.org/10.1007/978-3-540-74690-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74689-8

  • Online ISBN: 978-3-540-74690-4

  • eBook Packages: Computer Science, Computer Science (R0)
