
Estimating the Size of Neural Networks from the Number of Available Training Data

  • Conference paper

Part of the book: Artificial Neural Networks – ICANN 2007
Book series: Lecture Notes in Computer Science (LNTCS, volume 4668)


Abstract

Estimating a priori the size of a neural network needed to achieve high classification accuracy is a hard problem. Existing studies provide theoretical upper bounds on network size that are unrealistic to implement. This work presents a computational study that estimates the size of a neural network using the amount of available training data as the estimation parameter. We also show that network size is problem-dependent and that the number of available training samples alone suffices to determine the size of the network required for a high classification rate. For our experiments we use a threshold neural network that combines the perceptron algorithm with simulated annealing, and we test our results on datasets from the UCI Machine Learning Repository. Based on our experimental results, we propose a formula for estimating the number of perceptrons that must be trained to achieve high classification accuracy.
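The paper's estimation formula and the exact training procedure are not reproduced here. The following is a minimal, illustrative Python sketch of the kind of threshold unit the abstract describes: a single perceptron whose updates are accepted or rejected by a simulated-annealing rule with a logarithmic cooling schedule. The function name, the cooling constant, the label convention, and the stopping criterion are assumptions for illustration, not the authors' method.

    import numpy as np

    def train_sa_perceptron(X, y, epochs=200, c=2.0, rng=None):
        """Train a single threshold unit with perceptron-style updates,
        accepted or rejected by a simulated-annealing criterion.

        X : (n_samples, n_features) array; y : labels in {-1, +1}.
        The logarithmic cooling schedule T_k = c / ln(k + 2) and the
        constant c are illustrative assumptions, not the paper's settings.
        """
        rng = np.random.default_rng() if rng is None else rng
        n, d = X.shape
        w = np.zeros(d + 1)                      # weights plus bias term
        Xb = np.hstack([X, np.ones((n, 1))])     # append bias column

        def errors(weights):
            # number of misclassified training samples
            return int(np.sum(np.sign(Xb @ weights) != y))

        best_w, best_err = w.copy(), errors(w)
        for k in range(epochs * n):
            T = c / np.log(k + 2)                # logarithmic cooling
            i = rng.integers(n)                  # pick a random sample
            if np.sign(Xb[i] @ w) == y[i]:
                continue                         # already correct, skip
            w_new = w + y[i] * Xb[i]             # perceptron update step
            delta = errors(w_new) - errors(w)    # change in training error
            # accept improvements always, worsenings with SA probability
            if delta <= 0 or rng.random() < np.exp(-delta / T):
                w = w_new
                if errors(w) < best_err:
                    best_w, best_err = w.copy(), errors(w)
        return best_w, best_err

In the spirit of the abstract, several such units would be trained on a UCI dataset and the number of units needed for a given accuracy related to the number of available training samples.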




Author information

Lappas, G.

Editor information

Joaquim Marques de Sá, Luís A. Alexandre, Włodzisław Duch, Danilo Mandic


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lappas, G. (2007). Estimating the Size of Neural Networks from the Number of Available Training Data. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4668. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74690-4_8


  • DOI: https://doi.org/10.1007/978-3-540-74690-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74689-8

  • Online ISBN: 978-3-540-74690-4

  • eBook Packages: Computer Science, Computer Science (R0)
