Abstract
Estimating a priori the size of a neural network needed to achieve high classification accuracy is a hard problem. Existing studies provide theoretical upper bounds on network size that are unrealistic to implement. This work presents a computational study that estimates the size of a neural network using the number of available training samples as the estimation parameter. We also show that the required network size is problem dependent, and that the number of available training samples alone suffices to determine the size of the network needed for a high classification rate. For our experiments we use a threshold neural network that combines the perceptron algorithm with simulated annealing, and we test our results on datasets from the UCI Machine Learning Repository. Based on our experimental results, we propose a formula for estimating the number of perceptrons that must be trained in order to achieve high classification accuracy.
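The combination described above — perceptron-style weight updates whose acceptance is governed by simulated annealing — can be illustrated with a minimal sketch. This is not the paper's exact LSA machine; the function name, the cooling constant `c`, and the epoch budget are illustrative assumptions, and a logarithmic cooling schedule is assumed for the temperature.

```python
import math
import random

def train_sa_perceptron(X, y, epochs=200, c=2.0, seed=0):
    """Train a single threshold perceptron (labels in {-1, +1}) where each
    perceptron update is accepted or rejected by a simulated-annealing
    criterion with a logarithmic cooling schedule T_k = c / ln(k + 2).
    Illustrative sketch only, not the authors' exact algorithm."""
    rng = random.Random(seed)
    n = len(X[0])
    w = [0.0] * (n + 1)  # n weights plus a trailing bias term

    def errors(wv):
        # count misclassified samples under the threshold activation
        bad = 0
        for xi, yi in zip(X, y):
            s = wv[-1] + sum(wj * xj for wj, xj in zip(wv, xi))
            if (1 if s >= 0 else -1) != yi:
                bad += 1
        return bad

    cur_err = errors(w)
    best, best_err = w[:], cur_err
    for k in range(epochs):
        T = c / math.log(k + 2)  # logarithmic cooling
        # collect currently misclassified samples
        mis = [(xi, yi) for xi, yi in zip(X, y)
               if (1 if w[-1] + sum(a * b for a, b in zip(w, xi)) >= 0 else -1) != yi]
        if not mis:
            break  # training set is perfectly classified
        xi, yi = rng.choice(mis)
        # propose a standard perceptron correction step
        cand = [wj + yi * xj for wj, xj in zip(w, xi)] + [w[-1] + yi]
        d = errors(cand) - cur_err
        # always accept improvements; accept worse moves with prob exp(-d/T)
        if d <= 0 or rng.random() < math.exp(-d / T):
            w = cand
            cur_err += d
            if cur_err < best_err:
                best, best_err = w[:], cur_err
    return best, best_err
```

A larger network of the kind the paper studies would train many such perceptrons and combine their outputs; the formula proposed in the paper estimates how many are needed from the training-set size alone.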
© 2007 Springer-Verlag Berlin Heidelberg
Lappas, G. (2007). Estimating the Size of Neural Networks from the Number of Available Training Data. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4668. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74690-4_8
Print ISBN: 978-3-540-74689-8
Online ISBN: 978-3-540-74690-4