Abstract
Multilayered, feedforward neural network techniques have been proposed for a variety of classification and recognition problems, ranging from speech recognition to sonar signal processing. It is generally assumed that the underlying application need not be modeled in much detail and that an artificial neural network solution can instead be obtained by training on empirical data, with little or no a priori information about the application. We argue that the right network architecture is fundamental for a good solution to exist, and that the class of network architectures forms a basis for a complexity theory of classification problems. An abstraction of this notion of complexity leads to ideas similar to Kolmogorov's minimum-length description criterion, entropy, and k-widths. We present some basic results on this measure of complexity. From this point of view, artificial neural network solutions to real engineering problems may not ameliorate the difficulties of classification problems, but rather obscure and postpone them. In particular, we doubt that designing neural networks to solve interesting, nontrivial engineering problems will be any easier than other large-scale engineering design problems, such as those in aerodynamics and semiconductor device modeling.
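The complexity measures named above are standard quantities from approximation theory; as a minimal sketch (the usual definitions, not restated in the abstract itself), consider a function class F inside a normed space X. The Kolmogorov k-width of F measures how well F can be approximated by the best k-dimensional subspace:

\[ d_k(F; X) \;=\; \inf_{\dim X_k = k} \; \sup_{f \in F} \; \inf_{g \in X_k} \| f - g \|_X , \]

where the outer infimum ranges over all k-dimensional subspaces X_k of X. The metric entropy of F measures the length of the shortest description that pins down any member of F to accuracy \varepsilon:

\[ H_\varepsilon(F) \;=\; \log_2 N_\varepsilon(F) , \]

where N_\varepsilon(F) is the smallest number of balls of radius \varepsilon needed to cover F. A class is simple in this sense when d_k decays rapidly in k (few parameters suffice) or when H_\varepsilon grows slowly as \varepsilon \to 0 (short descriptions suffice), which is the connection to minimum-length description criteria drawn in the abstract.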
Supported in part by NSF grant MIP-89-11025, AFOSR/DARPA contract 89-0536 and DOE grant DE-FG02-85ER25001.
© 1990 Springer-Verlag Berlin Heidelberg
Cite this paper
Cybenko, G. (1990). Complexity theory of neural networks and classification problems. In: Almeida, L.B., Wellekens, C.J. (eds) Neural Networks. EURASIP 1990. Lecture Notes in Computer Science, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-52255-7_25
Print ISBN: 978-3-540-52255-3
Online ISBN: 978-3-540-46939-1