Abstract
We introduce the notion of solvable models of artificial neural networks, based on the theory of ordinary differential equations. It is shown that a solvable three-layer neural network can be realized as a solution of an ordinary differential equation, and that several neural networks in standard use are solvable. This leads to a new, two-step, non-recursive learning paradigm: first estimate the differential equation which the target function “approximately” satisfies, and then approximate the target function within the solution space of that differential equation. It is shown experimentally that the proposed algorithm is useful for analyzing the generalization problem in artificial neural networks. Connections with wavelet analysis are also pointed out.
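To make the two-step paradigm concrete, the sketch below specializes it, purely as an illustrative assumption, to a second-order linear ODE with constant coefficients; the helper names (estimate_ode, solution_basis, fit_in_solution_space) and this restricted model class are not taken from the paper. Step 1 estimates the ODE that the sampled target approximately satisfies by least squares on numerical derivatives; step 2 fits the target within that ODE's solution space, again by least squares, so no recursive weight updates are needed.

```python
# Minimal sketch of the two-step, non-recursive paradigm, assuming the
# model class  y'' + p*y' + q*y = 0  (an illustrative choice, not the
# paper's actual class of solvable networks).
import numpy as np

def estimate_ode(x, y):
    """Step 1: least-squares estimate of (p, q) in y'' + p*y' + q*y = 0."""
    h = x[1] - x[0]
    d1 = np.gradient(y, h)    # numerical first derivative
    d2 = np.gradient(d1, h)   # numerical second derivative
    # Solve d2 + p*d1 + q*y ~= 0 for (p, q) in the least-squares sense.
    A = np.column_stack([d1, y])
    p, q = np.linalg.lstsq(A, -d2, rcond=None)[0]
    return p, q

def solution_basis(x, p, q):
    """Step 2a: build a basis of the solution space from the characteristic roots."""
    r1, r2 = np.roots([1.0, p, q])
    if np.iscomplex(r1):
        a, b = r1.real, abs(r1.imag)  # conjugate pair -> damped oscillations
        return np.column_stack([np.exp(a * x) * np.cos(b * x),
                                np.exp(a * x) * np.sin(b * x)])
    return np.column_stack([np.exp(r1.real * x), np.exp(r2.real * x)])

def fit_in_solution_space(x, y, p, q):
    """Step 2b: non-recursive least-squares fit within the solution space."""
    B = solution_basis(x, p, q)
    coef = np.linalg.lstsq(B, y, rcond=None)[0]
    return B @ coef

# Example: y = exp(-0.2 x) sin(2 x) satisfies y'' + 0.4 y' + 4.04 y = 0 exactly.
x = np.linspace(0.0, 6.0, 400)
y = np.exp(-0.2 * x) * np.sin(2.0 * x)
p, q = estimate_ode(x, y)
y_hat = fit_in_solution_space(x, y, p, q)
print(f"estimated ODE: y'' + {p:.3f} y' + {q:.3f} y = 0")
print("max fitting error:", np.max(np.abs(y_hat - y)))
```

Both steps reduce to ordinary least squares, which is what makes the procedure non-recursive: unlike backpropagation, no iterative weight updates are involved once the differential equation has been estimated.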
Watanabe, S. Solvable models of layered neural networks based on their differential structure. Adv Comput Math 5, 205–231 (1996). https://doi.org/10.1007/BF02124744