
Solvable models of layered neural networks based on their differential structure

Published in: Advances in Computational Mathematics

Abstract

We introduce the notion of solvable models of artificial neural networks, based on the theory of ordinary differential equations. It is shown that a solvable three-layer neural network can be realized as a solution of an ordinary differential equation. Several neural networks in standard use are shown to be solvable. This leads to a new two-step, non-recursive learning paradigm: estimate the differential equation which the target function satisfies “approximately”, and then approximate the target function in the solution space of that differential equation. It is shown experimentally that the proposed algorithm is useful for analyzing the generalization problem in artificial neural networks. Connections with wavelet analysis are also pointed out.
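To make the two-step paradigm concrete, the following is a minimal numerical sketch, not the paper's algorithm: it assumes the target approximately satisfies a linear, constant-coefficient ODE with distinct characteristic roots, estimates the ODE coefficients by least squares on finite-difference derivatives (step 1), and then fits the target inside the resulting solution space by a second least-squares problem (step 2). All function names and the toy example are illustrative assumptions, not taken from the article.

```python
import numpy as np

def estimate_ode_coefficients(x, y, order=2):
    """Step 1 (sketch): estimate a_0, ..., a_{n-1} in
    y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = 0
    from samples, via finite-difference derivatives and least squares."""
    h = x[1] - x[0]                        # assumes a uniform grid
    derivs = [y]
    for _ in range(order):                 # build y, y', ..., y^(n)
        derivs.append(np.gradient(derivs[-1], h))
    A = np.stack(derivs[:-1], axis=1)      # columns: y, y', ..., y^(n-1)
    b = -derivs[-1]                        # move y^(n) to the right-hand side
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a                               # a[k] multiplies y^(k)

def fit_in_solution_space(x, y, a):
    """Step 2 (sketch): with distinct characteristic roots r_k, the solution
    space is spanned by exp(r_k x); the expansion coefficients follow from
    one more linear least-squares fit."""
    char_poly = np.concatenate(([1.0], a[::-1]))   # highest degree first
    roots = np.roots(char_poly)
    basis = np.exp(np.outer(x, roots))     # columns: exp(r_k x)
    c, *_ = np.linalg.lstsq(basis, y.astype(complex), rcond=None)
    return lambda t: (np.exp(np.outer(t, roots)) @ c).real

# Toy usage: recover a damped oscillation (a solution of
# y'' + 0.6 y' + 4.09 y = 0) from lightly noised samples.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 300)
y = np.exp(-0.3 * x) * np.cos(2.0 * x) + 0.01 * rng.standard_normal(x.size)
f = fit_in_solution_space(x, y, estimate_ode_coefficients(x, y, order=2))
print("max fit error:", np.abs(f(x) - y).max())
```

Because both steps reduce to linear least-squares problems, no iterative weight updates are required, which is one plausible reading of the "non-recursive" label in the abstract.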




Cite this article

Watanabe, S. Solvable models of layered neural networks based on their differential structure. Adv Comput Math 5, 205–231 (1996). https://doi.org/10.1007/BF02124744
