Abstract
We consider the problem of approximating the Sobolev class of functions by neural networks with a single hidden layer, and establish both upper and lower bounds. The upper bound is obtained by a probabilistic approach based on the Radon and wavelet transforms, and yields rates similar to those derived recently under more restrictive conditions on the activation function. Moreover, the construction based on the Radon and wavelet transforms appears very natural for the problem. Geometrical arguments are then used to establish lower bounds for two commonly used types of activation functions. The results demonstrate that the bounds are tight up to a factor logarithmic in the number of nodes of the neural network.
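To fix notation for the setting described above, a single-hidden-layer network with n nodes and the worst-case approximation error over a function class can be written as follows. This is only a sketch: the symbols (\sigma, a_k, b_k, c_k, \mathcal{M}_n, W, and the unspecified norm) are assumed here for illustration and are not taken from the abstract, so the paper's own notation may differ.

```latex
% Assumed notation, not the paper's: \sigma is the activation function,
% d the input dimension, n the number of hidden nodes.
\[
  \mathcal{M}_n(\sigma)
  = \Bigl\{\, x \mapsto \sum_{k=1}^{n} c_k \,\sigma(a_k \cdot x + b_k)
      \;:\; a_k \in \mathbb{R}^d,\ b_k, c_k \in \mathbb{R} \,\Bigr\}
\]
% Worst-case error of approximating a function class W (e.g. a Sobolev class)
% by the network manifold, measured in some fixed norm.
\[
  \mathrm{dist}\bigl(W, \mathcal{M}_n(\sigma)\bigr)
  = \sup_{f \in W} \; \inf_{h \in \mathcal{M}_n(\sigma)} \| f - h \|
\]
```

In this notation, upper and lower bounds are statements about how fast this quantity decays as n grows, and tightness up to a logarithmic factor means that the two bounds differ by at most a factor logarithmic in n.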
Maiorov, V., Meir, R. On the near optimality of the stochastic approximation of smooth functions by neural networks. Advances in Computational Mathematics 13, 79–103 (2000). https://doi.org/10.1023/A:1018993908478