Abstract
This paper is devoted to the analysis of network approximation in the framework of approximation and regularization theory. It is shown that the training of neural networks, and of similar network approximation techniques, is equivalent to least-squares collocation for a corresponding integral equation with mollified data.
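To fix ideas, the equivalence can be written out as follows (the notation here is illustrative and not taken from the paper): with a one-hidden-layer ansatz, training by least squares on point samples $y_i$ of the target function amounts to the collocation problem

\[
  f_n(x) \;=\; \sum_{j=1}^{n} c_j\,\sigma(w_j \cdot x + b_j),
  \qquad
  \min_{c_j,\,w_j,\,b_j}\; \sum_{i=1}^{m} \bigl(f_n(x_i) - y_i\bigr)^2,
\]

where the point evaluations $f_n(x_i)$ play the role of collocation functionals for the underlying integral equation.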
Convergence and convergence-rate results for exact data are derived from well-known results on least-squares collocation. Finally, stability with respect to errors in the data is examined, and stability bounds are obtained that yield rules for choosing the number of network elements.
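The regularizing role of the number of network elements can be illustrated with a small numerical experiment. The Python sketch below (all names and parameter values are ours, chosen for illustration; it is not the paper's algorithm) fixes the inner weights and biases at random and solves only for the outer weights, which reduces training to a linear least-squares collocation problem. As n grows, the data residual decreases, but the collocation matrix becomes increasingly ill-conditioned, so noise in the data is amplified and n has to be matched to the noise level:

```python
# Minimal sketch (not the paper's method): least-squares training of a
# one-hidden-layer sigmoidal network, with the number of hidden units n
# acting as the regularization parameter.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Noisy samples y_i = f(x_i) + delta_i of a smooth target function.
m = 50                                    # number of data points
x = np.linspace(0.0, 1.0, m)
delta = 0.05                              # noise level
y = np.sin(2 * np.pi * x) + delta * rng.standard_normal(m)

def fit_network(n):
    """Fix random inner weights/biases; solve for outer weights by least squares."""
    w = rng.normal(scale=10.0, size=n)    # inner weights (kept fixed here)
    b = rng.uniform(-10.0, 10.0, size=n)  # biases (kept fixed here)
    A = sigmoid(np.outer(x, w) + b)       # m x n collocation matrix
    c, *_ = np.linalg.lstsq(A, y, rcond=None)   # outer weights
    return np.linalg.norm(A @ c - y), np.linalg.cond(A)

# The residual drops with n while cond(A) blows up: the ill-posedness
# surfaces as noise amplification in the trained network.
for n in (3, 5, 10, 20, 40):
    res, cond = fit_network(n)
    print(f"n = {n:3d}:  residual = {res:.3e},  cond(A) = {cond:.3e}")
```

For noisy data this suggests stopping the growth of n once the residual has dropped to the noise level, in the spirit of the discrepancy principle; the paper derives stability bounds that make such rules precise.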
Cite this article
Burger, M., Engl, H.W. Training neural networks with noisy data as an ill-posed problem. Advances in Computational Mathematics 13, 335–354 (2000). https://doi.org/10.1023/A:1016641629556