
The Convergence of Incremental Neural Networks

Neural Processing Letters

Abstract

The convergence of neural networks is a central research question, as it underpins both the universal approximation capability and the structural complexity of these models. In this study we examine a generalized convex incremental iteration method that extends previous work by admitting a broader range of weight parameters. We prove the convergence rate of this convex iteration, and we adopt a discrete statistical perspective to handle the non-compactness of the input data and the fact that the objective function is unknown in practical settings, which improves the robustness and applicability of the analysis. To support our conclusions, we introduce two implementation algorithms, back propagation and random search; the latter helps prevent the network from becoming trapped in poor local minima during training. Finally, we report results on a range of regression problems that provide empirical evidence for the performance of the proposed algorithms and agree with the theoretical predictions.
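To make the incremental iteration concrete, the sketch below illustrates one common form of convex incremental learning combined with a random search over candidate hidden nodes. It is a minimal sketch under assumptions of my own: the sigmoid activation, the candidate count, the closed-form choice of the mixing weight beta, and all function names are illustrative and are not taken from the paper's exact algorithm.

# Hedged sketch: convex incremental network with random hidden-node search.
# All names and design choices here are illustrative assumptions, not the authors' method.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convex_incremental_fit(X, y, max_nodes=50, candidates=10):
    """Greedily add sigmoid nodes; each step mixes the previous output and the
    new node convexly: f_n = (1 - beta) * f_{n-1} + beta * g_n."""
    n, d = X.shape
    f = np.zeros(n)                      # current network output f_{n-1}(x)
    nodes = []                           # stored (w, b, beta) triples
    for _ in range(max_nodes):
        best = None
        for _ in range(candidates):      # random search over candidate nodes
            w = rng.normal(size=d)
            b = rng.normal()
            g = sigmoid(X @ w + b)
            # closed-form beta minimizing ||y - (1 - beta) f - beta g||^2,
            # clipped to [0, 1] to keep the update convex
            u = g - f
            denom = u @ u
            if denom < 1e-12:
                continue
            beta = float(np.clip((y - f) @ u / denom, 0.0, 1.0))
            err = np.mean((y - ((1 - beta) * f + beta * g)) ** 2)
            if best is None or err < best[0]:
                best = (err, w, b, beta, g)
        if best is None:
            break
        _, w, b, beta, g = best
        f = (1 - beta) * f + beta * g    # convex update of the network output
        nodes.append((w, b, beta))
    return nodes, f

# Toy regression example
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])
nodes, f = convex_incremental_fit(X, y)
print("training MSE:", np.mean((y - f) ** 2))

The random search over several candidate nodes per step is what guards against accepting a node that barely reduces the residual; back propagation could instead be used to tune each candidate before the convex mixing step.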


Data Availability

Datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


Author information

Corresponding author

Correspondence to Lei Chen.

Ethics declarations

Conflict of interest

All authors have declared that: (i) no support, financial or otherwise, has been received from any organization that may have an interest in the submitted work; and (ii) there are no other relationships or activities that could appear to have influenced the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chen, L., Wang, Y., Zhang, L. et al. The Convergence of Incremental Neural Networks. Neural Process Lett 55, 12481–12499 (2023). https://doi.org/10.1007/s11063-023-11429-4
