Abstract
Building on the upper-layer-solution-aware (USA) algorithm, this paper studies a new training algorithm for feed-forward neural networks, termed USA with penalty, in which a penalty term is added to the empirical risk. Both theoretical analysis and numerical results show that the penalty controls the magnitude of the network weights. A deterministic convergence analysis of the new algorithm is established: the empirical risk with the penalty term decreases monotonically during training; the weak convergence result shows that the gradient of the total error function with respect to the weights tends to zero; and the strong convergence result shows that the weight sequence converges to a fixed point as the number of iterations tends to infinity. Numerical experiments verify the theoretical results.
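The penalized training scheme described above can be illustrated with a minimal sketch: gradient descent on an L2-penalized empirical risk for a single-hidden-layer network. This is not the authors' USA-with-penalty algorithm itself (whose details are in the paper); all function names, the penalty weight `lam`, and the learning rate `lr` below are illustrative assumptions.

```python
import numpy as np

def train_penalized(X, y, n_hidden=10, lam=1e-3, lr=0.05, epochs=2000, seed=0):
    """Illustrative sketch: gradient descent on the penalized empirical risk
    E(W, V) = (1/2) * ||sigmoid(X W) V - y||^2 + lam * (||W||^2 + ||V||^2)
    for a single-hidden-layer feed-forward network."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.5, size=(d, n_hidden))   # input-to-hidden weights
    V = rng.normal(scale=0.5, size=(n_hidden, 1))   # hidden-to-output weights
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        H = sig(X @ W)                               # hidden-layer output
        err = H @ V - y                              # residual at the output
        gV = H.T @ err + 2 * lam * V                 # gradient w.r.t. V (incl. penalty)
        gW = X.T @ (err @ V.T * H * (1 - H)) + 2 * lam * W  # gradient w.r.t. W
        V -= lr * gV
        W -= lr * gW
    risk = 0.5 * np.sum((sig(X @ W) @ V - y) ** 2) \
        + lam * (np.sum(W ** 2) + np.sum(V ** 2))
    return W, V, risk
```

With a sufficiently small learning rate, the penalized risk decreases monotonically over the iterations, and the penalty term keeps the weight magnitudes bounded, which is the behavior the paper's convergence analysis formalizes.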
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 61305075, 11604181), the Natural Science Foundation of Shandong Province (No. ZR2015AL014, ZR201709220208) and the Fundamental Research Funds for the Central Universities (No. 15CX08011A, 18CX02036A, 16CX02012A).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Wang, J., Zhang, B., Sang, Z. et al. Convergence of a modified gradient-based learning algorithm with penalty for single-hidden-layer feed-forward networks. Neural Comput & Applic 32, 2445–2456 (2020). https://doi.org/10.1007/s00521-018-3748-y