Abstract
Deep learning technologies have been broadly applied in both theoretical research and practical applications of intelligent robots. Among the many paradigms, the BP neural network has attracted wide attention as an accurate and flexible tool. However, an open problem remains: the gradient vanishes when the back-propagation strategy is used in multi-layer BP neural networks, and the situation deteriorates sharply when sigmoid transfer functions are employed. To fill this research gap, this study explores a new solution in which the relative magnitude of the gradient is estimated and then neutralized via a newly developed monotonically increasing function. As a result, the undesired vanishing-gradient problem is alleviated while the traditional merits of the gradient descent method are preserved. The validity of the approach is verified by an actual case study of subway passenger flow, and the simulation results demonstrate a superior convergence speed compared with the original algorithm.
This work was funded by the National Natural Science Foundation of China under Grant 41771187. None of the material in this article has previously been published at a conference.
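The abstract does not give the exact compensation function used by the authors. As a minimal illustrative sketch, the snippet below shows why sigmoid back-propagation vanishes (each layer's derivative is at most 0.25, so the back-propagated factor shrinks geometrically with depth) and demonstrates the general idea of rescaling by a hypothetical monotonically increasing function of depth; the specific `boost` function here is an assumption for illustration, not the paper's method.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(z):
    # Derivative s * (1 - s) peaks at 0.25 when s = 0.5.
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
n_layers = 10
z = rng.normal(size=n_layers)  # one pre-activation per layer

# Back-propagated gradient factor reaching each earlier layer:
# the running product of per-layer sigmoid derivatives.
raw = np.cumprod(sigmoid_deriv(z))

# Hypothetical compensation: scale each layer's gradient by a
# monotonically increasing function of its depth so that early
# layers retain a usable gradient magnitude (illustrative choice).
boost = np.exp(np.arange(n_layers))
compensated = raw * boost

print("deepest raw factor:", raw[-1])
print("deepest compensated factor:", compensated[-1])
```

Since every per-layer derivative is bounded by 0.25, the raw factor after 10 layers is at most 0.25^10 ≈ 1e-6, while the compensated factor stays orders of magnitude larger, which is the qualitative effect the abstract describes.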
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, D., Liu, X., Zhang, J. (2023). Improved Vanishing Gradient Problem for Deep Multi-layer Neural Networks. In: Sun, F., Cangelosi, A., Zhang, J., Yu, Y., Liu, H., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2022. Communications in Computer and Information Science, vol 1787. Springer, Singapore. https://doi.org/10.1007/978-981-99-0617-8_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0616-1
Online ISBN: 978-981-99-0617-8