Improved Vanishing Gradient Problem for Deep Multi-layer Neural Networks

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1787)

Abstract

Deep learning technologies have been widely used in both theoretical research and practical applications of intelligent robots. Among the many paradigms, the BP neural network has attracted wide attention as an accurate and flexible tool. However, an unresolved problem remains: gradients vanish when the back-propagation strategy is applied to multi-layer BP neural networks, and the situation deteriorates sharply when sigmoid transfer functions are employed. To fill this research gap, this study explores a new solution in which the relative magnitude of the descending gradient is estimated and then neutralized by a newly developed function with increasing properties. As a result, the undesired vanishing gradient problem is alleviated while the traditional merits of the gradient descent method are preserved. The validity is verified by a real case study of subway passenger flow, and the simulation results demonstrate a superior convergence speed compared with the original algorithm.
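
The abstract describes the mechanism only at a high level. The following is a minimal sketch, not the authors' published implementation: it demonstrates how sigmoid derivatives, which are bounded above by 0.25, shrink back-propagated gradients layer by layer, and it applies one hypothetical compensation in the spirit of the abstract, rescaling each layer's gradient magnitude by a monotonically increasing function of propagation depth. The network dimensions and the function g(d) = 1 + ln(1 + d) are illustrative assumptions, not the paper's actual settings.

```python
# Minimal sketch (NOT the authors' code): vanishing gradients under
# sigmoid activations, plus a hypothetical depth-dependent compensation.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_layers, width = 8, 16
Ws = [rng.normal(0.0, 0.5, (width, width)) for _ in range(n_layers)]

# Forward pass, caching every layer's activation.
acts = [rng.normal(size=(width, 1))]
for W in Ws:
    acts.append(sigmoid(W @ acts[-1]))

# Backward pass with a dummy unit loss-gradient at the output.
delta = np.ones((width, 1))          # dL/da at the output layer
for depth, W in enumerate(reversed(Ws), start=1):
    a = acts[-depth]                 # this layer's sigmoid output
    delta = delta * a * (1.0 - a)    # dL/dz, since sigmoid'(z) = a(1-a)
    raw = np.linalg.norm(delta)

    # Hypothetical compensation: amplify the update magnitude by an
    # increasing function of depth; g(d) = 1 + ln(1 + d) is only an
    # illustrative choice, not the paper's actual function.
    g = 1.0 + np.log1p(depth)
    print(f"layer -{depth}: raw grad norm {raw:.2e}, "
          f"compensated {raw * g:.2e}")

    delta = W.T @ delta              # propagate dL/da to previous layer
```

With these illustrative settings, the raw gradient norms shrink markedly toward the early layers, which is the vanishing effect the paper targets; the compensated magnitudes decay more slowly because g(d) grows with propagation depth, mirroring the increasing property the abstract requires of its correction function.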

This work was funded by the National Natural Science Foundation of China under Grant 41771187. None of the material in this article has been previously published at the conference.



Author information

Corresponding author

Correspondence to Jingqiu Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, D., Liu, X., Zhang, J. (2023). Improved Vanishing Gradient Problem for Deep Multi-layer Neural Networks. In: Sun, F., Cangelosi, A., Zhang, J., Yu, Y., Liu, H., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2022. Communications in Computer and Information Science, vol 1787. Springer, Singapore. https://doi.org/10.1007/978-981-99-0617-8_12

  • DOI: https://doi.org/10.1007/978-981-99-0617-8_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-0616-1

  • Online ISBN: 978-981-99-0617-8

  • eBook Packages: Computer Science, Computer Science (R0)
