
An Improved Reinforcement Learning Based Heuristic Dynamic Programming Algorithm for Model-Free Optimal Control

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12397)

Abstract

In complex process industries, model-free adaptive control under a data-driven scheme is a classic problem. This paper proposes an improved reinforcement learning (RL) based heuristic dynamic programming (HDP) algorithm for optimal tracking control of industrial systems. The proposed method designs a dual neural-network framework and employs a gradient-based optimization scheme to derive the optimal control law. Inspired by the experience replay buffer in deep RL, short-term historical system trajectories are also incorporated into the training phase, which stabilizes network learning. An experimental study on a simulated industrial device shows that the proposed method outperforms comparable algorithms in both time consumption and control accuracy.
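The abstract sketches an actor-critic training loop: one network estimates the cost-to-go, the other produces the control law, both are trained by gradient descent, and short-term historical trajectories are replayed to stabilize learning. The minimal Python sketch below illustrates one plausible reading of that loop, using an action-dependent critic so the scheme stays model-free; the network sizes, the quadratic utility, the discount factor, and all other hyperparameters are assumptions, since the abstract does not give the paper's actual design.

import random
from collections import deque

import torch
import torch.nn as nn

GAMMA = 0.95                  # discount factor (assumed)
STATE_DIM, ACT_DIM = 3, 1     # illustrative dimensions, not from the paper
BATCH = 32

def mlp(n_in, n_out):
    # Small two-layer network; the paper's actual architecture is not given.
    return nn.Sequential(nn.Linear(n_in, 64), nn.Tanh(), nn.Linear(64, n_out))

critic = mlp(STATE_DIM + ACT_DIM, 1)   # action-dependent cost-to-go estimate Q(x, u)
actor = mlp(STATE_DIM, ACT_DIM)        # control law u = pi(x)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
buffer = deque(maxlen=500)             # short-term trajectory memory (assumed size)

def utility(x, u):
    # Quadratic tracking cost U = x'Qx + u'Ru with Q = I, R = 0.1*I (assumed).
    return (x ** 2).sum(1, keepdim=True) + 0.1 * (u ** 2).sum(1, keepdim=True)

def train_step():
    if len(buffer) < BATCH:
        return
    x, u, x_next = map(torch.stack, zip(*random.sample(buffer, BATCH)))
    # Critic: shrink the Bellman residual Q(x,u) - [U(x,u) + gamma * Q(x', pi(x'))].
    with torch.no_grad():
        target = utility(x, u) + GAMMA * critic(torch.cat([x_next, actor(x_next)], 1))
    loss_c = ((critic(torch.cat([x, u], 1)) - target) ** 2).mean()
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    # Actor: gradient descent through the critic on the cost of its own actions.
    loss_a = critic(torch.cat([x, actor(x)], 1)).mean()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

# At each sampling instant: apply the actor's control to the plant, store the
# transition, and replay short-term history, e.g.
#   buffer.append((x, u.detach(), x_next)); train_step()

An action-dependent critic is assumed here because plain HDP needs the plant model (or a learned model network) to pass gradients from the critic back to the actor; the paper's exact mechanism may differ.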

Acknowledgments

This research was funded by the National Key Research and Development Program of China under grants No. 2019YFC0605300 and No. 2016YFB0700500, the National Natural Science Foundation of China under grants No. 61572075, No. 61702036 and No. 61873299, and the Key Research Plan of Hainan Province under grant No. ZDYF2019009.

Author information

Corresponding author

Correspondence to Xiaojuan Ban.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, J., Yuan, Z., Ban, X. (2020). An Improved Reinforcement Learning Based Heuristic Dynamic Programming Algorithm for Model-Free Optimal Control. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol. 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_23

  • DOI: https://doi.org/10.1007/978-3-030-61616-8_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61615-1

  • Online ISBN: 978-3-030-61616-8

  • eBook Packages: Computer Science, Computer Science (R0)
