Abstract
To address the autonomous learning problem of the two-wheeled self-balancing robot, a novel adaptive tropism reward ADHDP method with a robustness property is proposed, which obtains tropism reward information adaptively online. The learning system comprises three networks: an action neural network (ANN), an adaptive tropism reward neural network (ATRNN), and a critic neural network (CNN). The design of the ATRNN borrows from the learning mechanism of the actor-critic architecture: from the primary binary reward signal, a continuous secondary reward signal is obtained adaptively and serves as the basis for the critic network's learning. Simulations on the two-wheeled self-balancing robot show that the proposed learning mechanism is effective, exhibits better progressive learning behavior, and ultimately attains optimal learning performance. Statistical comparison experiments further indicate that the proposed method has a certain anti-noise ability and better robust learning performance.
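The three-network layout described in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the network sizes, learning rates, toy dynamics, and update rules are assumptions, not the authors' exact design, and the ATRNN's own training rule (which the paper models on actor-critic learning) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(n_in, n_hidden, n_out):
    # One-hidden-layer network with tanh hidden units.
    return {"W1": rng.normal(0.0, 0.3, (n_hidden, n_in)),
            "W2": rng.normal(0.0, 0.3, (n_out, n_hidden))}

def mlp_forward(net, x):
    h = np.tanh(net["W1"] @ x)
    return net["W2"] @ h, h

def mlp_update(net, x, grad_out, lr=0.05):
    # One gradient step moving the output along -grad_out.
    _, h = mlp_forward(net, x)
    dh = (net["W2"].T @ grad_out) * (1.0 - h ** 2)
    net["W2"] -= lr * np.outer(grad_out, h)
    net["W1"] -= lr * np.outer(dh, x)

def critic_grad_action(net, x, n_state):
    # dJ/du: backpropagate the critic's output to its action inputs,
    # the usual ADHDP training signal for the action network.
    h = np.tanh(net["W1"] @ x)
    dJ_dx = net["W1"].T @ (net["W2"].ravel() * (1.0 - h ** 2))
    return dJ_dx[n_state:]

n_state = 4                            # e.g. tilt, tilt rate, position, speed
actor   = mlp_init(n_state, 6, 1)      # ANN: state -> control
tropism = mlp_init(n_state + 1, 6, 1)  # ATRNN: (state, binary reward) -> shaped reward
critic  = mlp_init(n_state + 1, 6, 1)  # CNN: (state, action) -> cost-to-go J

gamma = 0.95
state = rng.normal(0.0, 0.1, n_state)
prev_J, prev_input = None, None
for t in range(50):
    u, _ = mlp_forward(actor, state)
    # Toy stand-in for the robot dynamics (not the paper's model).
    next_state = np.clip(state + 0.02 * rng.normal(0.0, 1.0, n_state)
                         - 0.05 * u[0], -1.0, 1.0)
    r_binary = -1.0 if abs(next_state[0]) > 0.5 else 0.0  # primary binary reward
    # ATRNN maps the sparse binary signal to a continuous secondary reward.
    r_shaped, _ = mlp_forward(tropism, np.append(state, r_binary))
    x_c = np.append(state, u)
    J, _ = mlp_forward(critic, x_c)
    if prev_J is not None:
        td_error = prev_J - (r_shaped + gamma * J)  # Bellman residual
        mlp_update(critic, prev_input, td_error)
    # Action network descends the critic's cost estimate w.r.t. the control.
    mlp_update(actor, state, critic_grad_action(critic, x_c, n_state))
    prev_J, prev_input = J, x_c
    state = next_state
```

The key structural point the sketch illustrates is the flow of reward information: the critic is trained not on the raw binary failure signal but on the continuous signal produced by the tropism-reward network, which is what distinguishes the three-network design from standard two-network ADHDP.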
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Chen, J., Li, Z. (2013). A Novel Adaptive Tropism Reward ADHDP Method with Robust Property. In: Liu, D., Alippi, C., Zhao, D., Hussain, A. (eds.) Advances in Brain Inspired Cognitive Systems. BICS 2013. Lecture Notes in Computer Science, vol. 7888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38786-9_33
Print ISBN: 978-3-642-38785-2
Online ISBN: 978-3-642-38786-9