
A Reinforcement Learning Method with Implicit Critics from a Bystander

  • Conference paper
Advances in Neural Networks - ISNN 2017 (ISNN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)


Abstract

In reinforcement learning, an agent is trained over many trials so that it accumulates experience and can eventually carry out the behaviors required by different tasks. In this paper, we propose an architecture that allows an agent to gain experience from its environment. We adopt the Adaptive Heuristic Critic (AHC) as the learning architecture and combine it with an action bias to handle continuous action systems. To cope with recognition errors and state delays, we use reinforcement learning, which learns from cumulative reward, to update the agent's experience.
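The architecture the abstract describes — an actor-critic (AHC) in which the actor maintains a continuous action bias and the critic supplies a temporal-difference signal computed from cumulative reward — can be sketched roughly as follows. The toy task, the state discretisation, and every hyperparameter below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hedged sketch of an AHC-style actor-critic with a Gaussian action bias
# for a continuous action system. The task and all constants are invented
# for illustration only.
rng = np.random.default_rng(0)

N_STATES = 10     # discretised states of a toy cyclic task (assumption)
GAMMA = 0.9       # discount factor
ALPHA_V = 0.1     # critic learning rate
ALPHA_MU = 0.05   # actor (action-bias) learning rate
SIGMA = 0.5       # exploration noise around the action bias

V = np.zeros(N_STATES)    # critic: state-value estimates
mu = np.zeros(N_STATES)   # actor: mean (bias) of the continuous action

def step(state, action):
    """Toy dynamics: reward is highest when the action matches state/10."""
    target = state / N_STATES
    reward = -(action - target) ** 2
    next_state = (state + 1) % N_STATES
    return next_state, reward

for episode in range(200):
    s = 0
    for t in range(N_STATES):
        # Actor: sample a continuous action around the learned bias.
        a = mu[s] + SIGMA * rng.normal()
        s_next, r = step(s, a)
        # Critic: TD error from the discounted cumulative-reward signal.
        delta = r + GAMMA * V[s_next] - V[s]
        V[s] += ALPHA_V * delta
        # Actor: shift the bias toward actions that beat the expectation.
        mu[s] += ALPHA_MU * delta * (a - mu[s])
        s = s_next
```

After training, each state's action bias `mu[s]` drifts toward that state's best action, while the critic's value estimates serve as the baseline that tells the actor which sampled actions were better than expected.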



Author information


Corresponding author

Correspondence to Wei-Cheng Jiang.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hwang, K.S., Hsieh, C.W., Jiang, W.C., Lin, J.L. (2017). A Reinforcement Learning Method with Implicit Critics from a Bystander. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science, vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_43


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59071-4

  • Online ISBN: 978-3-319-59072-1

  • eBook Packages: Computer Science, Computer Science (R0)
