Abstract
In reinforcement learning, an agent is trained over many trials so that it accumulates experience and can eventually perform the behaviors required by different tasks. In this paper, we propose an architecture that allows an agent to acquire experience from its environment. We adopt the Adaptive Heuristic Critic (AHC) as the learning architecture and combine it with an action bias to handle continuous action spaces. To cope with recognition errors and state delays, we use reinforcement learning, which learns from cumulative reward, to update the agent's experience.
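To make the abstract's architecture concrete, below is a minimal sketch of an AHC-style actor-critic with a Gaussian action bias for a continuous action space, assuming linear function approximation over state features. All class and parameter names here are illustrative assumptions, not the authors' implementation: the critic learns a state value by TD(0), and the actor's output is perturbed by a sampled bias whose direction is reinforced in proportion to the TD error.

```python
import numpy as np

# Illustrative AHC (actor-critic) sketch with a Gaussian "action bias".
# Assumes linear function approximation: V(s) ~ w_v . phi(s), mu(s) ~ w_a . phi(s).
# Not the paper's implementation; hyperparameters are placeholders.

class AHCAgent:
    def __init__(self, n_features, alpha_v=0.1, alpha_a=0.01,
                 gamma=0.95, sigma=0.5):
        self.w_v = np.zeros(n_features)   # critic weights
        self.w_a = np.zeros(n_features)   # actor weights
        self.alpha_v, self.alpha_a = alpha_v, alpha_a
        self.gamma, self.sigma = gamma, sigma

    def act(self, phi):
        mu = self.w_a @ phi
        bias = np.random.normal(0.0, self.sigma)  # Gaussian action bias
        return mu + bias, bias                    # continuous action

    def update(self, phi, bias, reward, phi_next, done):
        v, v_next = self.w_v @ phi, self.w_v @ phi_next
        td_error = reward + (0.0 if done else self.gamma * v_next) - v
        self.w_v += self.alpha_v * td_error * phi         # critic: TD(0)
        # Actor: reinforce the explored perturbation when td_error > 0.
        self.w_a += self.alpha_a * td_error * bias * phi
        return td_error

# Hypothetical usage on a 4-feature state representation:
agent = AHCAgent(n_features=4)
phi = np.array([1.0, 0.0, 0.5, 0.2])
action, bias = agent.act(phi)
agent.update(phi, bias, reward=1.0, phi_next=phi, done=False)
```

Because the TD error is computed from cumulative (discounted) reward rather than from a single immediate outcome, this update scheme tolerates the recognition errors and state delays the abstract mentions: a noisy step is averaged out over the return rather than directly dictating the action update.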
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Hwang, K.-S., Hsieh, C.-W., Jiang, W.-C., Lin, J.-L. (2017). A Reinforcement Learning Method with Implicit Critics from a Bystander. In: Cong, F., Leung, A., Wei, Q. (eds.) Advances in Neural Networks - ISNN 2017. Lecture Notes in Computer Science, vol. 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_43
DOI: https://doi.org/10.1007/978-3-319-59072-1_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59071-4
Online ISBN: 978-3-319-59072-1
eBook Packages: Computer Science, Computer Science (R0)