
A Reinforcement Learning Method with Implicit Critics from a Bystander

  • Conference paper
Advances in Neural Networks - ISNN 2017 (ISNN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)


Abstract

In reinforcement learning, an agent is trained over many trials so that it accumulates experience and can eventually carry out the behaviors required by different tasks. In this paper, we propose an architecture that allows an agent to gain experience from its environment. We adopt the Adaptive Heuristic Critic (AHC) as the learning architecture and combine it with an action bias to handle continuous action systems. To cope with recognition errors and state delays, we use reinforcement learning, which learns from cumulative reward, to update the agent's experience.
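The architecture the abstract describes — an actor-critic (AHC) in which the actor maintains a continuous action bias and the critic supplies a temporal-difference signal computed from cumulative reward — can be sketched roughly as follows. The toy task, the state discretisation, and every hyperparameter below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hedged sketch of an AHC-style actor-critic with a Gaussian action bias
# for a continuous action system. The task and all constants are invented
# for illustration only.
rng = np.random.default_rng(0)

N_STATES = 10     # discretised states of a toy cyclic task (assumption)
GAMMA = 0.9       # discount factor
ALPHA_V = 0.1     # critic learning rate
ALPHA_MU = 0.05   # actor (action-bias) learning rate
SIGMA = 0.5       # exploration noise around the action bias

V = np.zeros(N_STATES)    # critic: state-value estimates
mu = np.zeros(N_STATES)   # actor: mean (bias) of the continuous action

def step(state, action):
    """Toy dynamics: reward is highest when the action matches state/10."""
    target = state / N_STATES
    reward = -(action - target) ** 2
    next_state = (state + 1) % N_STATES
    return next_state, reward

for episode in range(200):
    s = 0
    for t in range(N_STATES):
        # Actor: sample a continuous action around the learned bias.
        a = mu[s] + SIGMA * rng.normal()
        s_next, r = step(s, a)
        # Critic: TD error from the discounted cumulative-reward signal.
        delta = r + GAMMA * V[s_next] - V[s]
        V[s] += ALPHA_V * delta
        # Actor: shift the bias toward actions that beat the expectation.
        mu[s] += ALPHA_MU * delta * (a - mu[s])
        s = s_next
```

After training, each state's action bias `mu[s]` drifts toward that state's best action, while the critic's value estimates serve as the baseline that tells the actor which sampled actions were better than expected.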



Author information


Corresponding author

Correspondence to Wei-Cheng Jiang.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hwang, K.S., Hsieh, C.W., Jiang, W.C., Lin, J.L. (2017). A Reinforcement Learning Method with Implicit Critics from a Bystander. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science, vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_43


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59071-4

  • Online ISBN: 978-3-319-59072-1

  • eBook Packages: Computer Science, Computer Science (R0)
