Swarm Reinforcement Learning Method Based on an Actor-Critic Method

Iima, Hitoshi; Kuroe, Yasuaki

doi:10.1007/978-3-642-17298-4_29

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6457)

Included in the following conference series:

Asia-Pacific Conference on Simulated Evolution and Learning

Abstract

We recently proposed swarm reinforcement learning methods in which multiple agents are prepared and learn not only individually but also by exchanging information with one another. So far, the methods have been applied to a problem in a discrete state-action space, with Q-learning used as the individual learning method. Although many reinforcement learning studies address problems in discrete state-action spaces, most real-world tasks require a continuous state-action space. This paper proposes a swarm reinforcement learning method based on an actor-critic method in order to acquire optimal policies rapidly for problems in continuous state-action spaces. The proposed method is applied to an inverted pendulum control problem, and its performance is examined through numerical experiments.
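
The general idea in the abstract (several actor-critic learners that also share information) can be sketched roughly as follows. This is not the authors' algorithm, whose details are not given here. The toy 1-D task (instead of the inverted pendulum), the linear features, the Gaussian policy, and the exchange rule that blends each agent's parameters toward the best-performing agent are all assumptions chosen only for illustration, and every name in the code is hypothetical.

```python
import math
import random


def actor_features(x):
    return [1.0, x]


def critic_features(x):
    return [1.0, x, x * x]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def clip(v, lo=-1.0, hi=1.0):
    return max(lo, min(hi, v))


class ActorCriticAgent:
    """Linear actor-critic with a Gaussian policy (illustrative assumption)."""

    def __init__(self, rng, sigma=0.3, alpha_actor=0.01, alpha_critic=0.05, gamma=0.95):
        self.rng = rng
        self.sigma = sigma
        self.alpha_actor = alpha_actor
        self.alpha_critic = alpha_critic
        self.gamma = gamma
        self.theta = [rng.uniform(-0.1, 0.1) for _ in range(2)]  # actor weights
        self.w = [0.0, 0.0, 0.0]                                 # critic weights

    def mean(self, x):
        return dot(self.theta, actor_features(x))

    def value(self, x):
        return dot(self.w, critic_features(x))

    def act(self, x):
        # Continuous action sampled around the actor's mean.
        return self.rng.gauss(self.mean(x), self.sigma)

    def update(self, x, a, r, x_next, done):
        # TD error from the critic drives both the critic and the actor update.
        bootstrap = 0.0 if done else self.gamma * self.value(x_next)
        delta = r + bootstrap - self.value(x)
        score = (a - self.mean(x)) / self.sigma ** 2  # d log pi / d mean
        self.w = [wi + self.alpha_critic * delta * f
                  for wi, f in zip(self.w, critic_features(x))]
        self.theta = [ti + self.alpha_actor * delta * score * f
                      for ti, f in zip(self.theta, actor_features(x))]


def run_episode(agent, rng, steps=50):
    """Toy task (assumption): push a point on [-1, 1] toward the origin."""
    x = rng.uniform(-1.0, 1.0)
    total = 0.0
    for t in range(steps):
        a = agent.act(x)
        x_next = clip(x + 0.1 * clip(a))
        r = -x_next ** 2
        agent.update(x, a, r, x_next, done=(t == steps - 1))
        total += r
        x = x_next
    return total


def exchange(agents, returns, rate=0.5):
    """Hypothetical exchange rule: blend parameters toward the best agent's."""
    best = agents[max(range(len(agents)), key=lambda i: returns[i])]
    best_theta, best_w = list(best.theta), list(best.w)
    for agent in agents:
        agent.theta = [(1.0 - rate) * p + rate * b for p, b in zip(agent.theta, best_theta)]
        agent.w = [(1.0 - rate) * p + rate * b for p, b in zip(agent.w, best_w)]


def train(n_agents=4, episodes=20, seed=0):
    rng = random.Random(seed)
    agents = [ActorCriticAgent(rng) for _ in range(n_agents)]
    returns = []
    for _ in range(episodes):
        returns = [run_episode(agent, rng) for agent in agents]  # individual learning
        exchange(agents, returns)                                # information exchange
    return agents, returns
```

Calling `train()` returns the agents and their returns from the last episode. The exchange step here is deliberately crude; a particle-swarm-style update would be another plausible choice under the same assumptions.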

Author information

Authors and Affiliations

Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto, Japan
Hitoshi Iima & Yasuaki Kuroe

Editor information

Editors and Affiliations

Department of Mechanical Engineering, Indian Institute of Technology Kanpur, 208016, Kanpur, Uttar Pradesh, India
Kalyanmoy Deb
Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, 208016, Kanpur, Uttar Pradesh, India
Arnab Bhattacharya
Department of Metallurgical and Materials Engineering, Indian Institute of Technology Kharagpur, 712302, Kharagpur, West Bengal, India
Nirupam Chakraborti
Department of Civil Engineering, Indian Institute of Technology Kanpur, 208016, Kanpur, Uttar Pradesh, India
Partha Chakroborty & Ashu Jain
Department of Electronics and Communication Engineering, Jadavpur University, 700032, Kolkata, West Bengal, India
Swagatam Das
Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, 208016, Kanpur, Uttar Pradesh, India
Joydeep Dutta
Department of Chemical Engineering, Indian Institute of Technology Kanpur, 208016, Kanpur, Uttar Pradesh, India
Santosh K. Gupta
Aspiring Minds, 24 Pusa Road, 110005, New Delhi, India
Varun Aggarwal
Warwick Business School, University of Warwick, CV4 7AL, Coventry, UK
Jürgen Branke
Department of Computer Science and Engineering, University of Nevada, 89557, Reno, NV, USA
Sushil J. Louis
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117576, Singapore
Kay Chen Tan

About this paper

Cite this paper

Iima, H., Kuroe, Y. (2010). Swarm Reinforcement Learning Method Based on an Actor-Critic Method. In: Deb, K., et al. Simulated Evolution and Learning. SEAL 2010. Lecture Notes in Computer Science, vol 6457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17298-4_29

DOI: https://doi.org/10.1007/978-3-642-17298-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17297-7
Online ISBN: 978-3-642-17298-4
eBook Packages: Computer Science, Computer Science (R0)
