Principled Methods for Biasing Reinforcement Learning Agents

Li, Zhi; Hu, Kun; Liu, Zengrong; Yu, Xueli

doi:10.1007/978-3-642-23887-1_89

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7003)

Included in the following conference series:

International Conference on Artificial Intelligence and Computational Intelligence

Abstract

Reinforcement learning (RL) is a powerful technique for learning in domains that provide only evaluative feedback rather than instructive feedback, and its use is expanding rapidly in industry and research. One of the main limitations of RL is slow convergence. Several methods have therefore been proposed to speed up RL by incorporating prior knowledge, or bias, into the learner. In this paper, we present a new method for incorporating bias into RL. The method extends the approach of choosing initial Q-values proposed by Hailu and Sommer by introducing a learning mechanism into the agent. This allows much more specific information to guide the agent's choice of action and also helps to reduce the state search space. As a result, it improves learning performance and greatly speeds up the convergence of the learning process.
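
The general idea of biasing an agent through its initial Q-values can be illustrated with a small sketch. It is not the method of this paper, whose details the abstract does not give, and it is not Hailu and Sommer's exact formulation either. The corridor environment, the function names, and the parameters `alpha`, `gamma`, `epsilon`, and `bias` are all assumptions chosen for illustration. The sketch runs tabular Q-learning on a one-dimensional corridor twice, once from an all-zero table and once from a table that encodes the prior "the goal lies to the right".

```python
import random

LEFT, RIGHT = 0, 1


def zero_q_table(n_states):
    """Uninformed baseline: every Q-value starts at zero."""
    return [[0.0, 0.0] for _ in range(n_states)]


def biased_q_table(n_states, gamma=0.9, bias=0.5):
    """Hypothetical prior knowledge: 'the goal lies to the right'.

    Q(s, RIGHT) starts at a positive guess that grows toward the goal;
    Q(s, LEFT) and the goal state's row stay at zero.
    """
    goal = n_states - 1
    q = zero_q_table(n_states)
    for s in range(goal):
        q[s][RIGHT] = bias * gamma ** (goal - s - 1)
    return q


def greedy_action(q_row, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def run_q_learning(q, episodes=20, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0, max_steps=10_000):
    """Tabular Q-learning on a corridor; returns steps taken per episode.

    The agent starts in state 0 and the goal is the last state. The only
    reward is 1.0 on entering the goal. `q` is updated in place.
    """
    rng = random.Random(seed)
    goal = len(q) - 1
    steps_per_episode = []
    for _ in range(episodes):
        s, steps = 0, 0
        while s != goal and steps < max_steps:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice([LEFT, RIGHT])
            else:
                a = greedy_action(q[s], rng)
            # Moving left from state 0 leaves the agent in state 0.
            s_next = max(s - 1, 0) if a == LEFT else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # No bootstrapping from the terminal goal state.
            bootstrap = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (reward + bootstrap - q[s][a])
            s, steps = s_next, steps + 1
        steps_per_episode.append(steps)
    return steps_per_episode


if __name__ == "__main__":
    n = 10
    print("zero init  :", run_q_learning(zero_q_table(n)))
    print("biased init:", run_q_learning(biased_q_table(n)))
```

With a purely greedy policy (`epsilon=0.0`), the biased agent walks straight to the goal from the first episode, because every initial Q(s, RIGHT) is positive and stays positive under the update. The zero-initialized agent has nothing but ties to choose from until it first reaches the goal, so its first episode is a random walk. How much a real task benefits depends on the quality of the prior, which this sketch does not address.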

References

Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge (1998)
Gabriel, M., Moore, J.W. (eds.): Learning and Computational Neuroscience. MIT Press, Cambridge; Mam, Y.: The Technical Writer’s Handbook. University Science, Mill Valley (1989)
Barto, A.G., Sutton, R.S., Watkins, C.J.C.H.: Learning and sequential decision making. In: Gabriel, M., Moore, J.W. (eds.) Learning and Computational Neuroscience. The MIT Press, Cambridge (1990)
Hailu, G., Sommer, G.: Embedding knowledge in reinforcement learning. In: International Conference on Artificial Neural Network (ICANN), Sweden, pp. 1133–1138 (1998)
Malak, R.J., Khosla, P.K.: A framework for the adaptive transfer of robot skill knowledge among reinforcement learning agents. In: IEEE International Conference on Robotics and Automation (2001)
Wiewiora, E., Cottrell, G., Elkan, C.: Principled Methods for Advising Reinforcement Learning Agents. In: Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), Washington DC (2003)
Perkins, T., Barto, A.: Lyapunov design for safe reinforcement learning control. In: Machine Learning, Proceedings of the Sixteenth International Conference. Morgan Kaufmann, San Francisco (2001)
Hailu, G., Sommer, G.: On Amount and Quality of Bias in Reinforcement Learning. In: IEEE International Conference on Systems, Man and Cybernetics (IEEE SMC 1999), Tokyo, Japan, pp. 1491–1495 (1999)
Watkins, C.: Learning from delayed rewards. Ph.D. dissertation. Cambridge University, Cambridge, England (1989)
Watkins, C., Dayan, P.: Technical note: Q-learning. Machine Learning 8, 279–292 (1992)

Author information

Authors and Affiliations

College of Computer Science and Technology, Taiyuan University of Technology, 79 Yingze West Street, Taiyuan City, Shanxi Province, China
Zhi Li, Kun Hu, Zengrong Liu & Xueli Yu

Editor information

Editors and Affiliations

School of Business Information Technology, RMIT University, City Campus, 124 La Trobe Street, 3000, Melbourne, VIC, Australia
Hepu Deng
School of Electronics and Information, Tongji University, 201804, Shanghai, China
Duoqian Miao
School of Computer and Information Engineering, Shanghai University of Electric Power, 200090, Shanghai, China
Jingsheng Lei
Department of Business Administration, Caritas Institute of Higher Education, 18 Chui Ling Road, Tseung Kwan O, Hong Kong, China
Fu Lee Wang

Cite this paper

Li, Z., Hu, K., Liu, Z., Yu, X. (2011). Principled Methods for Biasing Reinforcement Learning Agents. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science, vol 7003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23887-1_89

DOI: https://doi.org/10.1007/978-3-642-23887-1_89
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23886-4
Online ISBN: 978-3-642-23887-1
eBook Packages: Computer Science, Computer Science (R0)