Modified reward function on abstract features in inverse reinforcement learning

Chen, Shen-yi; Qian, Hui; Fan, Jia; Jin, Zhuo-jun; Zhu, Miao-liang

doi:10.1631/jzus.C0910486

Published: 02 August 2010

Volume 11, pages 718–723 (2010)

Journal of Zhejiang University SCIENCE C

Shen-yi Chen¹, Hui Qian¹, Jia Fan¹, Zhuo-jun Jin¹ & Miao-liang Zhu¹

Abstract

We improve inverse reinforcement learning (IRL) by applying dimension reduction methods to automatically extract abstract features from human-demonstrated policies, which handles cases where the features are either unknown or numerous. The importance rating of each abstract feature is incorporated into the reward function. Simulation is performed on the task of driving on a five-lane highway, where the controlled car has the highest fixed speed among all the cars. On average, performance is almost 10.6% better with importance ratings than without them.
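
The abstract does not say which dimension reduction method or which importance measure the authors use. As an illustration only, the sketch below assumes PCA (computed via SVD) as the dimension reduction step and the explained-variance ratio of each principal component as its importance rating. The function names, the synthetic demonstration data, and the placeholder weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def extract_abstract_features(raw_features, n_components):
    """PCA via SVD: return the mean, projection basis, and importance ratings."""
    mean = raw_features.mean(axis=0)
    centered = raw_features - mean
    # Rows of vt are principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    # Assumption: importance rating = share of total variance explained.
    importance = variance[:n_components] / variance.sum()
    return mean, vt[:n_components], importance


def reward(raw_state_features, mean, basis, importance, weights):
    """Linear reward on abstract features, scaled by importance ratings."""
    abstract = basis @ (raw_state_features - mean)
    return float(np.dot(importance * weights, abstract))


# Hypothetical demonstration data: 200 visited states, 6 raw features each.
rng = np.random.default_rng(0)
demos = rng.normal(size=(200, 6)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])

mean, basis, importance = extract_abstract_features(demos, n_components=3)
weights = np.ones(3)  # placeholder; an IRL algorithm would learn these
r = reward(demos[0], mean, basis, importance, weights)
```

In this sketch the importance ratings simply rescale the learned weights, so abstract features that capture little of the demonstrated behaviour contribute less to the reward. The paper may define and apply the ratings differently.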

Author information

Authors and Affiliations

School of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
Shen-yi Chen, Hui Qian, Jia Fan, Zhuo-jun Jin & Miao-liang Zhu

Corresponding author

Correspondence to Shen-yi Chen.

About this article

Cite this article

Chen, Sy., Qian, H., Fan, J. et al. Modified reward function on abstract features in inverse reinforcement learning. J. Zhejiang Univ. - Sci. C 11, 718–723 (2010). https://doi.org/10.1631/jzus.C0910486

Received: 07 August 2009
Revised: 19 October 2009
Published: 02 August 2010
Issue Date: September 2010
DOI: https://doi.org/10.1631/jzus.C0910486

CLC number

TP181
