Modified reward function on abstract features in inverse reinforcement learning

Published in: Journal of Zhejiang University SCIENCE C

Abstract

We improve inverse reinforcement learning (IRL) by applying dimension-reduction methods that automatically extract abstract features from human-demonstrated policies, to handle cases where the relevant features are either unknown or too numerous. An importance rating for each abstract feature is incorporated into the reward function. Simulations are performed on a five-lane highway driving task in which the controlled car has the highest fixed speed of all the cars. On average, performance with importance ratings is almost 10.6% better than without them.
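
To make the reward construction concrete, the sketch below assumes the common IRL setup in which the reward is linear in state features, uses PCA as one plausible dimension-reduction method for extracting abstract features from demonstrations, and scales each abstract feature's reward weight by its importance rating. The function names, the choice of PCA, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pca_projection(demo_features, k):
    """Fit a k-dimensional PCA projection to raw feature vectors observed
    in human demonstrations (PCA is an assumed stand-in for the paper's
    dimension-reduction step)."""
    X = demo_features - demo_features.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k]                        # (k, d) projection matrix

def modified_reward(phi, P, weights, importance):
    """Hypothetical modified reward: linear in the abstract features,
    with each learned weight scaled by that feature's importance rating."""
    z = P @ phi                          # abstract features, shape (k,)
    return float(np.dot(weights * importance, z))

# Toy usage: 200 demonstrated states with 12 raw features, reduced to 3.
rng = np.random.default_rng(0)
demos = rng.normal(size=(200, 12))
P = pca_projection(demos, k=3)
weights = rng.normal(size=3)             # stand-in for weights recovered by IRL
importance = np.array([0.5, 0.3, 0.2])   # illustrative importance ratings
print(modified_reward(demos[0], P, weights, importance))
```

In an actual IRL loop, `weights` would be estimated from the demonstrations rather than sampled at random; the importance ratings then bias the recovered reward toward the abstract features that matter most in the demonstrated behavior.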


Author information


Correspondence to Shen-yi Chen.


About this article

Cite this article

Chen, S.-y., Qian, H., Fan, J., et al. Modified reward function on abstract features in inverse reinforcement learning. J. Zhejiang Univ. - Sci. C 11, 718–723 (2010). https://doi.org/10.1631/jzus.C0910486

