Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning

Rungger, Matthias; Ding, Hao; Stursberg, Olaf

doi:10.1007/978-3-642-02565-5_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5499)

Included in the following conference series:

Workshop on Anticipatory Behavior in Adaptive Learning Systems

Abstract

To establish autonomous behavior in technical systems, the well-known trade-off between reactive control and deliberative planning must be addressed. In this paper, we combine both principles in a two-level hierarchical reinforcement learning scheme that enables the system to autonomously determine suitable solutions to new tasks. The approach is based on a behavior representation specified by hybrid automata, which combine continuous and discrete behavior, to predict (anticipate) the outcome of a sequence of actions. On the higher layer of the hierarchical scheme, the behavior is abstracted as finite state automata, on which value function iteration is performed to obtain a goal-leading sequence of subtasks. This sequence is realized on the lower layer by applying policy gradient-based reinforcement learning to the hybrid automaton model. Iterating between the two layers leads to consistent, goal-attaining behavior, as shown for a simple robot grasping task.
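
The higher-layer step described in the abstract, value function iteration on a finite state automaton, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the state names (`far`, `near`, `aligned`, `holding`), the subtask names, the deterministic transitions, and the uniform, undiscounted step cost are hypothetical and are not taken from the paper. The sketch covers only the abstract planning layer, not the policy gradient learning on the hybrid automaton.

```python
# Hypothetical finite state abstraction of a grasping task.
# States, subtasks, and transitions are illustrative assumptions only.
TRANSITIONS = {
    "far": {"approach": "near"},
    "near": {"align": "aligned", "retreat": "far"},
    "aligned": {"grasp": "holding", "retreat": "far"},
    "holding": {},  # goal state, no outgoing subtasks
}
GOAL = "holding"


def value_iteration(transitions, goal, step_cost=1.0, max_sweeps=1000):
    """Compute the cost-to-go of every abstract state by Bellman sweeps."""
    # The goal costs nothing; all other states start as "unreachable".
    values = {s: (0.0 if s == goal else float("inf")) for s in transitions}
    for _ in range(max_sweeps):
        changed = False
        for state, edges in transitions.items():
            if state == goal or not edges:
                continue
            # Bellman update for a deterministic automaton with uniform cost.
            best = min(step_cost + values[nxt] for nxt in edges.values())
            if best != values[state]:
                values[state] = best
                changed = True
        if not changed:  # fixed point reached
            break
    return values


def greedy_subtasks(transitions, values, start, goal):
    """Follow the cheapest successor from start to obtain a subtask sequence."""
    plan, state = [], start
    while state != goal:
        edges = transitions[state]
        if not edges or values[state] == float("inf"):
            raise ValueError(f"goal is not reachable from {state!r}")
        subtask = min(edges, key=lambda name: values[edges[name]])
        plan.append(subtask)
        state = edges[subtask]
    return plan


values = value_iteration(TRANSITIONS, GOAL)
plan = greedy_subtasks(TRANSITIONS, values, "far", GOAL)
print(values)
print(plan)
```

In a two-level scheme of the kind the abstract outlines, each subtask in such a sequence would be handed to the lower layer for realization. How feedback from the lower layer changes the abstract model is not specified in the abstract and is not modeled in this sketch.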

Author information

Authors and Affiliations

Institute of Automatic Control Engineering, Technische Universität München, D-80290, Munich, Germany
Matthias Rungger, Hao Ding & Olaf Stursberg

Editor information

Editors and Affiliations

Consiglio Nazionale delle Ricerche, Istituto di Linguistica Computazionale “Antonio Zampolli”, Via Giuseppe Moruzzi, 1 - 56124 Pisa, Italy and Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie della Cognizione, Via San Martino della Ba, Italy
Giovanni Pezzulo
COBOSLAB – Cognitive Bodyspaces: Learning and Behavior, Department of Psychology III, University of Würzburg, Röntgenring 11, 97070, Würzburg, Germany
Martin V. Butz
Institut des Systèmes Intelligents et de Robotique (CNRS UMR 7222), Université Pierre et Marie Curie, Pyramide Tour 55, 4 Place Jussieu, 75252, Paris Cedex 05, France
Olivier Sigaud
Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie della Cognizione, Laboratory of Computational Embodied Neuroscience, Via San Martino della Battaglia 44, 00185, Roma, Italy
Gianluca Baldassarre

Cite this paper

Rungger, M., Ding, H., Stursberg, O. (2009). Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning. In: Pezzulo, G., Butz, M.V., Sigaud, O., Baldassarre, G. (eds) Anticipatory Behavior in Adaptive Learning Systems. ABiALS 2008. Lecture Notes in Computer Science, vol 5499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02565-5_17

DOI: https://doi.org/10.1007/978-3-642-02565-5_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02564-8
Online ISBN: 978-3-642-02565-5
eBook Packages: Computer Science, Computer Science (R0)
