
Multistrategy Learning for Robot Behaviours

Chapter in: Advances in Machine Learning I

Part of the book series: Studies in Computational Intelligence (SCI, volume 262)

Abstract

Pure reinforcement learning does not scale well to domains with many degrees of freedom, and particularly not to continuous domains. In this chapter, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem, and a numerical optimisation algorithm then refines the qualitative plan into an operational policy. The method is demonstrated on the problem of learning a stable walking gait for a bipedal robot, a case study that illustrates the benefits of multistrategy learning for robot behaviours.
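The preview gives only the abstract, but the two-stage idea is concrete enough to sketch. Below is a minimal, hypothetical illustration in Python: a stub symbolic planner fixes the qualitative structure of a gait (an ordered sequence of phases with rough initial joint parameters), and a simple stochastic hill-climber stands in for the numerical optimisation stage, tuning only the continuous parameters. The phase names, joint parameters, and toy scoring function are all invented for this example; the chapter's actual planner, optimiser, and robot evaluation are not reproduced here.

```python
# Hypothetical sketch of the plan-then-refine scheme described in the
# abstract. Everything here (phase names, parameters, scoring) is invented
# for illustration; it is not the chapter's implementation.
import random

def symbolic_plan():
    """Stand-in for the symbolic planner: an ordered list of gait phases,
    each with rough initial values for its continuous parameters (radians)."""
    return [
        ("shift_weight", {"hip_roll": 0.10}),
        ("lift_leg",     {"knee_pitch": 0.30}),
        ("swing_leg",    {"hip_pitch": 0.25}),
        ("plant_leg",    {"ankle_pitch": 0.05}),
    ]

# Hidden "good" values that make this toy example runnable; in the chapter's
# setting the score would come from executing the gait on the robot or in a
# physics simulator and measuring stability and forward progress.
_TARGET = {"hip_roll": 0.12, "knee_pitch": 0.35,
           "hip_pitch": 0.28, "ankle_pitch": 0.07}

def evaluate(plan):
    """Toy trial execution: negative squared error from the hidden target,
    so higher is better."""
    return -sum((v - _TARGET[k]) ** 2
                for _, params in plan for k, v in params.items())

def refine(plan, iterations=200, step=0.02):
    """Stochastic hill-climbing over the continuous parameters only; the
    qualitative structure chosen by the planner is kept fixed."""
    best, best_score = plan, evaluate(plan)
    for _ in range(iterations):
        candidate = [(phase, {k: v + random.gauss(0.0, step)
                              for k, v in params.items()})
                     for phase, params in best]
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

if __name__ == "__main__":
    plan = symbolic_plan()
    tuned = refine(plan)
    print(f"initial score: {evaluate(plan):.5f}")
    print(f"tuned score:   {evaluate(tuned):.5f}")
```

The division of labour is the point: the planner commits to the discrete structure of the behaviour, so the optimiser searches only a low-dimensional continuous parameter space rather than the robot's full state-action space, which is why a hybrid of this kind can scale where pure reinforcement learning struggles.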





Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sammut, C., Yik, T.F. (2010). Multistrategy Learning for Robot Behaviours. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_23


  • DOI: https://doi.org/10.1007/978-3-642-05177-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05176-0

  • Online ISBN: 978-3-642-05177-7

  • eBook Packages: Engineering, Engineering (R0)
