Combining reinforcement learning and differential inverse kinematics for collision-free motion of multilink manipulators

  • Neural Networks for Communications, Control and Robotics
  • Conference paper
Biological and Artificial Computation: From Neuroscience to Technology (IWANN 1997)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1240))

Abstract

This paper presents a class of neural controllers that learn goal-oriented, obstacle-avoiding strategies for multilink manipulators. The controllers acquire these strategies on-line, through reinforcement learning from local sensory data. Each controller consists mainly of two neural modules: a reinforcement-based action generator and a module for differential inverse kinematics (DIV). The action generator produces actions with respect to a goal vector in the manipulator joint space, and the DIV module supplies suitable goal vectors. The DIV module works by inverting a neural network that has previously been trained to approximate the manipulator forward kinematics in polar coordinates. Results are shown for two- and three-link planar manipulators. The controllers reach good performance quickly and generalize well to new environments.
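The idea of obtaining joint-space goal vectors by inverting a forward model can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on these assumptions: the analytic kinematics of a two-link planar arm stands in for the trained network, central differences stand in for backpropagating the error to the network input, and the link lengths, learning rate, and function names are all hypothetical.

```python
import math

L1, L2 = 1.0, 0.8  # hypothetical link lengths

def forward_polar(q):
    """Stand-in for the trained forward model: joint angles -> (rho, phi)."""
    q1, q2 = q
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return math.hypot(x, y), math.atan2(y, x)

def wrap(angle):
    """Wrap an angle into [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def error(q, target):
    """Half squared error between the model output and the polar target."""
    rho, phi = forward_polar(q)
    return 0.5 * ((rho - target[0]) ** 2 + wrap(phi - target[1]) ** 2)

def invert(target, q0, lr=0.2, steps=2000, eps=1e-6):
    """Gradient descent on the model input q; the model itself stays fixed."""
    q = list(q0)
    for _ in range(steps):
        grad = []
        for i in range(len(q)):
            q_plus, q_minus = q[:], q[:]
            q_plus[i] += eps
            q_minus[i] -= eps
            # Central-difference estimate of d(error)/d(q_i).
            grad.append((error(q_plus, target) - error(q_minus, target)) / (2 * eps))
        q = [qi - lr * gi for qi, gi in zip(q, grad)]
    return q

if __name__ == "__main__":
    target = forward_polar([0.9, 1.2])   # a reachable end-effector position
    goal = invert(target, [0.3, 0.6])    # joint-space goal vector
    print("goal joint vector:", goal)
    print("reached (rho, phi):", forward_polar(goal), "target:", target)
```

In such a scheme the returned joint vector would play the role of the goal handed to the action generator. Which of the possible joint configurations the descent settles on depends on the starting point `q0`.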

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Martin, P., Millán, J.d.R. (1997). Combining reinforcement learning and differential inverse kinematics for collision-free motion of multilink manipulators. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032593

  • DOI: https://doi.org/10.1007/BFb0032593

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63047-0

  • Online ISBN: 978-3-540-69074-0

  • eBook Packages: Springer Book Archive
