Combining reinforcement learning and differential inverse kinematics for collision-free motion of multilink manipulators

Martin, Pedro; Millán, José del R.

doi:10.1007/BFb0032593


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1240)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

This paper presents a class of neural controllers that learn goal-oriented, obstacle-avoiding strategies for multilink manipulators. The controllers acquire these strategies on-line through reinforcement learning from local sensory data. Each controller consists mainly of two neural modules: a reinforcement-based action generator and a module for differential inverse kinematics (DIV). The action generator produces actions with regard to a goal vector in the manipulator joint space, and the DIV module supplies suitable goal vectors. The DIV module is based on the inversion of a neural network that has been previously trained to approximate the manipulator forward kinematics in polar coordinates. Results are shown for two- and three-link planar manipulators. The controllers reach good performance quickly and generalize well to new environments.
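
The idea of inverting a forward-kinematics model to obtain a joint-space goal can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-link planar arm with hypothetical unit link lengths, uses the analytic forward kinematics in polar coordinates as a stand-in for the trained neural network, and replaces backpropagation through that network with a finite-difference gradient. All function and parameter names are illustrative.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths, not taken from the paper


def forward_polar(q):
    """Forward kinematics of a two-link planar arm, returned as (r, phi).

    Stand-in for the trained forward-model network described in the abstract.
    """
    x = L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1])
    y = L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])
    return math.hypot(x, y), math.atan2(y, x)


def loss(q, target):
    """Half the squared error between the model output and the polar target."""
    r, phi = forward_polar(q)
    dr = r - target[0]
    # Wrap the angular error into (-pi, pi] so it stays continuous.
    dphi = math.atan2(math.sin(phi - target[1]), math.cos(phi - target[1]))
    return 0.5 * (dr * dr + dphi * dphi)


def invert_by_gradient_descent(target, q0, lr=0.2, steps=500, eps=1e-6):
    """Adjust the model INPUT (joint angles) by gradient descent on the loss.

    The gradient is estimated by forward finite differences; with a neural
    forward model it would come from backpropagation to the input layer.
    """
    q = list(q0)
    for _ in range(steps):
        base = loss(q, target)
        grad = []
        for i in range(len(q)):
            q_shift = list(q)
            q_shift[i] += eps
            grad.append((loss(q_shift, target) - base) / eps)
        q = [qi - lr * gi for qi, gi in zip(q, grad)]
    return q


# Find joint angles whose end-effector position matches (r, phi) = (1.2, 0.8).
q_goal = invert_by_gradient_descent(target=(1.2, 0.8), q0=(0.3, 1.0))
r_goal, phi_goal = forward_polar(q_goal)
```

Here `q_goal` plays the role of a goal vector in joint space. The result depends on the starting configuration `q0`, since a two-link arm generally has two joint solutions for the same end-effector position.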

Author information

Authors and Affiliations

Dept. of Computer Science, University of Jaume I, 12071, Castellón, Spain
Pedro Martin
Institute for Systems, Informatics and Safety, Joint Research Centre, European Commission, 21020, Ispra, VA, Italy
José del R. Millán

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

About this paper

Cite this paper

Martin, P., Millán, J.d.R. (1997). Combining reinforcement learning and differential inverse kinematics for collision-free motion of multilink manipulators. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032593

DOI: https://doi.org/10.1007/BFb0032593
Published: 18 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0
eBook Packages: Springer Book Archive
