Abstract
This paper presents a class of neural controllers that learn goal-oriented, obstacle-avoiding strategies for multilink manipulators. The controllers acquire these strategies on-line through reinforcement learning from local sensory data. Each controller consists mainly of two neural modules: a reinforcement-based action generator and a module for differential inverse kinematics (DIV). The action generator produces actions with respect to a goal vector in the manipulator joint space, and the DIV module supplies suitable goal vectors. The DIV module is based on inverting a neural network previously trained to approximate the manipulator forward kinematics in polar coordinates. Results are shown for two- and three-link planar manipulators. The controllers reach good performance quite rapidly and generalize well to new environments.
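To illustrate the idea of inverting a forward-kinematics model, the following is a minimal sketch and not the authors' implementation. It assumes a two-link planar arm with hypothetical unit link lengths. The analytic forward kinematics stands in for the trained network, and central finite differences stand in for backpropagating the error to the network inputs. The names `forward_polar`, `wrap`, `error`, and `invert`, as well as the learning rate and step count, are illustrative choices.

```python
import math

LINK1, LINK2 = 1.0, 1.0  # hypothetical link lengths (assumption)

def forward_polar(q):
    """Forward kinematics of a two-link planar arm as polar (r, phi).
    Stand-in for the trained forward-kinematics network."""
    q1, q2 = q
    x = LINK1 * math.cos(q1) + LINK2 * math.cos(q1 + q2)
    y = LINK1 * math.sin(q1) + LINK2 * math.sin(q1 + q2)
    return math.hypot(x, y), math.atan2(y, x)

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def error(q, goal):
    """Half squared error between the model output and the polar goal."""
    r, phi = forward_polar(q)
    return 0.5 * ((r - goal[0]) ** 2 + wrap(phi - goal[1]) ** 2)

def invert(q0, goal, lr=0.2, steps=2000, eps=1e-6):
    """Gradient descent on the joint angles with the model held fixed.
    The gradient is estimated by central finite differences."""
    q = list(q0)
    for _ in range(steps):
        grad = []
        for i in range(len(q)):
            q_plus, q_minus = q[:], q[:]
            q_plus[i] += eps
            q_minus[i] -= eps
            grad.append((error(q_plus, goal) - error(q_minus, goal)) / (2.0 * eps))
        q = [qi - lr * gi for qi, gi in zip(q, grad)]
    return q

# Usage: recover joint angles that reach a goal given in polar coordinates.
goal = forward_polar((0.9, 1.2))
q_goal = invert((0.5, 0.8), goal)
```

The descent adjusts only the inputs, so the returned joint vector is one configuration that reaches the goal from the given starting posture. It need not be the only one, since a planar arm generally admits more than one such configuration.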
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martin, P., Millán, J.d.R. (1997). Combining reinforcement learning and differential inverse kinematics for collision-free motion of multilink manipulators. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032593
DOI: https://doi.org/10.1007/BFb0032593
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0
eBook Packages: Springer Book Archive