Abstract
Conventional robot control schemes are fundamentally model-based. However, exact modeling of robot dynamics is difficult, and task execution is subject to various uncertainties. This paper proposes a reinforcement learning control approach to overcome these drawbacks. An artificial neural network (ANN) serves as the learning structure, and a stochastic real-valued (SRV) unit provides the learning method. Force tracking control of a two-link robot arm is first simulated to verify the control design. The simulation results confirm that, even without information about the robot dynamic model or environment states, operation rules for simultaneously controlling force and velocity can be acquired through repeated exploration. However, achieving acceptable performance has demanded many learning iterations, and the learning speed proved too slow for practical applications. The proposed approach therefore improves tracking performance by combining a conventional controller with the reinforcement learning strategy. Experimental results demonstrate improved trajectory tracking performance of a two-link direct-drive robot manipulator using the proposed method.
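To illustrate the SRV learning method the abstract refers to, the following is a minimal sketch of a stochastic real-valued unit in the spirit of Gullapalli (1990): the unit emits a real-valued action drawn from a Gaussian whose mean is learned and whose spread shrinks as predicted reinforcement improves. All parameter names, the learning rates, and the toy target-tracking task are illustrative assumptions, not the paper's actual controller or experimental setup.

```python
import numpy as np

class SRVUnit:
    """Sketch of a stochastic real-valued (SRV) unit: a Gaussian action
    with learned mean and reinforcement-dependent exploration width."""

    def __init__(self, n_inputs, lr=0.1, seed=0):
        self.w = np.zeros(n_inputs)   # weights for the action mean
        self.v = np.zeros(n_inputs)   # weights predicting reinforcement
        self.lr = lr
        self.rng = np.random.default_rng(seed)

    def act(self, x):
        mu = self.w @ x                      # mean action for this input
        r_hat = self.v @ x                   # predicted reinforcement
        sigma = max(1.0 - r_hat, 0.01)       # explore less as prediction improves
        a = mu + sigma * self.rng.normal()   # stochastic real-valued action
        return a, mu, sigma, r_hat

    def learn(self, x, a, mu, sigma, r_hat, r):
        # Shift the mean toward actions that beat the predicted reinforcement.
        self.w += self.lr * (r - r_hat) * ((a - mu) / sigma) * x
        # Improve the reinforcement predictor itself.
        self.v += self.lr * (r - r_hat) * x

# Toy usage: learn to output a target value of 0.5 for a constant input,
# where reinforcement rewards actions close to the target.
unit = SRVUnit(1)
x = np.array([1.0])
for _ in range(2000):
    a, mu, sigma, r_hat = unit.act(x)
    r = max(0.0, 1.0 - abs(a - 0.5))  # reinforcement: closer to 0.5 is better
    unit.learn(x, a, mu, sigma, r_hat, r)
```

The key property, and the reason no dynamic model is needed, is that the update uses only the scalar reinforcement signal and the unit's own exploratory deviation `(a - mu)`, not an error gradient through the plant.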
Cite this article
Song, KT., Sun, WY. Robot Control Optimization Using Reinforcement Learning. Journal of Intelligent and Robotic Systems 21, 221–238 (1998). https://doi.org/10.1023/A:1007904418265