Biological arm motion through reinforcement learning

Izawa, Jun; Kondo, Toshiyuki; Ito, Koji

doi:10.1007/s00422-004-0485-3

Published: 09 August 2004

Biological Cybernetics, Volume 91, pages 10–22 (2004)

Abstract.

This paper presents an optimal learning control method, based on reinforcement learning, for biological systems with redundant actuators. Reinforcement learning is difficult to apply to biological control systems because the muscle activation space is redundant. We address this problem as follows. First, we divide the control input space into two subspaces according to a learning priority order and restrict the search noise for reinforcement learning to the first-priority subspace. The constraint is then relaxed as learning progresses, so that the search space extends into the second-priority subspace. The higher-priority subspace is designed so that the impedance of the arm can be high. Through reinforcement learning, a smooth reaching motion is obtained without any prior knowledge of the arm's dynamics.
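
The prioritized search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the linear relaxation schedule, and the two-muscle example basis (a co-activation direction as the first-priority subspace) are all assumptions made for illustration; the abstract specifies neither the basis nor the schedule.

```python
import math
import random


def prioritized_noise(first_basis, second_basis, sigma, relax, rng):
    """Sample exploration noise for a redundant control input.

    first_basis / second_basis: lists of orthonormal vectors spanning the
    first- and second-priority subspaces (hypothetical inputs).
    relax = 0.0 confines the noise to the first-priority subspace;
    relax = 1.0 explores both subspaces with the same scale.
    """
    dim = len(first_basis[0])
    noise = [0.0] * dim
    for basis, scale in ((first_basis, sigma), (second_basis, sigma * relax)):
        for vec in basis:
            coeff = scale * rng.gauss(0.0, 1.0)  # Gaussian search noise
            for i in range(dim):
                noise[i] += coeff * vec[i]
    return noise


def relax_schedule(episode, n_episodes):
    """Assumed linear schedule: 0 at the start of learning, 1 at the end."""
    return min(1.0, max(0.0, episode / float(n_episodes)))


if __name__ == "__main__":
    # Illustrative two-muscle (flexor, extensor) example, not from the paper:
    # the first-priority direction changes both activations together.
    s = 1.0 / math.sqrt(2.0)
    first = [(s, s)]
    second = [(s, -s)]
    rng = random.Random(0)
    for episode in (0, 50, 100):
        relax = relax_schedule(episode, 100)
        print(episode, relax, prioritized_noise(first, second, 0.1, relax, rng))
```

With `relax` at zero, every noise sample lies along the first-priority direction; as `relax` grows, samples spread into the second-priority subspace, which mirrors the gradual widening of the search space that the abstract describes.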

Author information

Authors and Affiliations

Sensory and Motor Research Group, Human and Information Science Laboratory, NTT Communication Science Laboratories, 3-1 Morinosato-Wakamiya, Atsugi-shi, 243-01, Japan
Jun Izawa
Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Japan
Toshiyuki Kondo & Koji Ito

Corresponding author

Correspondence to Jun Izawa.

About this article

Cite this article

Izawa, J., Kondo, T. & Ito, K. Biological arm motion through reinforcement learning. Biol. Cybern. 91, 10–22 (2004). https://doi.org/10.1007/s00422-004-0485-3

Received: 11 December 2002
Accepted: 21 April 2004
Published: 09 August 2004
Issue Date: July 2004
DOI: https://doi.org/10.1007/s00422-004-0485-3
