
Neural Networks

Volume 19, Issue 4, May 2006, Pages 339-353

A hybrid generative and predictive model of the motor cortex

https://doi.org/10.1016/j.neunet.2005.10.004

Abstract

We describe a hybrid generative and predictive model of the motor cortex. The generative model is related to the hierarchically directed cortico-cortical (or thalamo-cortical) connections, and unsupervised training leads to a topographic and sparse hidden representation of its sensory and motor input. The predictive model is related to lateral intra-area and inter-area cortical connections, functions as a hetero-associative attractor network and is trained to predict the future state of the network. Given partial input, the generative model can map sensory input to motor actions and can thereby perform learnt action sequences of the agent within the environment. The predictive model can additionally predict a longer perception and action sequence (mental simulation). The models' performance is demonstrated on a visually guided robot docking manoeuvre. We propose that the motor cortex might take over functions previously learnt by reinforcement in the basal ganglia, and we relate this to mirror neurons and imitation.

Introduction

The prominent regions for motor skill learning are the basal ganglia (the largest part of which is known as the striatum), the frontoparietal cortices and the cerebellum. Doya (1999) proposed that reinforcement learning takes place in the basal ganglia, supervised learning in the cerebellum and unsupervised learning in the cortex. Hikosaka, Nakamura, Sakai, and Nakahara (2002) propose an integrated model in which cortical motor skill learning is optimised by a cortex–basal ganglia loop taking advantage of reinforcement learning and a cortex–cerebellum loop taking advantage of supervised learning. Neurons in the sensorimotor striatum have been observed to be active only during early learning phases of a T-maze procedural task (Jog, Kubota, Connolly, Hillegaart, & Graybiel, 1999). Recently, Pasupathy and Miller (2005) observed in monkeys learning an associative task that striatal (specifically, caudate nucleus) activation progressively anticipates prefrontal cortical activity, and that the cortical activity more closely parallels the improvement in the monkeys' behaviour. The lead of striatal over cortical activity, by as much as 140 ms, indicates that the cortex may be a candidate structure to receive training from the basal ganglia (see also Ravel & Richmond, 2005, and the discussion).

Previously we have solved a robotic task by reinforcement learning mimicking basal ganglia function (Weber, Wermter, & Zochios, 2004). In this paper this performance is copied by a generative model learnt unsupervised, as well as by a predictive model trained supervised on the data supplied by the generative model. The task is for a robot to approach a target in a pose suitable for grasping it (‘docking’), based on visual input and input conveying its heading angle. It corresponds, perhaps, to how humans and monkeys move their limbs to grasp an object. While we assume here that a module previously trained using reinforcements already performs the task, the generative and predictive motor cortex module shall learn it by observation.

The generative and the predictive models are identified with the hierarchically and the laterally arranged cortical connections, respectively, an architecture which parallels a combined model of V1 simple and complex cells (Weber, 2001). The docking task makes intensive use of visual input, which makes it well suited to the function of the cortex. The cerebellum also has access to visual information, but only rather indirectly (Robinson, Cohen, May, Sestokas, & Glickstein, 1984).

The traditional role assigned to the motor cortex is to control movement: if, e.g. neurons in area F5 of the motor cortex are stimulated, then a limb movement is executed. Many area F5 neurons, however, also fire in the absence of the animal's own motor action: if a monkey observes an action executed by another monkey or a human, then so-called mirror neurons fire action potentials (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). Also, action word representations are proposed to span language as well as motor areas in so-called ‘word webs’ (Pulvermüller, 2002). This leads to a distributed (population-) coding paradigm across several cortical areas, which are mostly topographically organised. Multi-modal connections of the motor areas to higher levels of the somato-sensory and visual cortex (Felleman & Van Essen, 1991) provide the necessary architecture.1 Since mirror neurons in motor cortical area F5 also have sensory response properties, it is plausible to introduce learning principles of sensory cortical areas to the motor cortex. These considerations provide the backbone of our developmental model of one area of the motor cortex.

Monkeys in captivity sometimes imitate behaviours they observe in people, e.g. sweeping with a broom (Vittorio Gallese, personal communication). It is tempting to explain imitation behaviour with a generative model, as has been successfully applied to the well-investigated lower visual area V1 of the cortex (Bell and Sejnowski, 1997, Olshausen and Field, 1997, Rao and Ballard, 1997, Weber, 2001). A sensory-motor generative model trained by error back-propagation has been proposed by Plaut and Kello (1998): it produces actions (speech) which lead, via the environment, to a sensory input distribution (sound) similar to that previously experienced. However, since imitation behaviour is performed only by more highly developed mammals, can we trace a generative model back to a more basic function of the motor cortex?

We go one step back and claim that the cortex reproduces experienced actions which are originally produced by other parts of the brain, rather than by other individuals. An action may originally be produced via phylogenetically older parts of the brain, e.g. the basal ganglia, possibly via learning mechanisms such as reinforcement learning. Even though reinforcement learning is a powerful paradigm, it might leave actions stored inefficiently in the basal ganglia; in computational models a high-dimensional state space limits applications (see Section 4). Once the cortex can take over, the basal ganglia become available to learn new tasks.

An unsupervised model of the cortex as proposed by Doya (1999) is appealingly simple because of its similarity to sensory cortex models, and because the direction of information flow (what is input and what is output) is not specified before learning. While a directed stimulus-to-response mapping, learnt in a supervised fashion, might be closer to optimal, the motor cortex alone does not produce optimal movements: damage to the cerebellum leads to lasting motor and movement difficulties (e.g. Sanes, Dimitrov, & Hallett, 1990).

Perception-action pairs might be input data to a generative model of the motor cortex, just as an image is the input to a generative model of the visual cortex. The cortical neurons represent this input as an internal activation pattern. This ‘hidden code’ can represent higher-order correlations within the input data and can trace the input back to meaningful, independent components. It constitutes an internal world model of the environment, since the input can be generated from that hidden code. Fig. 1(a) shows the model architecture (together with the predictive model, described next). The input x is the perception-action pair. It can be generated via the top-down weights Wtd from the hidden representation r. Perceptive, bottom-up weights Wbu are needed to obtain the hidden representation r from the data x. The weights are trained via an unsupervised learning scheme described below.
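The interplay of recognition, generation and reconstruction-driven learning can be sketched numerically. The sizes, rectifying nonlinearity and Hebbian-like learning rule below are illustrative assumptions, not the paper's actual parameterisation; the sketch only shows how Wbu yields a hidden code r from which Wtd regenerates the input x:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; the paper's network dimensions differ.
n_input, n_hidden = 12, 8

# Bottom-up (recognition) weights W_bu and top-down (generative) weights W_td
W_bu = rng.normal(scale=0.1, size=(n_hidden, n_input))
W_td = rng.normal(scale=0.1, size=(n_input, n_hidden))

def recognise(x):
    """Obtain the hidden representation r from input x via W_bu."""
    return np.maximum(0.0, W_bu @ x)   # rectification: a sparse, non-negative code

def generate(r):
    """Generate (reconstruct) the input from the hidden code via W_td."""
    return W_td @ r

def train_step(x, lr=0.05):
    """One local, Hebbian-like update that reduces the reconstruction error."""
    global W_bu, W_td
    r = recognise(x)
    err = x - generate(r)              # how badly the hidden code explains x
    W_td += lr * np.outer(err, r)      # improve generation of this input
    W_bu += lr * np.outer(r, err)      # adapt recognition symmetrically
    return float(np.mean(err ** 2))

x = rng.uniform(size=n_input)          # one perception-action input vector
errors = [train_step(x) for _ in range(200)]
```

After repeated presentations of the same input, generating from the recognised hidden code reproduces that input increasingly well, i.e. the reconstruction error shrinks.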

The generative model (consisting of Wbu and Wtd) can be used for input–output mappings, as we have done recently with a Kohonen network (Wermter & Elshaw, 2003). For this purpose its input x is logically split into a perceptive input y and a motor output m, which can be expressed as x=(y,m). If we now test the model using the ‘incomplete’ input (y,0), where there is perception but zero motor action, we can still find a hidden representation r using Wbu. Based on r, the model can generate a virtual input x=(y,m) using Wtd. This motor representation m is then the most appropriate action belonging to the perception y. Having learnt perception-action mappings, the model can automatically generate action sequences within an environment, because an action leads to a new perception, which itself triggers an appropriate action.
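A minimal sketch of this completion mechanism follows. Two hand-set perception-action prototypes stand in for learnt weights, and a softmax stands in for the model's actual competition on the hidden layer; both are illustrative assumptions:

```python
import numpy as np

# Two stored perception-action pairs x = (y, m), acting as learnt prototypes.
# In the model these arise from unsupervised training; here they are hand-set.
pairs = np.array([
    [1.0, 0.0, 0.0,  1.0, 0.0],   # perception A -> first action
    [0.0, 1.0, 0.0,  0.0, 1.0],   # perception B -> second action
])
W_td = pairs.T                    # top-down (generative) weights
W_bu = pairs                      # bottom-up (recognition) weights

def complete(y, n_perception=3):
    """Map perception y to an action by presenting the partial input (y, 0)."""
    x_partial = np.concatenate([y, np.zeros(pairs.shape[1] - n_perception)])
    a = W_bu @ x_partial
    r = np.exp(a) / np.exp(a).sum()   # soft competition on the hidden layer
    x_virtual = W_td @ r              # generate the full (y, m) pattern
    return x_virtual[n_perception:]   # return only the motor part m

m = complete(np.array([1.0, 0.0, 0.0]))   # present perception A, zero action
```

Presenting perception A with a zeroed motor part yields a generated motor pattern dominated by the action that was stored together with A.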

A predictive model generates its future input or its future state, unlike a generative model, which generates its current input. An advantage of a predictive model is that it can compensate for a delay in its input. Furthermore, since the consequences of an action are known without the action actually being performed, a predictive model allows for mental simulations of action sequences (Oztop, Wolpert, & Kawato, 2003). Prediction allows the observer to determine whether the actor is helpful, unhelpful or threatening (Gallese & Goldman, 1998) and offers the opportunity to react to an action sequence before it has ended (Demiris & Hayes, 2002).

The hierarchically directed weights Wbu and Wtd, which constitute the generative model, generate the incoming data that enter the model at the same time instant. The weights V of our predictive model, on the other hand, are directed primarily horizontally, or laterally. As can be seen in Fig. 1(a), these weights are recurrent connections within one neural sheet. Fig. 1(b), which depicts the model unfolded over two time steps, illustrates that V connects hidden representations at different time steps, making these hetero-associative weights. Note that a trained generative model is needed to obtain the hidden representations, which the predictive model uses for training and for action.
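A toy sketch of the hetero-associative weights V, trained with an illustrative delta rule on a short sequence of hidden codes (random vectors stand in for the codes that a trained generative model would supply; sizes and learning rate are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

n_hidden, T = 6, 5
# A short sequence r(0..T) of hidden representations; in the model these
# come from the trained generative model, here random codes stand in.
R = rng.uniform(size=(T + 1, n_hidden))

# Lateral, hetero-associative weights V, trained by a delta rule so that
# V maps each r(t) onto the next representation r(t+1).
V = np.zeros((n_hidden, n_hidden))
lr = 0.1
for _ in range(3000):
    for t in range(T):
        err = R[t + 1] - V @ R[t]      # one-step prediction error
        V += lr * np.outer(err, R[t])  # delta-rule weight update

# Mental simulation: roll the prediction forward without any new input.
r = R[0]
trajectory = [r]
for _ in range(T):
    r = V @ r
    trajectory.append(r)
```

Rolling V forward from r(0) without new input replays the learnt sequence of hidden states, which is the mechanism behind the mental simulation of action sequences.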

Our choice to separate feature extraction (the generative model) from prediction originates from modelling constraints in the visual cortex. Simple cell-like localised edge detectors in W are obtained by a training paradigm in which the same image has to be reconstructed from the hidden representation which gave rise to it, but not if, for example, a shifted version of that image is to be reconstructed. The prediction task was therefore separated out to lateral weights V, which predict the next hidden representation r(t+1) rather than directly the next input data point. At each relaxation time step, the input images had been moved slightly in a random direction. As a consequence, the lateral weights predicted a hidden code that was slightly shift invariant, yielding V1 complex cell properties (Weber, 2001).

Lateral weights V have previously been extended to connect different cortical areas, in order to associate ‘what’ and ‘where’ representations of an object (Weber & Wermter, 2003). Here we extend them further so that they are allowed to connect to all areas involved, including the input areas (as shown later in Fig. 4). They can then directly predict the future input (consisting of perception and action) and therefore anticipate a learnt motor action. This corresponds to the finding that long-range horizontal connections in the cortex (which originate as well as terminate in layers 2/3) are also found between areas of different hierarchical levels, even though they are strongest within one hierarchical level (Felleman & Van Essen, 1991).

Such a two-tier architecture of feature extracting cells with a layer designed to compute invariances on top resembles the Neocognitron of Fukushima et al. (1983), a biologically inspired model of visual pattern recognition in which several hierarchical levels are arranged successively. A more complex architecture is the Helmholtz machine through time of Hinton, Dayan, To, and Neal (1995a). Its recognition component (bottom-up connections) and generative component (top-down connections) are each complemented by additional lateral connections which are trained to predict a future state. While such lateral connections yielded shift invariance in Weber (2001), contrasting models achieve invariances by vertical connections (e.g. Riesenhuber & Poggio, 2002).

Taking the generative and the predictive model together, we propose such a two-tier architecture for a motor cortical area. As in the Helmholtz machine through time, we train the lateral connections to predict the future activations on the hidden layer. Since this is done using recurrent relaxations of activations, it introduces competition akin to that between the attractors of a continuous attractor network. In contrast to a maximum operator or a convolution with a fixed kernel, this is a soft competition involving trained, irregular weights. Having applied our model to the visual system, we here demonstrate its ability to perform sensation-driven motor control and mental action simulation in a real-world scenario.


Methods

In the following we briefly describe the task, and then a reinforcement-trained model which can solve it. Next, we introduce the model of the motor cortex and explain its components, the generative model and the predictive model, which are trained to copy the solution to the task from the reinforcement-trained model.

Anatomy of the generative model

Fig. 5(a–c) together show the recognition weights Wbu that result from training. They are made up of three components which receive the p, Φ and m components of the input, respectively. Adjacent neurons on the hidden area have similar receptive fields in any of the three input areas. This realises topographic mappings. Accordingly, there are regions on the hidden area which are specialised to receive input from similar regions in p-space, Φ-space and m-space. The central region in p

Discussion

In addition to generative models already explaining sensory cortical areas very well, we have demonstrated here that a generative model can also account for motor cortical function. It relies on a working module, trained by reinforcement, which resides outside of the cortex. A generative motor theory forms a bridge between a generative sensory theory and a reinforcement-based motor theory. It is important to have such a link, since sensory and motor representations are largely overlapping in the

Conclusion

We have set up a biologically realistic model of a motor cortical area which can take over and predict actions proposed to be learnt earlier via a reinforcement scheme in the basal ganglia. All network connections are trained, only local learning rules are used, and the resulting connectivity structure is topographic, with lateral predictive weights that are centre-excitatory and surround-inhibitory. This work bridges the wide gap between neuroscience and robotics and motivates the development of the

Acknowledgements

This work is part of the MirrorBot project supported by the EU in the FET-IST programme under grant IST-2001-35282. We thank Christo Panchev, Alexandros Zochios, Hauke Bartsch as well as the anonymous reviewers for useful contributions to the manuscript.

References (52)

  • G. Rizzolatti et al., The cortical motor system, Neuron (2001)
  • E. Sauser et al., Three dimensional frames of reference transformations using recurrent populations of neurons, Neurocomputing (2005)
  • S. Stringer et al., Self-organising continuous attractor networks and motor function, Neural Networks (2003)
  • M. Umilta et al., I know what you are doing: A neurophysiological study, Neuron (2001)
  • C. Weber et al., Robot docking with neural vision and reinforcement, Knowledge-Based Systems (2004)
  • F. Aboitiz et al., The evolutionary origin of the mammalian isocortex: Towards an integrated developmental and functional approach, The Behavioral and Brain Sciences (2003)
  • P. Blundell et al., Preserved sensitivity to outcome value after lesions of the basolateral amygdala, The Journal of Neuroscience (2003)
  • J. Boyan et al., Generalization in reinforcement learning: Safely approximating the value function
  • T. Brashers-Krug et al., Consolidation in human motor memory, Nature (1996)
  • P. Dayan et al., The Helmholtz machine, Neural Computation (1995)
  • Y. Demiris et al., Imitation as a dual-route process featuring prediction and learning components: A biologically-plausible computational model
  • S. Deneve et al., Reading population codes: A neural implementation of ideal observers, Nature Neuroscience (1999)
  • S. Deneve et al., Efficient computation and cue integration with noisy population codes, Nature Neuroscience (2001)
  • A. Dickinson, Actions and habits: The development of behavioural autonomy, Philosophical Transactions of the Royal Society of London, Series B (1985)
  • D. Felleman et al., Distributed hierarchical processing in the primate cerebral cortex, Cerebral Cortex (1991)
  • D. Foster et al., A model of hippocampally dependent navigation, using the temporal difference learning rule, Hippocampus (2000)