A hybrid generative and predictive model of the motor cortex
Introduction
The prominent brain regions involved in motor skill learning are the basal ganglia (the largest part of which is the striatum), the frontoparietal cortices and the cerebellum. Doya (1999) proposed that reinforcement learning is implemented in the basal ganglia, supervised learning in the cerebellum and unsupervised learning in the cortex. Hikosaka, Nakamura, Sakai, and Nakahara (2002) proposed an integrated model in which cortical motor skill learning is optimised by a cortex–basal ganglia loop taking advantage of reinforcement learning and a cortex–cerebellum loop taking advantage of supervised learning. Neurons in the sensorimotor striatum have been observed to be active only during early learning phases of a T-maze procedural task (Jog, Kubota, Connolly, Hillegaart, & Graybiel, 1999). Recently, Pasupathy and Miller (2005) observed in monkeys learning an associative task that striatal (specifically, caudate nucleus) activation progressively anticipates prefrontal cortical activity, and that the cortical activity more closely parallels the monkeys' behavioural improvement. The lead of striatal over cortical activity, by as much as 140 ms, indicates that the cortex may be a candidate structure to receive training from the basal ganglia (see also Ravel & Richmond, 2005, and the Discussion).
Previously, we solved a robotic task by reinforcement learning mimicking basal ganglia function (Weber, Wermter, & Zochios, 2004). In this paper, this performance is reproduced by a generative model learnt unsupervised, as well as by a predictive model trained supervised on data supplied by the generative model. The task is for a robot to approach a target in a pose suitable for grasping it (‘docking’), based on visual input and input conveying its heading angle. It corresponds roughly to a human or monkey moving its limbs into a position suitable for grasping an object. While we assume here that a module previously trained by reinforcement already performs the task, the generative and predictive motor cortex module learns it by observation.
The generative and the predictive models are identified with the hierarchically and laterally arranged cortical connections, respectively, an architecture which parallels a combined model of V1 simple and complex cells (Weber, 2001). The docking task makes intensive use of visual input, which suits it well to cortical function. The cerebellum also has access to visual information, but only rather indirectly (Robinson, Cohen, May, Sestokas, & Glickstein, 1984).
The traditional role assigned to the motor cortex is to control movement: if, for example, neurons in area F5 of the motor cortex are stimulated, then a limb movement is executed. Many area F5 neurons, however, also fire in the absence of the animal's own motor action: if a monkey observes an action executed by another monkey or by a human, so-called mirror neurons fire action potentials (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). Also, action word representations are proposed to span language as well as motor areas in so-called ‘word webs’ (Pulvermüller, 2002). This leads to a distributed (population-) coding paradigm across several cortical areas, which are mostly topographically organised. Multi-modal connections of the motor areas to higher levels of the somato-sensory and visual cortex (Felleman & Van Essen, 1991) provide the necessary architecture. Since mirror neurons in motor cortical area F5 also have sensory response properties, it is plausible to carry learning principles of sensory cortical areas over to the motor cortex. These considerations provide the backbone of our developmental model of one area of the motor cortex.
Monkeys in captivity sometimes imitate behaviours that they observe in people, e.g. sweeping with a broom (Vittorio Gallese, personal communication). It is tempting to explain imitation behaviour with a generative model, as has been applied successfully to the well-investigated lower visual area V1 of the cortex (Bell & Sejnowski, 1997; Olshausen & Field, 1997; Rao & Ballard, 1997; Weber, 2001). A sensory-motor generative model trained by error back-propagation was proposed by Plaut and Kello (1998): it produces actions (speech) which lead, via the environment, to a sensory input distribution (sound) similar to that previously experienced. However, since imitation behaviour is performed only by more highly developed mammals, can we trace a generative model back to a more basic function of the motor cortex?
We go one step back and claim that the cortex reproduces experienced actions which are originally produced by other parts of the brain, rather than by other individuals. An action may originally be produced via phylogenetically older parts of the brain, e.g. the basal ganglia, and possibly via learning mechanisms such as reinforcement learning. Even though reinforcement learning is a powerful paradigm, it might leave actions stored inefficiently in the basal ganglia; in computational models, a high-dimensional state space limits applications (see Section 4). Once the cortex can take over, the basal ganglia would be free to learn new tasks.
An unsupervised model of the cortex, as proposed by Doya (1999), is appealingly simple because of its similarity to sensory cortex models, and because the direction of information flow (what is input and what is output) is not specified before learning. While a directed stimulus-response mapping, learnt supervised, might be closer to optimal, the motor cortex does not produce optimal movements: damage to the cerebellum leads to lasting motor and movement difficulties (e.g. Sanes, Dimitrov, & Hallett, 1990).
Perception-action pairs might be input data to a generative model of the motor cortex, just like an image is the input to a generative model of the visual cortex. The cortical neurons represent this input as an internal activation pattern. This ‘hidden code’ can represent higher-order correlations within the input data and can trace the input back to meaningful, independent components. This constitutes an internal world model of the environment, since the input can be generated from that hidden code. Fig. 1(a) shows the model architecture (together with the predictive model, described next). The input is the perception-action pair; it can be generated via the top-down weights Wtd from the hidden representation. Perceptive, bottom-up weights Wbu are needed to obtain the hidden representation from the data. The weights are trained via an unsupervised learning scheme described below.
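This recognition/generation loop can be illustrated with a minimal NumPy sketch. All dimensions, the sigmoidal transfer function and the delta rule on the reconstruction error are illustrative assumptions; unlike the paper's model, in which both weight sets are learnt with local rules, the bottom-up weights here remain a fixed random projection.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_hidden = 12, 6                          # sizes are illustrative

W_bu = rng.normal(0.0, 0.1, (n_hidden, n_input))   # recognition (bottom-up) weights
W_td = np.zeros((n_input, n_hidden))               # generative (top-down) weights

def f(a):
    return 1.0 / (1.0 + np.exp(-a))                # transfer function (assumed sigmoidal)

x = rng.integers(0, 2, n_input).astype(float)      # one perception-action pair

# Train the generative weights so that the hidden code regenerates its input.
for _ in range(200):
    h = f(W_bu @ x)                                # bottom-up: hidden representation
    x_gen = W_td @ h                               # top-down: generated input
    W_td += 0.05 * np.outer(x - x_gen, h)          # reduce the reconstruction error

reconstruction_error = float(np.abs(x - W_td @ f(W_bu @ x)).max())
```

After training, the hidden code obtained through Wbu regenerates the perception-action pair through Wtd almost exactly, which is what qualifies the hidden code as an internal model of the input.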
The generative model (consisting of Wbu and Wtd) can be used for input–output mappings, as we have done recently with a Kohonen network (Wermter & Elshaw, 2003). For this purpose its input is logically split into a perceptive part and a motor part. If we now test the model using ‘incomplete’ input, where there is perception but zero motor action, we can still find a hidden representation using Wbu. From this hidden representation the model can generate a virtual input using Wtd. The motor part of this generated input is then the most appropriate action belonging to the given perception. Having learnt perception-action mappings, the model can automatically generate action sequences within an environment, because an action leads to a new perception, which itself triggers an appropriate action.
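The pattern-completion idea can be sketched as follows. The three hand-constructed perception-action pairs, the one-unit-per-pair weights and the winner-take-all hidden code are all simplifying assumptions standing in for what the trained generative model would provide:

```python
import numpy as np

n_per, n_mot = 8, 4                                # illustrative sizes

# Three stored perception-action pairs (perception part | action part).
pairs = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0,  1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0,  0, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0,  0, 0, 1, 0],
], dtype=float)
W_bu = pairs                                       # recognition weights (one hidden unit per pair)
W_td = pairs.T                                     # generative weights

def complete(perception):
    x = np.concatenate([perception, np.zeros(n_mot)])  # motor part unknown: set to zero
    h = np.zeros(len(W_bu))
    h[np.argmax(W_bu @ x)] = 1.0                   # winner-take-all hidden code (simplification)
    x_gen = W_td @ h                               # generate the 'virtual' full input
    return x_gen[n_per:]                           # its motor part is the proposed action

action = complete(np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float))
```

Presenting only the perception of the second stored pair recovers that pair's action, i.e. the incomplete input is completed by the generative pathway.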
A predictive model generates its future input, or its future state, unlike a generative model, which generates its current input. One advantage of a predictive model is that it can compensate for a delay in its input. Furthermore, since the consequences of an action are known without the action actually being performed, a predictive model allows for mental simulation of action sequences (Oztop, Wolpert, & Kawato, 2003). Prediction allows an observer to determine whether an actor is helpful, unhelpful or threatening (Gallese & Goldman, 1998), and offers the opportunity to react to an action sequence before it has ended (Demiris & Hayes, 2002).
The hierarchically directed weights Wbu and Wtd, which constitute the generative model, generate the incoming data that enter the model at the same time instant. The weights V of our predictive model, on the other hand, are directed primarily horizontally, or laterally. As can be seen in Fig. 1(a), these weights are recurrent connections within one neural sheet. Fig. 1(b), which depicts the model unfolded over two time steps, illustrates that V connects hidden representations at different time steps, making these weights hetero-associative. Note that a trained generative model is needed to obtain the hidden representations which the predictive model uses for training and for action.
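The hetero-associative character of V can be shown in a toy sketch: the lateral weights are trained with a delta rule to map each hidden representation onto its successor. The one-hot cyclic sequence stands in for the hidden codes that, in the paper, the trained generative model would supply; the learning rate and epoch count are arbitrary.

```python
import numpy as np

n_hid = 5
# Toy sequence of hidden representations: one-hot states visited in a cycle.
seq = np.eye(n_hid)

V = np.zeros((n_hid, n_hid))       # lateral, hetero-associative weights
lr = 0.5
for _ in range(20):                # train V to predict the NEXT hidden code
    for t in range(n_hid):
        h_t, h_next = seq[t], seq[(t + 1) % n_hid]
        V += lr * np.outer(h_next - V @ h_t, h_t)   # delta rule on the prediction error

pred = V @ seq[0]                  # after training: predicts the successor of state 0
```

Because V associates activity at time t with activity at time t+1 rather than reconstructing the current input, it is hetero-associative, in contrast to the generative weights.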
Our choice to separate feature extraction (the generative model) from prediction originates from modelling constraints in the visual cortex. Simple cell-like localised edge detectors in W are obtained by a training paradigm in which the same image has to be reconstructed from the hidden representation which gave rise to it, but not if, for example, a shifted version of that image has to be reconstructed. The prediction task was therefore separated out into lateral weights V, which predict the next hidden representation rather than the next input data point directly. At each relaxation time step, the input images were moved slightly in a random direction. As a consequence, the lateral weights predicted a hidden code that was slightly shift-invariant, yielding V1 complex cell properties (Weber, 2001).
Lateral weights V have previously been extended to connect different cortical areas, in order to associate ‘what’ and ‘where’ representations of an object (Weber & Wermter, 2003). Here we extend them so that they are allowed to connect to all areas involved, including the input areas (as shown later in Fig. 4). They can then directly predict the future input (consisting of perception and action) and therefore anticipate a learnt motor action. This corresponds to the finding that long-range horizontal connections in the cortex (which originate, as well as terminate, in layers 2/3) are also found between areas of different hierarchical levels, even though they are strongest within one hierarchical level (Felleman & Van Essen, 1991).
Such a two-tier architecture of feature-extracting cells, with a layer designed to compute invariances on top, resembles the Neocognitron of Fukushima et al. (1983), a biologically inspired model of visual pattern recognition in which several hierarchical levels are arranged in succession. A more complex architecture is the Helmholtz machine through time of Hinton, Dayan, To, and Neal (1995a). Its recognition component (bottom-up connections) and generative component (top-down connections) are each complemented by additional lateral connections which are trained to predict a future state. While such lateral connections yielded shift invariance in Weber (2001), contrasting models achieve invariances by vertical connections (e.g. Riesenhuber & Poggio, 2002).
Taking the generative and the predictive models together, we propose such a two-tier architecture for a motor cortical area. As in the Helmholtz machine through time, we train the lateral connections to predict the future activations on the hidden layer. Since this is done using recurrent relaxation of activations, it introduces competition akin to that between the attractors of a continuous attractor network. In contrast to a maximum operator or a convolution with a fixed kernel, this is a soft competition involving trained, irregular weights. Having previously applied our model to the visual system, we here demonstrate its ability to perform sensation-driven motor control and mental action simulation in a real-world scenario.
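How recurrent relaxation can decide such a competition is illustrated below with a deliberately over-simplified, hand-set kernel (self-excitation with uniform inhibition); in the model itself the lateral weights are trained and irregular, so this sketch only shows the relaxation mechanism, not the learnt connectivity. All parameter values are assumptions.

```python
import numpy as np

n = 20
# Self-excitation with uniform inhibition: a crude stand-in for learnt
# centre-excitatory, surround-inhibitory lateral weights.
V = 2.0 * np.eye(n) - 0.5 * np.ones((n, n))

h = np.zeros(n)
h[5], h[14] = 0.20, 0.25           # two candidate activations compete
for _ in range(30):                # recurrent relaxation of activations
    h = np.clip(V @ h, 0.0, 1.0)   # keep rates within [0, 1]
# the initially stronger candidate suppresses the weaker one
```

Rather than a hard maximum operator applied in one step, the decision emerges gradually over the relaxation, which is what makes the competition "soft": intermediate states still carry graded evidence for both candidates.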
Methods
In the following, we briefly describe the task and then a reinforcement-trained model which can solve it. Next, we introduce the model of the motor cortex and explain its components, the generative model and the predictive model, which are trained to copy the solution to the task from the reinforcement-trained model.
Anatomy of the generative model
Fig. 5(a–c) together show the recognition weights Wbu that result from training. They are made up of three components, one receiving each of the three components of the input. Adjacent neurons on the hidden area have similar receptive fields in any of the three input areas. This realises topographic mappings. Accordingly, there are regions on the hidden area which are specialised to receive input from similar regions in each of the three input spaces. The central region in
Discussion
In addition to generative models already explaining sensory cortical areas very well, we have demonstrated here that a generative model can also account for motor cortical function. It relies on a working module, trained by reinforcement, which resides outside the cortex. A generative motor theory forms a bridge between a generative sensory theory and a reinforcement-based motor theory. It is important to have such a link, since sensory and motor representations are largely overlapping in the
Conclusion
We have set up a biologically realistic model of a motor cortical area which can take over and predict actions proposed to be learnt earlier via a reinforcement scheme in the basal ganglia. All network connections are trained, only local learning rules are used, and the resulting connectivity structure is topographic, with lateral predictive weights that are centre-excitatory and surround-inhibitory. This work bridges the wide gap between neuroscience and robotics and motivates the development of the
Acknowledgements
This work is part of the MirrorBot project supported by the EU in the FET-IST programme under grant IST-2001-35282. We thank Christo Panchev, Alexandros Zochios, Hauke Bartsch as well as the anonymous reviewers for useful contributions to the manuscript.
References (52)
- A learning algorithm for Boltzmann machines. Cognitive Science (1985).
- The ‘independent components’ of natural scenes are edge filters. Vision Research (1997).
- What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Networks (1999).
- Modeling parietal–premotor interactions in primate control of grasping. Neural Networks (1998).
- Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences (1998).
- Central mechanisms of motor skill learning. Current Opinion in Neurobiology (2002).
- Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research (1997).
- A brain perspective on language mechanisms: From discrete neuronal ensembles to serial order. Progress in Neurobiology (2002).
- Neural mechanisms of object recognition. Current Opinion in Neurobiology (2002).
- Language within our grasp. Trends in Neurosciences (1998).
- The cortical motor system. Neuron.
- Three dimensional frames of references transformations using recurrent populations of neurons. Neurocomputing.
- Self-organising continuous attractor networks and motor function. Neural Networks.
- I know what you are doing: A neurophysiological study. Neuron.
- Robot docking with neural vision and reinforcement. Knowledge-Based Systems.
- The evolutionary origin of the mammalian isocortex: Towards an integrated developmental and functional approach. The Behavioral and Brain Sciences.
- Preserved sensitivity to outcome value after lesions of the basolateral amygdala. The Journal of Neuroscience.
- Generalization in reinforcement learning: Safely approximating the value function.
- Consolidation in human motor memory. Nature.
- The Helmholtz machine. Neural Computation.
- Imitation as a dual-route process featuring prediction and learning components: A biologically-plausible computational model.
- Reading population codes: A neural implementation of ideal observers. Nature Neuroscience.
- Efficient computation and cue integration with noisy population codes. Nature Neuroscience.
- Actions and habits: The development of behavioural autonomy. Philosophical Transactions of the Royal Society of London, Series B.
- Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex.
- A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus.