Elsevier

Neural Networks

Volume 47, November 2013, Pages 64-71
2013 Special Issue
Nucleo-olivary inhibition balances the interaction between the reactive and adaptive layers in motor control

https://doi.org/10.1016/j.neunet.2013.01.026

Abstract

In the acquisition of adaptive motor reflexes to aversive stimuli, the cerebellar output fulfills a double purpose: it controls a motor response and it relays a sensory prediction. However, the question of how these two apparently incompatible goals might be achieved by the same cerebellar area remains open. Here we propose a solution where the inhibition of the Inferior Olive (IO) by the cerebellar Deep Nuclei (DN) translates the motor command signal into a sensory prediction, allowing a single cerebellar area to simultaneously tackle both aspects of the problem: execution and prediction. We demonstrate that, given a graded error signal, the gain of the Nucleo-Olivary Inhibition (NOI) balances the generation of the response between the cerebellar and the reflexive controllers or, in other words, between the adaptive and the reactive layers of behavior. Moreover, we show that the resulting system is fully autonomous and can either acquire or erase adaptive responses according to their utility.

Introduction

The execution of an avoidance action seems to involve both sensory prediction and motor control: the prediction of a noxious stimulus triggers an anticipatory motor command. A similar division between sensory prediction and actuation is also found in control theory when a forward model provides predicted feedback to a feedback controller (Miall, Weir, Wolpert, & Stein, 1993). In the latter case, the reactive commands of the feedback controller are caused by the factual feedback anticipated by the forward model. In contrast, in the case of an avoidance action, a common-sense interpretation suggests that the predicted sensory event is counterfactual, i.e., what is predicted is not the factual sensory event but the one that would be perceived without the avoidance action. Here we will show that to understand the role of the cerebellum in Avoidance Learning (AL) one might have to drop this assumption.

Acquisition of anticipatory responses has been extensively studied with the paradigm of Pavlovian classical conditioning (Pavlov & Anrep, 1927), e.g., classical conditioning of the eyeblink reflex (Gormezano, Prokasy, & Thompson, 1987) (henceforth, eyeblink conditioning). In eyeblink conditioning a neutral cue such as a sound or a light, the Conditioned Stimulus (CS), precedes by a fixed time interval the delivery of a noxious Unconditioned Stimulus (US) to the eye, e.g., a peri-orbital electric shock. The US occurrence triggers a reflexive protective action (the closure of the eyelid) that constitutes the Unconditioned Response (UR). After a number of paired CS–US repetitions, the subject reacts to the delivery of the CS by closing the eyelids in anticipation of the expected US, i.e., producing a Conditioned Response (CR) (Gormezano et al., 1987; Mackintosh, 1974; Pavlov & Anrep, 1927). Once acquired, CRs can be deleted by extinction training, i.e., presenting CSs not followed by the US.

There is broad agreement that the substrate of learning in eyeblink conditioning is located in the cerebellum (Christian & Thompson, 2003; Yeo & Hesslow, 1998). The well-known cerebellar circuitry (Eccles, Ito, & Szentágothai, 1967) helped to accurately delineate the neural pathways of CS, US and CR (Mauk et al., 1986; Steinmetz et al., 1985). The roles of the different stimuli accord with the Marr–Albus–Ito cerebellar learning theory (Albus, 1971; Ito et al., 1982; Marr, 1969): the US signal relayed by the IO reaches the cerebellar cortex through the climbing fibers, where it induces plasticity at the synapses of the parallel fibers that transmit the CS information. After repeated coincidence of these two signals, the Purkinje cells–the sole output of the cerebellar cortex–acquire a response to the CS, namely, a drop in their firing activity, that drives the behavioral CR (Jirenhed, Bengtsson, & Hesslow, 2007). Moreover, as with the overt CR, extinction training suppresses the Purkinje cell response.

Learning in classical conditioning concerns sensory prediction. As formalized by the Rescorla–Wagner model, animals learn in classical conditioning only when events violate their expectations (Rescorla & Wagner, 1972). Therefore, to support this kind of learning the cerebellum must acquire and generate sensory predictions. In general, according to the adaptive filter theory, cerebellar learning is explained in terms of de-correlation (Fujita, 1982). A corollary of this theory is that the cerebellum only learns when the IO activity is perturbed from baseline. In this context, the inhibitory connections from the cerebellar deep nuclear cells to the Inferior Olive, the Nucleo-Olivary Inhibition (NOI) (Andersson, Garwicz, & Hesslow, 1988), are key to interpreting cerebellar learning as the acquisition of sensory predictions. The NOI subtracts the cerebellar output relayed by the DN from the US signal reaching the IO, such that if the two signals match, they cancel each other, leaving IO activity at baseline. Therefore, in eyeblink conditioning, if after the CS either the excitation produced by the US or the inhibition produced by the CR (via the NOI) outweighs the other, the perturbation of the IO activity recruits cerebellar plasticity such that in following trials IO activity remains closer to baseline. Remarkably, the NOI has an unusually long latency for a monosynaptic transmission, on the order of tens of milliseconds (Hesslow, 1986).
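The two ideas in this paragraph can be illustrated with a minimal sketch. It assumes a single CS, the standard Rescorla–Wagner update with one lumped learning rate, and a purely subtractive NOI; the function names and parameter values are hypothetical and are not taken from the model described in this paper.

```python
def rescorla_wagner(n_trials, us_intensity=1.0, learning_rate=0.2):
    """Return the associative strength V after each paired CS-US trial."""
    v = 0.0
    history = []
    for _ in range(n_trials):
        # Learning is driven by surprise: the outcome minus the expectation.
        prediction_error = us_intensity - v
        v += learning_rate * prediction_error
        history.append(v)
    return history


def io_error(us_signal, dn_output, noi_gain=1.0):
    """Deviation of IO activity from baseline (assumed subtractive NOI)."""
    # US excitation minus the inhibition relayed from the DN through the NOI.
    return us_signal - noi_gain * dn_output
```

In this sketch `io_error` returns zero when the DN output matches the US, is positive when the US is under-predicted, and is negative when a CR is issued but no US arrives.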

Regarding motor control, it is well established that the output of the cerebellum drives the CR (Hesslow, 1994). In itself, this does not contradict the sensory prediction interpretation if the predicted US stimulus and the amplitude of the CR are correlated (although it is not obvious why such a correlation should exist). In other words, since correlation between neural activity and stimulus intensity–or action amplitude–is interpreted as evidence for the neural activity coding the stimulus–or the action–then, in classical conditioning, the cerebellar output may code both the perception and the response if perception and response are themselves correlated. However, the question remains whether the NOI, fundamental for sensory prediction, is functional from a motor learning perspective. AL, which, as a paradigm, is closely related to classical conditioning, allows us to address this issue.

In a classical conditioning preparation, the CR is required not to ameliorate or reduce the noxiousness of the US. For instance, with a peri-orbital shock US, the CR has no effect on the intensity of the shock. In AL, by contrast, the CR modifies the effect of the US. For instance, if we use an airpuff without restraining the eyelids, then the effective or sensed intensity of the US will decrease as the CR increases the degree of eyelid closure, i.e., the noxiousness of the US will diminish as it reaches a more protected cornea. Therefore, whereas in classical conditioning the CR and the US can only be compared internally (and by means of the NOI), in AL an implicit comparison between action and stimulus takes place in the external world. This difference between classical conditioning and AL is not always made explicit in the literature, since some eyeblink conditioning studies, especially with human subjects, are conducted with an airpuff and an unconstrained CR (Clark & Squire, 1998).
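The distinction between the two paradigms can be sketched as follows. The linear attenuation of the US by eyelid closure is an assumption made only for illustration, and `sensed_us` is a hypothetical helper rather than part of the model.

```python
def sensed_us(us_intensity, cr_closure, avoidance=True):
    """Return the US intensity that actually reaches the sensor.

    cr_closure is the degree of eyelid closure, clipped to [0, 1].
    """
    closure = min(max(cr_closure, 0.0), 1.0)
    if avoidance:
        # AL (e.g., airpuff, free eyelid): the CR attenuates the US.
        # Linear attenuation is an illustrative assumption.
        return us_intensity * (1.0 - closure)
    # Classical conditioning (e.g., peri-orbital shock): the CR has no effect.
    return us_intensity
```

For example, with a closure of 0.75 the sketch yields a sensed intensity of 0.25 in AL and 1.0 in classical conditioning.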

However, attempting to apply the cerebellar microcircuit studied in classical conditioning to an AL task raises a series of questions.

First, if the cerebellum outputs a motor command and the IO receives a peripheral sensory signal, then the NOI performs an inconsistent comparison between information from different domains. In such a case, why should the temporal profile of the signal masking a phasic US be similar to the motor command controlling the eyelid muscles (Lepora, Porrill, Yeo, & Dean, 2010)? Note that the same inconsistency of the temporal dynamics appears when we consider the avoidance of a noxious stimulus as a comparison performed in the external world. For example, the temporal profile of the eyelid closure and that of the physical US stimulus might be different.

Second, AL introduces a contingency between the motor action and the sensory prediction: the CR diminishes the effective intensity of the US. We refer to this link as the behavioral negative feedback loop, in contrast to the internal negative feedback provided by the NOI. But if behavioral learning can avoid the US, what is the use of the internal negative feedback? Note that in cases where avoidance can be complete (hitting a wall or avoiding it completely) the role of the NOI is not evident, i.e., since both negative feedback loops are superposed, the NOI might halt learning before it leads to the total avoidance of the US.

However, it has been shown both with modeling and animal preparations that inactivation of the NOI prevents extinction in classical conditioning (Bengtsson & Hesslow, 2006; Medina et al., 2002). Extrapolating this result to AL, the NOI has the functional role of suppressing acquired responses that are no longer adaptive. Therefore, even though it could be possible for a cerebellar microcircuit lacking the NOI to optimally acquire a response in AL, such a circuit would require an extra-cerebellar brain structure to generate the signal driving extinction. In other words, in the absence of an external signal reflecting the cost of an unnecessary avoidance action, this signal, playing the role of a hypothetical ‘negative US’, has to be computed internally, and the NOI provides a means for its generation.
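The role of the NOI in extinction can be illustrated with a toy simulation. It assumes a single scalar weight standing in for the whole learned response, a subtractive NOI, and a weight bounded to [0, 1]; `train` and its parameters are hypothetical and do not reproduce the cited models.

```python
def train(w, us_per_trial, noi_gain, learning_rate=0.2):
    """Update a scalar CR weight over a sequence of trials and return it."""
    for us in us_per_trial:
        # IO deviation from baseline: US excitation minus NOI inhibition.
        error = us - noi_gain * w
        # The weight follows the error and is kept within [0, 1].
        w = min(max(w + learning_rate * error, 0.0), 1.0)
    return w


paired = [1.0] * 40    # acquisition: CS followed by the US
cs_alone = [0.0] * 40  # extinction: CS presented without the US

with_noi = train(train(0.0, paired, noi_gain=1.0), cs_alone, noi_gain=1.0)
without_noi = train(train(0.0, paired, noi_gain=0.0), cs_alone, noi_gain=0.0)
```

With the NOI intact, CS-alone trials produce a negative error (the 'negative US') and the response decays. With the gain set to zero the error stays at baseline during extinction training, so in this sketch the acquired response persists.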

We propose that the key to reconciling sensory prediction and motor control lies in the nature of the US signal. Considering a graded rather than an all-or-none US signal, the NOI can halt learning once the US intensity drops below a certain safety level, that is, once the US is so mild as to lose its noxiousness. Moreover, this residual signal can play an important functional role, i.e., on a trial-by-trial basis, it can validate the suitability of the anticipatory action. For instance, in the case of AL of the eyeblink response, once the eyelids are closed, perceiving the airpuff confirms the suitability of continuing to trigger the anticipatory action the next time the CS is perceived.
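A toy simulation can show how a graded US and the NOI gain jointly set the point where learning halts. It assumes a scalar CR amplitude in [0, 1], a US attenuated linearly by the CR, and a subtractive NOI; these are illustrative assumptions and `settle_avoidance` is a hypothetical name, not the robot controller used in this paper.

```python
def settle_avoidance(us_intensity, noi_gain, n_trials=200, learning_rate=0.1):
    """Return (cr, residual_us) after repeated avoidance trials."""
    cr = 0.0  # CR amplitude, read here as a fraction of protection
    for _ in range(n_trials):
        # Behavioral negative feedback: the CR attenuates the sensed US.
        residual_us = us_intensity * (1.0 - cr)
        # Internal negative feedback: the NOI subtracts the CR from the US.
        error = residual_us - noi_gain * cr
        cr = min(max(cr + learning_rate * error, 0.0), 1.0)
    return cr, us_intensity * (1.0 - cr)


for gain in (0.0, 0.25, 1.0):
    cr, residual = settle_avoidance(1.0, gain)
    print(f"NOI gain {gain}: CR = {cr:.3f}, residual US = {residual:.3f}")
```

Under these assumptions learning stops where the residual US equals the NOI inhibition, i.e., at a CR of u / (u + g) for US intensity u and gain g. A higher gain leaves a larger residual US and hence more work for the reflex, while a gain of zero drives the residual US toward zero.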

To summarize, we propose that the NOI allows balancing the level of control between a reactive and an adaptive layer. We validate this proposal in a series of simulations where a robot has to perform a collision avoidance task on a track with a single turn. For the adaptive layer we use a controller based on the anatomy of the cerebellum (Fig. 1) (Eccles et al., 1967). Using the principles behind adaptive filter modeling of the cerebellum, we implement an analysis–synthesis filter with a de-correlation learning rule (Dean, Porrill, Ekerot, & Jörntell, 2010). With this setup, we study the effects of different parametrizations of the NOI gain, showing that it fixes the balance between reactive and adaptive actions and that, besides being required for extinction, the NOI is fundamental for the correct timing of the adaptive responses.
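The general shape of an analysis–synthesis filter trained with a de-correlation rule can be sketched as below. The Gaussian basis set, the batch weight update, the parameter values and all function names are assumptions chosen for illustration; the sketch also omits the anticipatory timing and the robot task, so it should not be read as the controller implemented in this paper.

```python
import math


def gaussian_bases(n_steps, n_bases, width=3.0):
    """Analysis stage: expand the cue into Gaussian signals tiling the trial."""
    centers = [n_steps * (j + 0.5) / n_bases for j in range(n_bases)]
    return [[math.exp(-((t - c) ** 2) / (2.0 * width ** 2))
             for t in range(n_steps)] for c in centers]


def run_trial(weights, bases, us, noi_gain, beta):
    """Run one trial; return the updated weights and the summed squared error."""
    n_steps = len(us)
    # Synthesis stage: the output is a weighted sum of the basis signals.
    output = [sum(w * p[t] for w, p in zip(weights, bases))
              for t in range(n_steps)]
    # Error in the IO: US excitation minus the output relayed through the NOI.
    error = [us[t] - noi_gain * output[t] for t in range(n_steps)]
    # De-correlation rule: each weight moves with the correlation between the
    # error and its own basis signal, so learning stops once they decorrelate.
    new_weights = [w + beta * sum(e * p[t] for t, e in enumerate(error))
                   for w, p in zip(weights, bases)]
    return new_weights, sum(e * e for e in error)


def train_filter(bases, us, noi_gain=1.0, beta=0.05, n_trials=200):
    """Train from zero weights; return weights, first and last trial error."""
    weights = [0.0] * len(bases)
    energies = []
    for _ in range(n_trials):
        weights, energy = run_trial(weights, bases, us, noi_gain, beta)
        energies.append(energy)
    return weights, energies[0], energies[-1]
```

A possible usage is `train_filter(gaussian_bases(50, 10), us)` with `us` a list of 50 samples describing the US time course; the summed squared error of the last trial should then be well below that of the first.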

Cerebellar model

The model of the cerebellum consists of a set of parallel cerebellar microcircuits, each one connected to its IO component. Each cerebellar microcircuit encapsulates information processing from cerebellar cortex and cerebellar nuclei together. The inputs displayed in Fig. 2 correspond to the mossy fiber and the climbing fiber pathways, and relay the cue and the error signals, respectively. In nature, the output of a microcircuit module will be carried by the axons of the deep-nuclear cells (for

Acquisition of a response

Before discussing the experimental results, we highlight a key difference between the obstacle avoidance task we use and AL of the eyeblink reflex. From the sensory prediction view, in the eye-blink paradigm the target function for the cerebellar controller to match–the sensory response to the US–varies in amplitude as the adaptive response evolves trial by trial; the noxiousness of the airpuff diminishes as it reaches a more closed eyelid. In the collision avoidance setup both the amplitude of

Discussion

It is suggested that one of the functions of the cerebellum is to replace reflexes by anticipatory avoidance actions (Wolpert, Miall, & Kawato, 1998). In this paper, we proposed that the NOI might allow such a replacement to be partial, resulting in a cooperation between the reactive and adaptive layers of control. To test this hypothesis we built a computational model including a reflex controller–the reactive layer–and a cerebellar controller–the adaptive layer–implementing the latter along

Acknowledgements

Work supported by eSMC FP7-ICT-270212.

References (42)

  • D. Wolpert et al. Internal models in the cerebellum. Trends in Cognitive Sciences (1998)
  • T. Yamazaki et al. The cerebellum as a liquid state machine. Neural Networks (2007)
  • C. Yeo et al. Cerebellum and conditioned reflexes. Trends in Cognitive Sciences (1998)
  • G. Andersson et al. Evidence for a GABA-mediated cerebellar inhibition of the inferior olive in the cat. Experimental Brain Research (1988)
  • G. Andersson et al. Activity of Purkinje cells and interpositus neurones during and after periods of high frequency climbing fibre activation in the cat. Experimental Brain Research (1987)
  • F. Bengtsson et al. Cerebellar control of the inferior olive. The Cerebellum (2006)
  • K. Christian et al. Neural substrates of eyeblink conditioning: acquisition and retention. Learning & Memory (2003)
  • R. Clark et al. Classical conditioning and brain systems: the role of awareness. Science (1998)
  • P. Dean et al. The cerebellar microcircuit as an adaptive filter: experimental and computational evidence. Nature Reviews Neuroscience (2010)
  • J. Eccles et al. The cerebellum as a neuronal machine (1967)
  • M. Fujita. Adaptive filter model of the cerebellum. Biological Cybernetics (1982)