J Comput Neurosci. Author manuscript; available in PMC 2008 Jul 19.
Published in final edited form as:
PMCID: PMC2435534
HALMS: HALMS212327
PMID: 18202922

Optimality, stochasticity, and variability in motor behavior

Abstract

Recent theories of motor control have proposed that the nervous system acts as a stochastically optimal controller, i.e. it plans and executes motor behaviors taking into account the nature and statistics of noise. Detrimental effects of noise are converted into a principled way of controlling movements. Attractive aspects of such theories are their ability to explain not only characteristic features of single motor acts, but also statistical properties of repeated actions. Here, we present a critical analysis of stochastic optimality in motor control which reveals several difficulties with this hypothesis. We show that stochastic control may not be necessary to explain the stochastic nature of motor behavior, and we propose an alternative framework, based on the action of a deterministic controller coupled with an optimal state estimator, which relieves drawbacks of stochastic optimality and appropriately explains movement variability.

Keywords: Motor control, noise, model

1. Introduction

Despite multiple levels of redundancy, noisy sensors and actuators, and the complexity of biomechanical elements to be controlled, the nervous system elaborates well-coordinated movements with disconcerting ease (Bernstein 1967). In fact, Bernstein (1967) observed that a motor goal can be successfully reached although each attempt to reach this goal has unique, nonrepetitive characteristics. To succeed in this daunting control task, powerful mechanisms should be at work in brain circuits. Their properties should encompass the capacity: 1. to reach a goal with little error and small energy expenditure, i.e. to choose an appropriate set of motor commands among an infinite number of solutions (degrees-of-freedom problem); 2. to face deterministic (e.g. change in goal, force applied on the moving limb) and stochastic (e.g. noise in motor commands) perturbations (variability problem).

The Bernstein problem, which encompasses both the degrees-of-freedom and variability problems, is illustrated in Fig. 1 for a reaching movement. In this example, the moving arm has three degrees of freedom (Fig. 1A; shoulder, elbow, wrist), and moves in a two-dimensional space to reach a target (Fig. 1B). Thus there exists an infinite number of articular displacements which are appropriate to capture the target (Fig. 1C). In the presence of noise, the reaching movements are successful, but have different characteristics (Fig. 1D). Since movements can be realized with or without visual feedback (Fig. 1E, F), processes related to state estimation and multimodal integration are necessary for accurate motor control.

Fig. 1. Illustration of the Bernstein problem. A. Planar reaching movement with a redundant arm (3 DOF). B. A successful movement reaches the target region (central gray circle). C. Two successful movements with different final postures. D. Several successful movements with different spatiotemporal characteristics. Inset: velocity profiles. E. Movement with visual feedback (from the target and the moving arm) and proprioceptive feedback (from the muscles). F. Movement without visual feedback from the moving arm.

Elements of the Bernstein problem have been synthesized in part in a theory of motor control based on the engineering tool of stochastic optimal control (Harris and Wolpert 1998). In this framework, motor controllers in the brain would choose optimal command signals that minimize the influence of noise on the achievement of motor goals (MV, minimum-variance model; Harris and Wolpert 1998; Hamilton and Wolpert 2002; van Beers et al. 2004). By construction, such a theory represents a radical departure from most previous optimal control models in the sense that characteristics of motor behavior emerge from a general principle rather than from a level-specific (e.g. kinematic, dynamic, muscular), effector-specific (e.g. arm, eye) or task-specific (e.g. posture, locomotion, …) criterion (see Todorov 2004 for a review). Furthermore, it accounts not only for level-specific (e.g. typical bell-shaped velocity profiles, triphasic electromyographic signals), effector-specific (arm movements, saccades) and task-specific (point-to-point movements, drawing movements, obstacle avoidance) properties, but also for amplitude/duration scaling and speed-accuracy trade-off (Fitts’ law) inherent to the functioning of motor systems.

Despite these striking successes, it appears difficult to hypothesize that motor control is purely an open-loop process (Desmurget and Grafton 2000). This observation led Todorov and Jordan (2002) to propose that motor behavior results from the action of a stochastic optimal feedback controller (SOFC), i.e. a controller which elaborates online motor commands taking into account actual or estimated state of the motor apparatus and the statistics of noise. Optimality arises from the simultaneous minimization of error (e.g. distance to the goal) and effort (e.g. size of the commands). Although MV and SOFC can be considered to be similar on the surface, the presence of feedback processes renders SOFC much more versatile. In particular, it can account for the emergence of uncontrolled manifolds (Scholz and Schöner 1999; Scholz et al. 2000), i.e. the fact that variability is preferentially reduced along dimensions that interfere with task requirements (a phenomenon called structured variability; Todorov 2004). For instance, if a subject is asked to point on a line, movement endpoints are scattered along the target line (Scholz et al. 2000). More generally, it provides a principled approach to the construction of motor acts in the presence of noise and perturbations which closely corresponds to experimental observations (Todorov and Jordan 2002).

Although attractive, stochastic feedback optimality is a complex theoretical construct, and should not be considered as a default hypothesis. In fact, given its central role in models of motor control (Todorov and Jordan 2002; Saunders and Knill 2004; Chhabra and Jacobs 2006a), it deserves to be questioned (Schaal and Schweighofer 2005). In particular, SOFC has been mostly used for the control of linear systems, and although results have also been obtained in a nonlinear case (shoulder/elbow arm with nonlinear muscles; Todorov and Li 2005; Li 2006), the general problem of kinematic redundancy, which is a central issue for Bernstein, has not been addressed in this framework. In this article, we present a critical analysis of stochastic feedback optimality to assess whether this hypothesis is appropriate to explain characteristics of motor control. This analysis shows that SOFC does not provide a satisfactory solution to the degrees-of-freedom problem, and leads us to propose an alternative approach to motor control. This approach is based on a model (a terminal optimal feedback controller, TOFC) which provides a quantitative account of the degrees-of-freedom problem (Guigon et al. 2007) (see Section 3 for more details). Our purpose here is to show that TOFC is also able to master stochastic control problems, and can thus be considered as a unified model of motor control.

2. Stochastic optimal feedback control

SOFC is an approach to motor control which combines stochastic optimality and feedback control. The reader is referred to Todorov (2005) for a thorough introduction to SOFC (see also Appendix A for a brief survey). A central idea of SOFC is the emergence of optimal behaviors through minimization of stochastic quantities related to states and controls (error/effort cost function). However, this form of optimization cannot in general guarantee that kinematic goals are appropriately reached, i.e. the actual final state of a simulated movement will not necessarily be equal to the desired final state representing the goal of the movement. The problem arises from the minimization of the mixed error/effort cost. Such a minimization requires the setting of parameters which weight the contribution of state errors (velocity, force, …; parameters wv, wf in Todorov 2005) and effort (r) in the cost function. Each setting leads to a particular time course of states along the movement, and a particular pattern of constant and variable terminal errors. To illustrate, we consider the shape of velocity profiles for point-to-point movements simulated as described in Todorov and Jordan (2002). Different profiles were found for different values of r, wv, and wf (Fig. 2). Although the differences between the profiles could be considered insignificant, this result raises a question: which setting of these parameters defines a “normal” velocity profile to be compared with experimental observations? SOFC offers no answer to this question. Todorov and Jordan (2002) recognized that these parameters must be adjusted to each task at hand (their supplementary information). Todorov (2005) proposed to set the position and velocity weights according to movement amplitude and movement time. This issue is crucial to address kinematic invariance (i.e. the invariant shape of velocity profiles; e.g. Atkeson and Hollerbach 1985; Gordon et al. 1994b).

We consider a second example. Programming a grasping movement with SOFC requires simultaneously minimizing the distance between the hand and the object, and the angular difference between hand and object orientation. The parameter which weights the two errors should influence the time course of error reduction along a movement. Coarticulation (i.e. the concurrent reduction of distance and orientation errors; Torres and Zipser 2004) may or may not be observed depending on the value of this parameter. In fact, there is no uniquely defined emergent kinematic property in SOFC. Thus, although SOFC can eliminate redundant degrees of freedom, it cannot do it in a principled way.
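The dependence of the endpoint on the weights can be seen in a deliberately minimal sketch, which is not the model of Todorov and Jordan (2002). It assumes a noise-free discrete single integrator x_{k+1} = x_k + u_k driven for n steps toward a target at distance d, and a hypothetical mixed cost w (x_n − d)^2 + r Σ u_k^2. The function names and the symbols w, r, d, n are illustrative assumptions.

```python
def soft_cost_endpoint(d, n, w, r):
    """Endpoint reached when minimizing w*(x_n - d)**2 + r*sum(u_k**2).

    Toy plant (an assumption for illustration): x_{k+1} = x_k + u_k, x_0 = 0.
    The cost is symmetric in the u_k, so the optimum is a constant command u.
    Setting the derivative of w*(n*u - d)**2 + r*n*u**2 to zero gives
    u = w*d / (w*n + r).
    """
    u = w * d / (w * n + r)
    return n * u


def hard_constraint_endpoint(d, n):
    """Minimum-effort command under the hard constraint x_n = d."""
    u = d / n
    return n * u


if __name__ == "__main__":
    d, n = 10.0, 30
    for w, r in [(4.0, 1.0), (0.04, 1.0), (0.04, 10.0)]:
        x_n = soft_cost_endpoint(d, n, w, r)
        print(f"w={w}, r={r}: endpoint={x_n:.3f}, constant error={d - x_n:.3f}")
    print("hard constraint endpoint:", hard_constraint_endpoint(d, n))
```

In this toy case the constant terminal error equals d r / (w n + r), so it vanishes only in the limit of an infinite error weight, whereas treating the goal as a constraint removes the weights altogether.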

Fig. 2. Mean normalized velocity profiles for 10 cm, 300 ms movements simulated with SOFC. Mean was calculated over 500 trials (σSDNm = 0.1). Parameters were r = 1, wv = 4, wf = 4 (1st profile), r = 1, wv = 0.04, wf = 0.04 (2nd profile), r = 10, wv = 0.4, wf = 0.04 (3rd profile).

Despite this problem, we cannot easily abandon a model which has proven highly efficient in other respects (Todorov and Jordan 2002). In particular, the minimal intervention principle, which predicates that variability is preferentially reduced along dimensions that interfere with task goal (Todorov and Jordan 2002), is a central concept to explain the structure of motor variability (Scholz and Schöner 1999; Scholz et al. 2000; Todorov and Jordan 2002). To resolve this difficulty, we addressed the origin of the minimal intervention principle in SOFC. Todorov and Jordan (2002) proposed that this principle derives from optimal compensation for signal-dependent motor noise (SDNm; for definition, see Harris and Wolpert 1998; Todorov and Jordan 2002; in Appendix, Eq. A.1, process noise). However, a SOFC is a complex mathematical object, which contains cost-, task-, and noise-related terms (Eq. A.4 and Eq. A.5), and the specific importance of the different terms has not been assessed. In particular, the contribution of noise-related terms, which provide knowledge on the structure (e.g. statistics, correlations) of noise is unclear.

In their line-pointing simulation, Todorov and Jordan (2002) illustrated the emergence of an uncontrolled manifold (UM; Scholz et al. 2000): variability was preferentially oriented along the target line, i.e. perpendicularly to the task-error dimension. We explored the origin of the UM in SOFC. We observed that no UM was found in the absence of SDNm (for definition, see Eq. A.1 and text below in Appendix A). We found that the UM arose in the presence of SDNm in any of the following conditions (Fig. 3A, B): (1) the statistics of SDNm are known to the controller and estimator (presence of terms with [C1Cc] in Eq. A.4 and Eq. A.5); (2) the statistics of SDNm are known only to the estimator (presence of terms with [C1Cc] only in Eq. A.5); (3) the statistics of SDNm are unknown (no terms with [C1Cc] in Eq. A.4 and Eq. A.5), but the statistics of other noises are known (e.g. SINm; presence of Ωζ in Eq. A.5). Conversely, no UM was found when the statistics of SDNm were known only to the controller, or were completely unknown while no other type of noise was known (absence of terms with Ωζ, Ωω and Ωε). In fact, a common qualitative characteristic of appropriate (inappropriate) conditions is the efficient (deficient) functioning of the state estimator, i.e. the fact that the estimator provides an accurate (inaccurate) estimate of the true state (Fig. 3B, C). To understand this result, we have rewritten the equation of the state estimator (Eq. A.5) in the absence of signal-dependent noises (absence of Ωε and Ωε):

Fig. 3. Conditions for the formation of an uncontrolled manifold in SOFC. A. Variation in the aspect ratio of the terminal variability ellipse as a function of σSDNm. The task was a line-pointing task. Movement duration was T=500 ms and distance to the line was 30 cm. The aspect ratio was calculated as the ratio between major and minor axis length of the 95% equal frequency ellipse calculated over 5,000 trials. Symbols: circle (SDNm known in the controller and estimator, SINm unknown); square (SDNm known in the estimator only, SINm unknown); diamond (SDNm unknown, SINm unknown); up triangle (SDNm unknown, SINm known). In each case, SINs was present (σSINs = 0.3) and known. B. Example of an uncontrolled manifold (circle in A; σSDNm = 0.4). The 95% equal frequency ellipse, 50 endpoints and 10 trajectories are shown. Inset: sample trajectory and velocity profile (black solid: actual; gray dashed: estimated by the Kalman filter). C. Absence of UM (diamond in A; σSDNm = 0.4).

\[
\begin{cases}
K_t = A \Sigma_t^e H^T \left( H \Sigma_t^e H^T + \Omega^{\omega} \right)^{-1} \\
\Sigma_{t+\delta}^e = \Omega^{\zeta} + (A - K_t H) \, \Sigma_t^e A^T
\end{cases}
\]

If we assume that $\Sigma_0^e = 0$ (in fact, the null matrix), i.e. there is no uncertainty on the initial state of the system, the above equation will lead to zero Kt when Ωζ = 0, or undefined K0 when Ωω = 0. A similar reasoning can be done for the terms involving the signal-dependent noises.
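The two degenerate cases can be checked numerically on a scalar version of the recursion above. The sketch below is only an illustration under simplifying assumptions: the matrices A and H and the covariances are replaced by hypothetical scalars (a, h, omega_zeta, omega_omega, sigma0), and one loop iteration stands for one time step.

```python
def kalman_gains(a, h, omega_zeta, omega_omega, sigma0, steps):
    """Scalar version of the gain/covariance recursion.

    K_t     = a * s_t * h / (h * s_t * h + omega_omega)
    s_{t+1} = omega_zeta + (a - K_t * h) * s_t * a
    All arguments are hypothetical scalar stand-ins for the matrices.
    """
    s = sigma0
    gains = []
    for _ in range(steps):
        denom = h * s * h + omega_omega
        if denom == 0.0:
            # Zero estimation error and zero sensory noise: the gain is undefined.
            raise ZeroDivisionError("gain undefined: innovation variance is zero")
        k = a * s * h / denom
        gains.append(k)
        s = omega_zeta + (a - k * h) * s * a
    return gains


if __name__ == "__main__":
    # Estimator unaware of any process noise: the gain stays at zero.
    print(kalman_gains(1.0, 1.0, 0.0, 0.1, 0.0, 5))
    # Estimator aware of some process noise: the gain becomes positive.
    print(kalman_gains(1.0, 1.0, 0.01, 0.1, 0.0, 5))
```

With a zero gain the estimator ignores sensory feedback entirely, which corresponds to the deficient estimator conditions discussed above.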

This observation was confirmed in a series of simulations in which the different types of noise (SINm, SINs, SDNs) and knowledge of initial state statistics were varied. We also simulated the via-point task of Todorov and Jordan (2002). As expected (Fig. 4), structured variability was observed when the estimator was efficient even if the statistics of SDNm were unknown, and unstructured variability was found when the state estimator was inefficient. These results support the contention of Todorov and Jordan (2002) that structured variability results from optimal feedback control in the presence of SDNm. However, they question the idea that the controller and estimator need to know the statistics of this noise. These observations are restricted to the framework of SOFC, and do not preclude the emergence of uncontrolled manifolds in the absence of signal-dependent noise in other frameworks.
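The aspect ratio used in Fig. 3A to quantify the uncontrolled manifold can be computed from simulated endpoints as sketched below. This is a minimal illustration, not the authors' analysis code: the function name is hypothetical, and it relies on the fact that the axis lengths of an equal frequency ellipse are proportional to the square roots of the eigenvalues of the endpoint covariance matrix, so the 95% scaling factor cancels in the ratio.

```python
import math


def ellipse_aspect_ratio(points):
    """Ratio of major to minor axis of the covariance ellipse of 2-D points.

    Axis lengths are proportional to the square roots of the covariance
    eigenvalues; the confidence-level scaling is common to both axes.
    Raises ZeroDivisionError if the points are exactly collinear.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Eigenvalues of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    lam_major, lam_minor = half_trace + root, half_trace - root
    return math.sqrt(lam_major / lam_minor)
```

An aspect ratio close to 1 indicates unstructured variability, while larger values indicate endpoints elongated along one direction, as along the target line in the line-pointing task.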

Fig. 4. Structure of variability in a via-point task. A. The task was to go from P0 = (0, 0) to P4 = (a, 0) going through 3 points: P1 = (a/4, b), P2 = (a/2, 0), P3 = (3a/4, b). The passage times are (0, t1, t2, t3, T). Parameters were: a = 20 cm, b = 5 cm, T=1.6s, t1=T/4, t2=T/2, t3=3T/4. B. Plain line (SDNm known in the controller and estimator); dashed line (SDNm known in the estimator only); dotted line (SDNm known in the controller only). In each case, σSDNm = 0.4, SINs was present (σSINs = 0.1) and known, and SINm was known. The gray line shows the absence of structured variability when both SDNm and SINm were unknown (σSDNm = 0.02). Variability has arbitrary units.

Taken together, these results indicate that SOFC is efficient due to its optimal feedback component, but deficient due to its cost function. A solution to this difficulty can be found in Nelson (1983). According to Nelson, skilled movements are built to satisfy both specific task-oriented objectives (measured, e.g., by errors) and a general “effort” objective. However, unlike in SOFC, task objectives can be considered as constraints (“hard” constraints) and not exclusively as costs, i.e. the effort objective is minimized only for cases when the constraints are satisfied (error is zero). In technical terms, we can consider a terminal optimal feedback controller (TOFC) rather than an optimal regulator (Bryson and Ho 1975). An open question is whether a terminal controller can appropriately master a stochastic problem.

3. Terminal optimal feedback control

We define a terminal optimal feedback controller in the presence of noise as a controller which plans optimal trajectories from ongoing estimated state to the target. Each trajectory is optimal in the sense that it is a series of optimally planned submovements. Since optimization operates on each single trajectory, but not across trajectories, the model is not optimal in a stochastic sense. We note that a TOFC is not a new type of controller, but in fact a classical controller in the engineering literature (Bryson and Ho 1975; see also Hoff and Arbib 1993 for a related model applied to motor control). The theory of TOFC is in fact the theory of optimal control with terminal constraints which is explained formally in Appendix B and for practical applications in the linear case in Appendix C. The theory was described with a general cost function (Eq. B.2). Actually, a quadratic function of controls, similar to the effort term of the cost function in SOFC (Eq. A.2), was used in the simulations, i.e.

\[
L[x(t), u(t)] = \| u(t) \|^2 .
\]
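The replanning principle can be illustrated with a toy example that is far simpler than the model used in the simulations. The sketch assumes a discrete single integrator with signal-dependent noise, a perfectly observed state (so the estimated state equals the true state), and a hypothetical function name. At each step the controller recomputes the minimum-effort command that would bring the current state exactly onto the target in the remaining steps, without any knowledge of the noise.

```python
import random


def tofc_trial(d, n, sigma_sdn, seed=0):
    """One trial of a toy terminal feedback controller.

    Plant (assumed): x_{k+1} = x_k + u_k * (1 + sigma_sdn * eps_k),
    with eps_k drawn from a standard normal distribution.
    At each step the controller replans the minimum-effort command that
    brings the current state to d in the remaining steps, ignoring noise:
    u_k = (d - x_k) / (n - k).
    """
    rng = random.Random(seed)
    x, path = 0.0, [0.0]
    for k in range(n):
        u = (d - x) / (n - k)
        x += u * (1.0 + sigma_sdn * rng.gauss(0.0, 1.0))
        path.append(x)
    return path


if __name__ == "__main__":
    for seed in range(3):
        print(f"seed {seed}: endpoint = {tofc_trial(10.0, 30, 0.2, seed)[-1]:.3f}")
```

Each trial follows a different trajectory, yet every replanned command is optimal for the deterministic problem posed at that step, which is the sense in which the controller is optimal for single trajectories but not across trajectories.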

We first note that SOFC and TOFC have a qualitatively similar behavior at the level of individual movements. For instance, for point-to-point movements, they generate straight trajectories with bell-shaped velocity profiles. The main difference is that results obtained with TOFC do not depend on parameters (such as r, wv, wf in SOFC). We have shown previously that TOFC is appropriate to provide uniquely defined emergent kinematic properties for kinematically redundant problems (Guigon et al. 2007). Briefly, the model gives a quantitative account of trajectories (e.g. curvature), velocity profiles, and final postures of pointing and grasping movements, and explains kinematic invariance for amplitude and load.

We replicated the preceding simulations on the structure of variability with TOFC. The line-pointing task raises no difficulty: emergence of the uncontrolled manifold is shown in Fig. 5A, B. The tests described in Fig. 3 gave similar results with TOFC. The case of the via-point task is more problematic. There are two ways to force a movement through via-points. On the one hand, distance to these points can be introduced in the cost function of the problem. This solution is similar to the mixed error/effort function of SOFC and is not satisfactory. On the other hand, the via-points can be entered as constraints similar to initial and final positions. Again, this method is not fully convincing since, in the presence of noise, we need to know at each time which via-points remain to be considered. To circumvent this difficulty, we assumed that the trajectory successfully passes through a via-point if the estimated position passes close enough to the via-point (i.e. in an arbitrary region around the via-point; see the caption of Fig. 5 for details). On this basis, we replicated the structured variability in the via-point task in Fig. 5C. Yet, the question remains how such a task can be appropriately modeled in the SOFC or TOFC framework.

Fig. 5. A. Variation in the aspect ratio of the terminal variability ellipse as a function of σSDNm in the line-pointing task with TOFC. B. Example of an uncontrolled manifold (same movement as in Fig. 3B). σSDNm = 0.4, σSINs = 0.8 (known). C. Same as in Fig. 4 for TOFC (σSDNm = 0.4, σSINs = 0.4). The task was solved in the following way. We decided that a via-point was reached when the estimated position of the system entered a 5 mm-radius circle around the via-point. Starting from P0 at t = 0, we computed the optimal trajectory which goes through P1 at t1 and through P2 at t2. The trajectory evolved and approached P1. Once P1 was reached (according to the preceding criterion), we computed the optimal trajectory which goes through P2 at t2 and through P3 at t3. The procedure was repeated until P4 was reached.
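The switching rule described in the caption above can be sketched as follows. Only the bookkeeping is shown (which via-points are handed to the planner at a given time), not the optimal control computation itself. The function names, the use of metres, and the list layout are assumptions made for illustration.

```python
import math


def advance_via_point(estimate, via_points, next_index, radius=0.005):
    """Return the index of the next via-point still to be reached.

    estimate   : (x, y) estimated position, in metres
    via_points : ordered list of (x, y) via-points, ending with the target
    next_index : index of the via-point currently aimed at
    radius     : capture radius (5 mm, as in the caption)
    The final target is never skipped: the index saturates at the last entry.
    """
    last = len(via_points) - 1
    if next_index < last:
        vx, vy = via_points[next_index]
        if math.hypot(estimate[0] - vx, estimate[1] - vy) <= radius:
            return next_index + 1
    return next_index


def planning_window(via_points, next_index):
    """Via-points given to the planner: the current one and the one after it."""
    return via_points[next_index:next_index + 2]


if __name__ == "__main__":
    a, b = 0.20, 0.05  # task geometry of Fig. 4, in metres
    via = [(a / 4, b), (a / 2, 0.0), (3 * a / 4, b), (a, 0.0)]
    idx = advance_via_point((0.048, 0.051), via, 0)
    print(idx, planning_window(via, idx))
```

Because the test uses the estimated rather than the actual position, the moment at which the planner switches to the next pair of via-points varies from trial to trial.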

These results provide further support to the analysis of the preceding section. Structured variability can be obtained with an optimal feedback controller which is unaware of noise.

4. Amplitude/duration scaling

An important issue is the possible origin of amplitude/duration scaling in stochastic optimal control models. Scaling can result from time minimization to match a given level of terminal variability (Meyer et al. 1988; Harris and Wolpert 1998). However, this solution predicts that scaling is associated with constant terminal variability. Experimental observations show that variability can increase with movement amplitude for series of movements obeying an amplitude/duration scaling law (Gordon et al. 1994a). Here, we explore an alternative (but not mutually exclusive) solution to scaling in the framework of TOFC, based on time minimization to match a given level of effort.

We first consider control in the absence of noise. In this case, there exists a monotonic (and thus invertible) relationship between the effort associated with a movement and its duration for a given amplitude (Fig. 6A). Thus a movement can be uniquely specified by its effort level. This property is formally stated as follows. The ongoing effort can be used as an additional state variable (z; Appendix B). Specification of movement duration (T) can be replaced by specification of total effort (z(T) = zT), which is a classical boundary condition. Movement duration emerges from an optimal control problem with open final time (Bryson and Ho 1975). This open final time terminal controller is also an optimal feedback controller if the effort-to-go at each processing step (calculated as the difference between the total effort zT and the already spent effort) is used as an initial condition, and thereby as an implicit indication of the remaining time. In this framework, amplitude/duration scaling occurs when an amplitude/effort relationship is chosen. The simplest relationship is a constant effort (although other relationships could be used; see Discussion). We applied this relationship to movements of different amplitudes and online corrections of these movements (Fig. 6B; Pélisson et al. 1986). It predicted amplitude/duration scaling for the unperturbed movements and a linear change in durations for the corrected movements (Fig. 6C).
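The invertibility argument can be made concrete on a textbook case that is much simpler than the arm model simulated here. The sketch assumes a noise-free point mass with dynamics x'' = u, a rest-to-rest movement of amplitude d and duration T, and effort defined as the integral of u². Under these assumptions the minimum-effort command is u(t) = (6d/T²)(1 − 2t/T) and the effort is 12d²/T³. Function names are hypothetical.

```python
def min_effort(d, T):
    """Effort (integral of u**2) of the minimum-effort rest-to-rest movement
    of amplitude d and duration T for the assumed plant x'' = u."""
    return 12.0 * d ** 2 / T ** 3


def duration_for_effort(d, effort):
    """Invert the monotonic effort/duration relationship for amplitude d."""
    return (12.0 * d ** 2 / effort) ** (1.0 / 3.0)


def peak_velocity(d, T):
    """Peak of the velocity profile v(t) = (6*d/T) * (t/T) * (1 - t/T)."""
    return 1.5 * d / T


if __name__ == "__main__":
    effort = min_effort(0.30, 0.50)  # effort level of a 30 cm, 500 ms movement
    for d in (0.30, 0.40, 0.50, 0.60):
        T = duration_for_effort(d, effort)
        print(f"d={d:.2f} m  T={T:.3f} s  peak velocity={peak_velocity(d, T):.3f} m/s")
```

In this toy case constant effort gives a duration growing as d^(2/3) and a peak velocity growing as d^(1/3). The exponents are specific to the assumed plant and are not claimed to match the simulations of Fig. 6.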

Fig. 6. Movements and on-line movement corrections at constant effort in TOFC. A. Movement effort as a function of movement duration for different amplitudes (from bottom to top: 30, 40, 50, 60 cm). The horizontal gray line indicates the effort level used in B and C. B. Velocity profiles of a normal (gray) and a perturbed (black) movement. Amplitude was 30 cm and direction was 45°. At t = 50 ms, the target was displaced by 4 cm in the direction of movement. C. Variations in movement duration with amplitude for normal (○) and perturbed (square) movements (an arrow indicates the direction of the perturbation). Movements of 30, 40, 50 cm were simulated (45°). Same perturbation as in B.

In the presence of noise, amplitude and total effort level are deterministic quantities which are used initially as boundary conditions. In each single trial, the effort-to-go is a well defined quantity which is used to determine the remaining time at each step. Thus the functioning of the open final time TOFC is similar in noise-free and noisy conditions. The sole difference is that, across trials, the effort-to-go is a random variable in the latter condition, and so is the movement time. We simulated the open final time TOFC in the presence of noise. The constant effort condition led to the expected amplitude/duration scaling (Fig. 7A) and amplitude/peak velocity scaling (Fig. 7B). Here, the duration and peak velocity are mean quantities. We observed that terminal variability (s.d.) varied linearly with peak velocity (Fig. 7C) as was observed experimentally (Meyer et al. 1988; Burdet and Milner 1998; Novak et al. 2000). For comparison, we replicated these simulations for nonconstant effort conditions (Fig. 7A, inset). These conditions also produced amplitude/duration and amplitude/peak velocity scaling (Fig. 7A,B), but nonlinear changes in terminal variability with peak velocity (Fig. 7C).

Fig. 7. Open final time TOFC in the presence of noise (σSDNm = 0.5, σSINs = 0.15, σSDNs = 0.5). A. Amplitude/mean duration scaling for constant effort (circle) and nonconstant effort (open and closed square) conditions. Inset: amplitude/effort for the 3 conditions. B. Amplitude/mean peak velocity scaling for the data in A. C. Changes in terminal accuracy (measured as the square root of the surface of the variability ellipse) with peak velocity for the data in A. D. Pattern of variability for unperturbed movements in Fig. 6C in different noise conditions (black: σSDNm = 0.2, σSINs = 0.5, σSDNs = 1; gray: σSDNm = 0.2, σSINs = 0.05, σSDNs = 1).

We also observed that scaling was not associated with a specific pattern of terminal variability (Fig. 7D). In one simulation (black lines), variability increased with movement amplitude, but other patterns can be found in other noise conditions (gray lines). We note that the goal here was not to account for a particular pattern of variability (e.g. Gordon et al. 1994a; van Beers et al. 2004), but simply to illustrate the dissociation between scaling and variability.

5. Discussion

The influence of noise is central to current approaches to motor control (Harris and Wolpert 1998; Todorov and Jordan 2002; Saunders and Knill 2004). A critical issue is the role of noise in the emergence of motor behaviors. Todorov and Jordan (2002) proposed that motor controllers in the brain act as stochastic optimal feedback controllers (SOFCs), and provided strong theoretical and experimental arguments that support this idea. The main difficulty with this proposal is that a SOFC has no kinematic competency. Historically, optimality principles have been used in the framework of motor control to identify unique solutions to redundant problems (trajectory formation, distribution of muscle forces, …). Since SOFC optimizes a parameter-dependent cost function, it generates an infinite number of reasonable solutions to redundancy. A supplementary principle is needed to choose among these solutions. Todorov (2005) proposed to scale the parameters of the cost function with movement amplitude and duration. Although this scaling is probably efficient, it means that the model should contain some knowledge of its functioning (i.e. how biases, trajectories, velocity profiles depend on the parameters), or some criterion to evaluate its functioning (what is a normal bias, a normal trajectory or a normal velocity profile?). This point contradicts a major premise of the SOFC model that motor behavior arises from the specification of a behavioral goal.

To circumvent this difficulty, we tested the idea that a different type of controller (TOFC), which treats errors and effort separately, could be used to obtain both kinematic and stochastic competencies. On the one hand, TOFC generates unique kinematic behaviors since its cost function has no parameters (Guigon et al. 2007). It should be noted here that satisfying a hard constraint (zero terminal error) does not preclude the existence of a terminal bias. In fact, errors are measured relative to the estimated state of the system which need not in general be similar to the real state. On the other hand, our results show that structured variability (as defined by Todorov and Jordan 2002) could result from the action of a deterministic controller coupled with an optimal state estimator. These results are not sufficient to conclude that TOFC has a real stochastic competency. It is possible that a truly stochastic controller is necessary to account for motor variability in some experimental conditions. Although we cannot reject this possibility, our analysis shows that a critical component which allows a stochastic controller to master a stochastic system, i.e. an efficient state estimator, is also present in TOFC. For the time being, we can conclude that TOFC has more kinematic competency, but not less stochastic competency than SOFC.

A critical issue for models of motor control is to explain the scaling between amplitude and duration. In TOFC, scaling occurs for movements which have the same effort. This idea is closely related to the emergence of scaling for movements of identical terminal variance in the MV model of Harris and Wolpert (1998). The main difference between effort and terminal variance is the variability pattern prescribed by the scaling law. In the latter case, variability is, by construction, constant. In the former case, variability changes with movement amplitude with a pattern which depends on the structure of noise. Constant variance and constant effort are the same criterion in MV in the presence of SDNm. In this case, the optimal command is the smallest command since noise increases with the size of the command. In TOFC, constant variance and constant effort are different criteria, which suggests that this framework could be more appropriate than MV to explain amplitude/duration scaling.
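The equivalence between variance and effort under signal-dependent noise can be verified on an open-loop toy case, which is a simplification and not the MV model itself. The sketch assumes that the endpoint is the sum of commands u_k, each corrupted by independent multiplicative noise of standard deviation sigma, so that the terminal variance is sigma² times the sum of squared commands. Function names are hypothetical.

```python
def effort(commands):
    """Discrete effort: sum of squared commands."""
    return sum(u * u for u in commands)


def terminal_variance(commands, sigma):
    """Variance of x_N = sum(u_k * (1 + sigma * eps_k)) for independent
    unit-variance eps_k: each step contributes (sigma * u_k)**2."""
    return sigma ** 2 * sum(u * u for u in commands)


if __name__ == "__main__":
    spread = [1.0, 1.0, 1.0, 1.0]  # same mean endpoint (4.0), small commands
    burst = [4.0, 0.0, 0.0, 0.0]   # same mean endpoint, one large command
    for name, u in (("spread", spread), ("burst", burst)):
        print(name, "effort:", effort(u), "variance:", terminal_variance(u, 0.2))
```

In this open-loop setting ranking command sequences by terminal variance or by effort gives the same order. With feedback and other noise sources, as in TOFC, the two quantities are no longer proportional.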

The proposal that scaling is associated with constant effort (or constant variability in the MV model) was made for simplicity. However, many different relationships between amplitude and effort lead to scaling. The main interest of the constant effort is its ecological interpretation: it can be considered as the largest effort that can be allotted to a single motor act in a series (e.g. experimental session, day, race, …) given the number of repetitions of this act and the available resources (control, energy, …). Furthermore, constant effort predicts that variability increases linearly with peak velocity (Meyer et al. 1988; Burdet and Milner 1998; Novak et al. 2000). This relationship is in general nonlinear when the effort is not constant. An interesting alternative to explain scaling is simultaneous minimization of time and effort (Hoff 1994) using

$$T + \rho \int_0^T u(t)^T u(t)\, dt$$
(1)

as a cost function. Here, ρ is a parameter which sets the trade-off between time and effort, and defines an amplitude/duration relationship. The two approaches lead to quite similar results. Yet, the specification of effort appears more principled than the specification of a hidden parameter (ρ).
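As a hedged illustration of how Eq. 1 defines an amplitude/duration relationship, the sketch below assumes a unit point mass driven by a force u. This plant is an assumption made only for the example; it is not the arm model used in the paper. For a rest-to-rest movement of amplitude A and duration T, the minimum of the integral of u² is 12A²/T³, so Eq. 1 becomes a function of T alone. A constant-effort rule is shown alongside for comparison. All function names are hypothetical.

```python
def min_effort(amplitude, duration):
    """Minimum of the integral of u(t)^2 for a rest-to-rest movement of a
    unit point mass (assumed plant): 12 * A^2 / T^3."""
    return 12.0 * amplitude ** 2 / duration ** 3


def time_effort_cost(amplitude, duration, rho):
    """Eq. 1 evaluated along the minimum-effort trajectory."""
    return duration + rho * min_effort(amplitude, duration)


def duration_time_effort(amplitude, rho):
    """Duration minimizing Eq. 1: solve 1 - 36 * rho * A^2 / T^4 = 0."""
    return (36.0 * rho * amplitude ** 2) ** 0.25


def duration_constant_effort(amplitude, effort):
    """Duration at which the minimum effort equals a prescribed value."""
    return (12.0 * amplitude ** 2 / effort) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Illustrative amplitudes and parameter values (arbitrary units).
    for amplitude in (0.1, 0.2, 0.4):
        print(amplitude,
              duration_time_effort(amplitude, rho=0.01),
              duration_constant_effort(amplitude, effort=10.0))
```

Under these assumptions, duration grows as the square root of amplitude with Eq. 1 and as amplitude to the power 2/3 with constant effort. Both rules therefore yield a monotonic amplitude/duration scaling, with ρ or the effort level setting the overall speed.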

Our results show that TOFC is an interesting alternative to SOFC. On the one hand, the separation of constraints (error) and objectives (effort) relieves the difficulties of the mixed error/effort cost. On the other hand, although it is not stochastically optimal, TOFC can account for the structure of motor variability much like SOFC. Thus TOFC could be an appropriate framework for a unified approach to motor control which would simultaneously account for mean characteristics of motor behavior (e.g. kinematic invariants; Guigon et al. 2007) and structure of motor variability. More generally, TOFC provides a principled solution to the Bernstein problem. Interestingly, this problem raises fundamental questions in the framework of computational neuroscience: How does the nervous system tackle redundancy (Wolpert and Ghahramani 2000)? What is the nature and influence of noise on sensory and motor information processing (Meyer et al. 1988; Harris and Wolpert 1998; van Beers et al. 2004; Stein et al. 2005)? How does the nervous system control motor behavior in the presence of noise (Meyer et al. 1988; Harris and Wolpert 1998; Todorov 2004; Trommershäuser et al. 2005; Chhabra and Jacobs 2006b)? How does the nervous system perform state estimation and multimodal integration on noisy information (Wolpert et al. 1995; van Beers et al. 1999; Knill and Pouget 2004)? Thus the Bernstein problem is a fundamental computational problem that goes far beyond motor control. The present results should be of interest in a broad framework which encompasses experimental and theoretical studies of behavioral variability.

A. SOFC

Derivation of the following results is found in Todorov (2005). We consider the stochastic optimal feedback control problem defined by the noisy dynamics

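$$\begin{cases}
x_{t+\delta} = A x_t + B u_t + n_t^p & \text{(process)} \\
y_t = H x_t + n_t^o & \text{(observation)} \\
\hat{x}_{t+\delta} = A \hat{x}_t + B u_t + K_t (y_t - H \hat{x}_t) & \text{(estimation)} \\
n_t^p = \xi_t + \sum_{i=1}^{c} \varepsilon_t^i C_i u_t & \text{(process noise)} \\
n_t^o = \omega_t + \sum_{i=1}^{d} \epsilon_t^i D_i x_t & \text{(estimation noise)}
\end{cases}$$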
(A.1)

where $x_t \in \mathbb{R}^n$ is the state of the controlled system, $u_t \in \mathbb{R}^m$ a control signal, $A$ the $n \times n$ process matrix, $B$ the $n \times m$ control matrix, $t = (0, \delta, \ldots, N\delta = T)$, $\delta$ the discretization timestep, $N$ the number of time steps, $T$ the duration of process simulation, $y_t \in \mathbb{R}^p$ the observation vector, $H$ the $p \times n$ observation matrix, $\hat{x}_t$ the state estimate, $K_t$ the Kalman gain, $\xi_t$ an $n$-dimensional zero-mean Gaussian random vector with covariance matrix $\Omega_\xi$ (signal-independent motor noise; SINm), $\varepsilon_t = [\varepsilon_t^1 \ldots \varepsilon_t^c]$ a zero-mean Gaussian random vector with covariance matrix $\Omega_\varepsilon$ (signal-dependent motor noise; SDNm), $[C_1 \ldots C_c]$ a set of $n \times m$ matrices, $\omega_t$ a $p$-dimensional zero-mean Gaussian random vector with covariance matrix $\Omega_\omega$ (signal-independent sensory noise; SINs), $\epsilon_t = [\epsilon_t^1 \ldots \epsilon_t^d]$ a zero-mean Gaussian random vector with covariance matrix $\Omega_\epsilon$ (signal-dependent sensory noise; SDNs), $[D_1 \ldots D_d]$ a set of $p \times n$ matrices, and the cost function

$$E\left[\underbrace{\sum_{t=0}^{T} x_t^T Q_t x_t}_{\text{error}} + \underbrace{\sum_{t=0}^{T} u_t^T R u_t}_{\text{effort}}\right],$$
(A.2)

where $E$ is the expectation over noise (SINm, SDNm, SINs, SDNs), $Q_t$ a task-specific error matrix and $R$ an effort penalty matrix. The superscript $T$ denotes the transpose of a vector or a matrix.

The controller is

$$u_t = -L_t \hat{x}_t,$$
(A.3)

where

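$$\begin{cases}
L_t = \left(R + B^T S^x_{t+\delta} B + \sum_i C_i^T \Omega_\varepsilon^T \left(S^x_{t+\delta} + S^e_{t+\delta}\right) \Omega_\varepsilon C_i\right)^{-1} B^T S^x_{t+\delta} A \\
S^x_t = Q_t + A^T S^x_{t+\delta} (A - B L_t) + \sum_i D_i^T \Omega_\epsilon^T K_t^T S^e_{t+\delta} K_t \Omega_\epsilon D_i \\
S^e_t = A^T S^x_{t+\delta} B L_t + (A - K_t H)^T S^e_{t+\delta} (A - K_t H)
\end{cases}$$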
(A.4)

with $S^x_N = Q_T$ and $S^e_N = O_{n \times n}$ (null $n \times n$ matrix). The Kalman filter is

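$$\begin{cases}
K_t = A \Sigma^e_t H^T \left(H \Sigma^e_t H^T + \Omega_\omega + \sum_i \Omega_\epsilon D_i \left(\Sigma^e_t + \Sigma^{\hat{x}}_t + \Sigma^{\hat{x}e}_t + \Sigma^{e\hat{x}}_t\right) D_i^T \Omega_\epsilon^T\right)^{-1} \\
\Sigma^e_{t+\delta} = \Omega_\xi + (A - K_t H) \Sigma^e_t A^T + \sum_i \Omega_\varepsilon C_i L_t \Sigma^{\hat{x}}_t L_t^T C_i^T \Omega_\varepsilon^T \\
\Sigma^{\hat{x}}_{t+\delta} = K_t H \Sigma^e_t A^T + (A - B L_t) \Sigma^{\hat{x}}_t (A - B L_t)^T + (A - B L_t) \Sigma^{\hat{x}e}_t H^T K_t^T + K_t H \Sigma^{e\hat{x}}_t (A - B L_t)^T \\
\Sigma^{\hat{x}e}_{t+\delta} = (A - B L_t) \Sigma^{\hat{x}e}_t (A - K_t H)^T
\end{cases}$$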
(A.5)

with $\Sigma^e_0 = \Sigma_0$, $\Sigma^{\hat{x}}_0 = \hat{x}_0 \hat{x}_0^T$, and $\Sigma^{\hat{x}e}_0 = O_{n \times n}$. See Todorov (2005) for the iterative solution to Eq. A.4 and Eq. A.5.
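To make the structure of Eq. A.1 and Eq. A.3 concrete, here is a minimal scalar simulation sketch. It assumes fixed, user-supplied gains (`gain_l`, `gain_k`) rather than the time-varying gains of Eq. A.4 and Eq. A.5, and it collapses each signal-dependent noise sum to a single term with scaling coefficients `c` and `d`. The function name and all parameter names are hypothetical.

```python
import random


def simulate_scalar_loop(a, b, h, gain_l, gain_k, x0, xhat0, n_steps,
                         sd_xi=0.0, sd_omega=0.0, c=0.0, d=0.0, seed=0):
    """Simulate a scalar instance of the noisy dynamics (Eq. A.1) with the
    linear feedback law u_t = -L * xhat_t (Eq. A.3).

    gain_l and gain_k are fixed gains (an assumption of this sketch); the
    paper obtains time-varying L_t and K_t from Eq. A.4 and Eq. A.5.
    """
    rng = random.Random(seed)
    x, xhat = x0, xhat0
    states, estimates = [x], [xhat]
    for _ in range(n_steps):
        u = -gain_l * xhat
        # Process noise: additive term plus a term scaled by the command.
        n_p = rng.gauss(0.0, sd_xi) + rng.gauss(0.0, 1.0) * c * u
        # Observation noise: additive term plus a term scaled by the state.
        n_o = rng.gauss(0.0, sd_omega) + rng.gauss(0.0, 1.0) * d * x
        y = h * x + n_o
        # The estimator update uses the current observation and estimate.
        xhat_next = a * xhat + b * u + gain_k * (y - h * xhat)
        x = a * x + b * u + n_p
        xhat = xhat_next
        states.append(x)
        estimates.append(xhat)
    return states, estimates
```

With all noise terms set to zero and a correct initial estimate, the estimate tracks the state exactly and the loop reduces to x_{t+δ} = (a − b·L)·x_t. Setting `c` above zero makes process noise grow with the size of the command, which is the SDNm case discussed in the text.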

B. Optimal control with terminal constraints: Formal

Here, we briefly recall some textbook notions on optimal control problems with terminal constraints in the general nonlinear case (Bryson 1999) and nonlinear state estimation (Goodwin and Sin 1984).

Formulation of the problem

We consider a dynamical system

$$\dot{x}(t) = f[x(t), u(t)]$$
(B.1)

where $x \in \mathbb{R}^n$ is the state of the system and $u \in \mathbb{R}^m$ a control vector. An optimal control problem for this system is to find the control vector $u(t)$ for $t \in [t_0; t_f]$ that minimizes a performance index

$$J = \varphi[x(t_f)] + \int_{t_0}^{t_f} L[x(t), u(t)]\, dt$$
(B.2)

subject to Eq. B.1, with boundary conditions

$$x(t_0) = x_0 \qquad \psi[x(t_f)] = 0.$$
(B.3)

For the optimal control problem defined by Eq. B.1, Eq. B.2 and Eq. B.3, we introduce a supplementary state variable $z$ defined by

$$\dot{z}(t) = L[x(t), u(t)]$$

and $z(t_0) = 0$. Thus $z(t_f)$ is the integral term of the performance index (Eq. B.2). We define the new state variable

$$\tilde{x}(t) = \begin{pmatrix} z(t) \\ x(t) \end{pmatrix}.$$

We can reformulate the optimal control problem in the following way: find the control vector u(t) to minimize

$$\tilde{J} = \tilde{\varphi}[\tilde{x}(t_f)] = z(t_f) + \varphi[x(t_f)]$$
(B.4)

subject to

$$\dot{\tilde{x}}(t) = \tilde{f}[\tilde{x}(t), u(t)] = \begin{pmatrix} L[x(t), u(t)] \\ f[x(t), u(t)] \end{pmatrix}$$
(B.5)

and

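$$\tilde{x}(t_0) = \tilde{x}_0 = \begin{pmatrix} 0 \\ x_0 \end{pmatrix} \qquad \tilde{\psi}[\tilde{x}(t_f)] = \begin{pmatrix} 0 \\ \psi[x(t_f)] \end{pmatrix} = 0.$$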
(B.6)

Thus we can remove the integral term in the performance index. This formulation (Mayer formulation) is simpler for numerical methods.
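The state augmentation behind the Mayer formulation can be sketched in a few lines. The example below is an illustration under simple assumptions (scalar state, forward-Euler integration, an open-loop command given as a function of time); the helper names `to_mayer` and `integrate_euler` are hypothetical.

```python
def to_mayer(f, running_cost):
    """Return augmented dynamics for the state (z, x), where dz/dt is the
    running cost L[x, u] and dx/dt is the original dynamics f[x, u]."""
    def f_aug(state, u):
        z, x = state
        return (running_cost(x, u), f(x, u))
    return f_aug


def integrate_euler(f_aug, state0, control, t0, tf, n_steps):
    """Forward-Euler integration of the augmented system (scalar x)."""
    dt = (tf - t0) / n_steps
    z, x = state0
    for k in range(n_steps):
        t = t0 + k * dt
        dz, dx = f_aug((z, x), control(t))
        z, x = z + dt * dz, x + dt * dx
    return z, x


if __name__ == "__main__":
    # Hypothetical example: integrator dx/dt = u with running cost u^2.
    f_aug = to_mayer(lambda x, u: u, lambda x, u: u * u)
    z_f, x_f = integrate_euler(f_aug, (0.0, 0.0), lambda t: t, 0.0, 1.0, 10000)
    print(z_f, x_f)  # z_f approximates the integral of t^2 over [0, 1]
```

After integration, the first component of the augmented state holds the accumulated running cost, so the performance index depends only on the terminal state, which is what makes this form convenient for numerical methods.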

Solution

Here, we consider the optimal control problem defined by Eq. B.4, Eq. B.5 and Eq. B.6. For simplicity, we remove the tilde sign. We adjoin the constraints to the performance index with Lagrange multipliers ν and λ(t)

$$\bar{J} = \varphi + \nu^T \psi + \int_{t_0}^{t_f} \lambda^T(t) \left\{ f[x(t), u(t)] - \dot{x}(t) \right\} dt.$$

The Hamiltonian function is

$$H[x(t), u(t), \lambda(t)] = H(t) = \lambda^T(t) f[x(t), u(t)].$$

The generalized performance index can be written

$$\bar{J} = \Phi[x(t_f)] - \lambda^T(t_f) x(t_f) + \lambda^T(t_0) x(t_0) + \int_{t_0}^{t_f} \left\{ H(t) + \dot{\lambda}^T(t) x(t) \right\} dt$$

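following integration of the $\lambda^T \dot{x}$ term by parts, where $\Phi = \varphi + \nu^T \psi$.

The first variation of $\bar{J}$ is

$$\delta \bar{J} = \left[ (\Phi_x - \lambda^T) \delta x \right]_{t=t_f} + \left[ \lambda^T \delta x \right]_{t=t_0} + \int_{t_0}^{t_f} \left[ (H_x + \dot{\lambda}^T) \delta x + H_u \delta u \right] dt$$

for variations $\delta x(t)$ and $\delta u(t)$. The Lagrange multipliers are chosen so that the coefficients of $\delta x(t)$ and $\delta x(t_f)$ vanish: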

$$\dot{\lambda}^T = -H_x = -\lambda^T f_x,$$
(B.7)

with boundary conditions

$$\lambda^T(t_f) = \varphi_x(t_f) + \nu^T \psi_x(t_f).$$
(B.8)

For a stationary solution, $\delta \bar{J} = 0$ for arbitrary $\delta u(t)$, which implies

$$H_u = \lambda^T f_u = 0 \qquad t_0 \le t \le t_f.$$
(B.9)

The problem defined by Eq. B.1, Eq. B.7, Eq. B.8 and Eq. B.9 is a two-point boundary value problem which can be solved by classical integration methods (Bryson 1999).

Terminal feedback control and Extended Kalman filter

In the stochastic case, Eq. B.1 becomes

$$\dot{x}(t) = f[x(t), u(t), \xi(t), \varepsilon(t)]$$
(B.10)

and observation follows from

$$y(t) = h[x(t), \omega(t), \epsilon(t)].$$
(B.11)

To obtain a state estimator for Eq. B.10 and Eq. B.11, we need an extended Kalman filter (EKF), which is an extension of the Kalman filtering principle for optimal nonlinear estimation. The EKF retains the linear calculation of the covariance and gain matrices of the Kalman filter, and updates the state estimate using a linear function of a filter residual. State propagation is done using the original nonlinear equation. Evolution of the covariance matrix P(t) (n × n) is governed by

$$\dot{P}(t) = F(t) P(t) + P(t) F(t)^T + \Omega_\xi + G \Omega_\varepsilon G^T - K(t) H P(t) \qquad P(t_0) = P_0$$
(B.12)

where

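$$F(t) = \frac{\partial f[x(t), u(t), \xi(t), \varepsilon(t)]}{\partial x}, \qquad G(t) = \frac{\partial f[x(t), u(t), \xi(t), \varepsilon(t)]}{\partial \varepsilon} = \left( [C_i u(t)]^T \right),$$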

and K(t) is the Kalman gain

$$K(t) = P(t) H^T \left[ \Omega_\omega + J(t) \Omega_\epsilon J(t)^T \right]^{-1},$$

with

$$J(t) = \frac{\partial h[x(t), \omega(t), \epsilon(t)]}{\partial \epsilon} = \left( [D_i \hat{x}(t)]^T \right).$$

State propagation was governed by

$$\dot{\hat{x}}(t) = f[\hat{x}(t), u(t)] + K(t) \left[ y(t) - H \hat{x}(t) \right]$$

with

$$\hat{x}(t_0) = \hat{x}_0.$$

C. Optimal control with terminal constraints: Practical

In the linear case, the problem defined by Eq. B.7, Eq. B.8 and Eq. B.9 is a first-order linear dynamical system which can be solved explicitly. The solution consists of a $2n \times 2n$ matrix $\Gamma(t)$ such that

$$\begin{pmatrix} x(t) \\ \lambda(t) \end{pmatrix} = \Gamma(t) C$$
(C.1)

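is the solution at time $t$, where $C \in \mathbb{R}^{2n}$ is a vector determined by the boundary conditions (Eq. B.3). To simplify we use $\psi[x(t_f)] = x(t_f) - x_f$, but more complex boundary conditions can be handled as well. To obtain $C$, we write

$$\begin{pmatrix} x_0 \\ \lambda(t_0) \end{pmatrix} = \Gamma(t_0) C = \begin{pmatrix} \Gamma_{11}(t_0) & \Gamma_{12}(t_0) \\ \Gamma_{21}(t_0) & \Gamma_{22}(t_0) \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$$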

and

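$$\begin{pmatrix} x_f \\ \lambda(t_f) \end{pmatrix} = \Gamma(t_f) C = \begin{pmatrix} \Gamma_{11}(t_f) & \Gamma_{12}(t_f) \\ \Gamma_{21}(t_f) & \Gamma_{22}(t_f) \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}.$$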

Thus

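$$\begin{pmatrix} \Gamma_{11}(t_0) & \Gamma_{12}(t_0) \\ \Gamma_{11}(t_f) & \Gamma_{12}(t_f) \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = \begin{pmatrix} x_0 \\ x_f \end{pmatrix},$$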

which gives

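$$C = \begin{pmatrix} \Gamma_{11}(t_0) & \Gamma_{12}(t_0) \\ \Gamma_{11}(t_f) & \Gamma_{12}(t_f) \end{pmatrix}^{-1} \begin{pmatrix} x_0 \\ x_f \end{pmatrix}.$$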
(C.2)
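A scalar instance may help to see how Eq. C.1 and Eq. C.2 are used. The sketch below assumes a plant dx/dt = a·x + b·u with running cost u²/2 and a nonzero a, chosen purely for illustration. Stationarity (Eq. B.9) then gives u = −b·λ and Eq. B.7 gives dλ/dt = −a·λ, so Γ(t) has a closed form. All names are hypothetical.

```python
import math


def gamma(t, a, b):
    """Fundamental matrix of the state/costate system for the assumed scalar
    plant dx/dt = a*x + b*u with running cost u^2 / 2 (requires a != 0).
    Stationarity gives u = -b*lambda and d(lambda)/dt = -a*lambda."""
    return ((math.exp(a * t), b * b * math.exp(-a * t) / (2.0 * a)),
            (0.0, math.exp(-a * t)))


def solve_constants(a, b, t0, tf, x0, xf):
    """Eq. C.2: use the state rows of Gamma at t0 and tf to find C."""
    g0, gf = gamma(t0, a, b), gamma(tf, a, b)
    m11, m12 = g0[0]
    m21, m22 = gf[0]
    det = m11 * m22 - m12 * m21
    # Cramer's rule for the 2 x 2 system [m11 m12; m21 m22] C = (x0, xf).
    c1 = (x0 * m22 - m12 * xf) / det
    c2 = (m11 * xf - m21 * x0) / det
    return c1, c2


def state_costate_control(t, a, b, consts):
    """Eq. C.1 evaluated at time t, plus the open-loop command u = -b*lambda."""
    g = gamma(t, a, b)
    x = g[0][0] * consts[0] + g[0][1] * consts[1]
    lam = g[1][0] * consts[0] + g[1][1] * consts[1]
    return x, lam, -b * lam
```

Only the state rows of Γ enter the linear system because the boundary conditions constrain x at both ends and leave λ free. Once C is known, the state, the costate and the command are available in closed form at any time.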

A discretized version of the EKF was used with

$$K_t = F_t P_t H^T \left( H P_t H^T + \Omega_\omega + J_t \Omega_\epsilon J_t^T \right)^{-1},$$
(C.3)

and

$$P_{t+1} = F_t P_t F_t^T + \Omega_\xi + G_t \Omega_\varepsilon G_t^T - K_t (H P_t H^T) K_t^T.$$
(C.4)
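For a scalar system, one step of the discretized filter can be written out directly. The sketch below follows Eq. C.3 and Eq. C.4 as read here, with every matrix replaced by a scalar; the function and parameter names are hypothetical and the numerical values in any call are the user's own.

```python
def ekf_step(p, f, h, g, j, var_xi, var_eps_motor, var_omega, var_eps_sensory):
    """One scalar step of the discretized filter, following Eq. C.3 and C.4.

    p is the current error variance, f and h the (linearized) dynamics and
    observation coefficients, g and j the sensitivities to signal-dependent
    motor and sensory noise. All quantities are scalars in this sketch.
    """
    innovation_var = h * p * h + var_omega + j * var_eps_sensory * j
    gain = f * p * h / innovation_var                      # Eq. C.3
    p_next = (f * p * f + var_xi + g * var_eps_motor * g
              - gain * (h * p * h) * gain)                 # Eq. C.4
    return gain, p_next
```

In the scalar case, g plays the role of C·u and j the role of D·x̂, so a larger command inflates the predicted variance and a larger estimated state lowers the gain applied to the observation.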

References

  • Atkeson C, Hollerbach J. Kinematic features of unrestrained vertical arm movements. J Neurosci. 1985;5(9):2318–2330.
  • Bernstein N. The Co-ordination and Regulation of Movements. Oxford: Pergamon Press; 1967.
  • Bryson A. Dynamic Optimization. Englewood Cliffs, NJ: Prentice Hall; 1999.
  • Bryson A, Ho YC. Applied Optimal Control - Optimization, Estimation, and Control. New York: Hemisphere Publ Corp; 1975.
  • Burdet E, Milner T. Quantization of human motions and learning of accurate movements. Biol Cybern. 1998;78(4):307–318.
  • Chhabra M, Jacobs R. Near-optimal human adaptive control across different noise environments. J Neurosci. 2006a;26(42):10883–10887.
  • Chhabra M, Jacobs R. Properties of synergies arising from a theory of optimal motor behavior. Neural Comput. 2006b;18(10):2320–2342.
  • Desmurget M, Grafton S. Forward modeling allows feedback control for fast reaching movements. Trends Cogn Sci. 2000;4(11):423–431.
  • Goodwin G, Sin K. Adaptive Filtering Prediction and Control. Englewood Cliffs, NJ: Prentice-Hall; 1984.
  • Gordon J, Ghilardi M, Cooper S, Ghez C. Accuracy of planar reaching movements. I. Independence of direction and extent variability. Exp Brain Res. 1994a;99(1):97–111.
  • Gordon J, Ghilardi M, Cooper S, Ghez C. Accuracy of planar reaching movements. II. Systematic extent errors resulting from inertial anisotropy. Exp Brain Res. 1994b;99(1):112–130.
  • Guigon E, Baraduc P, Desmurget M. Computational motor control: Redundancy and invariance. J Neurophysiol. 2007;97(1):331–347.
  • Hamilton A, Wolpert D. Controlling the statistics of action: Obstacle avoidance. J Neurophysiol. 2002;87(5):2434–2440.
  • Harris C, Wolpert D. Signal-dependent noise determines motor planning. Nature. 1998;394:780–784.
  • Hoff B. A model of duration in normal and perturbed reaching movement. Biol Cybern. 1994;71(6):481–488.
  • Hoff B, Arbib M. Models of trajectory formation and temporal interaction of reach and grasp. J Mot Behav. 1993;25(3):175–192.
  • Knill D, Pouget A. The bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci. 2004;27(12):712–719.
  • Li W. Optimal control for biological movement systems. PhD thesis. University of California, San Diego; 2006.
  • Meyer D, Abrams R, Kornblum S, Wright C, Smith J. Optimality in human motor performance: Ideal control of rapid aimed movement. Psychol Rev. 1988;95(3):340–370.
  • Nelson W. Physical principles for economies of skilled movements. Biol Cybern. 1983;46(2):135–147.
  • Novak K, Miller L, Houk J. Kinematic properties of rapid hand movements in a knob turning task. Exp Brain Res. 2000;132(4):419–433.
  • Pélisson D, Prablanc C, Goodale M, Jeannerod M. Visual control of reaching movements without vision of the limb. II. Evidence of fast unconscious processes correcting the trajectory of the hand to the final position of a double-step stimulus. Exp Brain Res. 1986;62(2):303–311.
  • Saunders J, Knill D. Visual feedback control of hand movements. J Neurosci. 2004;24(13):3223–3234.
  • Schaal S, Schweighofer N. Computational motor control in humans and robots. Curr Opin Neurobiol. 2005;15(6):675–682.
  • Scholz J, Schöner G. The uncontrolled manifold concept: Identifying control variables for a functional task. Exp Brain Res. 1999;126(3):289–306.
  • Scholz J, Schöner G, Latash M. Identifying the control structure of multijoint coordination during pistol shooting. Exp Brain Res. 2000;135(3):382–404.
  • Stein R, Gossen E, Jones K. Neuronal variability: Noise or part of the signal? Nat Rev Neurosci. 2005;6(5):389–397.
  • Todorov E. Optimality principles in sensorimotor control. Nat Neurosci. 2004;7(9):907–915.
  • Todorov E. Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural Comput. 2005;17(5):1084–1108.
  • Todorov E, Jordan M. Optimal feedback control as a theory of motor coordination. Nat Neurosci. 2002;5(11):1226–1235.
  • Todorov E, Li W. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. Proc American Control Conference. 2005; pp. 300–306.
  • Torres E, Zipser D. Simultaneous control of hand displacements and rotations in orientation-matching experiments. J Appl Physiol. 2004;96(5):1978–1987.
  • Trommershäuser J, Gepshtein S, Maloney L, Landy M, Banks M. Optimal compensation for changes in task-relevant movement variability. J Neurosci. 2005;25(31):7169–7178.
  • van Beers R, Haggard P, Wolpert D. The role of execution noise in movement variability. J Neurophysiol. 2004;91(2):1050–1063.
  • van Beers R, Sittig A, Denier van der Gon J. Integration of proprioceptive and visual position-information: An experimentally supported model. J Neurophysiol. 1999;81(3):1355–1364.
  • Wolpert D, Ghahramani Z. Computational principles of movement neuroscience. Nat Neurosci Suppl. 2000;3:1212–1217.
  • Wolpert D, Ghahramani Z, Jordan M. An internal model for sensorimotor integration. Science. 1995;269:1880–1882.
