Introduction
The challenge
Neuroscience has made great progress in decoding how cortical regions perform specific brain functions like primate vision (Kaas and Collins, 2003; Ballard and Zhang, 2021) and rodent navigation (Chersi and Burgess, 2015; Moser et al., 2017). Conversely, the evolutionarily much older motor control system still poses fundamental questions, despite a large body of experimental work. This is because, in mammals, in addition to areas in cortex like premotor and motor areas and, to some degree, sensory and parietal ones, many subcortical structures, such as the cerebellum, basal ganglia, and spinal cord, are also involved.
Fully understanding motor control will thus entail understanding the simultaneous function and interplay of all brain regions involved. Little by little, new experimental techniques will allow us to monitor more neurons, in more regions, and for longer periods (e.g. Tanaka et al., 2018). But to make sense of these data, computational models must step up to the task of integrating all those regions to create a functional neuronal machine.
Finally, relatively little is known about the neural basis of motor development.
With these challenges in mind, we recently developed a motor control framework based on differential Hebbian learning (Verduzco-Flores et al., 2022). A common theme in physiology is the control of homeostatic variables.
Our working hypothesis (see Verduzco-Flores et al., 2022) is that control of homeostatic variables requires a feedback controller that uses the muscles to produce a desired set of sensory perceptions. The motosensory loop, minimally containing motor cortex, spinal cord, and sensory cortex, may implement that feedback controller. To test this hypothesis we implemented a relatively complete model of the sensorimotor loop (Figure 1), using the learning rules in Verduzco-Flores et al., 2022 to produce 2D arm reaching. The activity of the neural populations and the movements they produced showed remarkable consistency with the experimental observations that we describe next.
Figure 1.
Main components of the model.
In the left panel, each box stands for a neural population, except for P, which represents the arm and the muscles. Arrows indicate static connections; open circles show plastic connections.
Relevant findings in motor control
Before describing our modeling approach, we summarize some of the relevant experimental data that will be important to understanding the results. We focus on three related issues: (1) the role of the spinal cord in movement, (2) the nature of representations in motor cortex, and (3) muscle synergies, and how the right pattern of muscle activity is produced.
For animals to move, spinal motoneurons must activate the skeletal muscles. In general, descending signals from the corticospinal tract do not activate the motoneurons directly, but instead provide input to a network of excitatory and inhibitory interneurons (Bizzi et al., 2000; Lemon, 2008; Arber, 2012; Asante and Martin, 2013; Alstermark and Isa, 2012; Jankowska, 2013; Wang et al., 2017; Ueno et al., 2018). Learning even simple behaviors involves long-term plasticity, both at the spinal cord (SC) circuit, and at higher regions of the motor hierarchy (Wolpaw et al., 1983; Grau, 2014; Meyer-Lohmann et al., 1986; Wolpaw, 1997; Norton and Wolpaw, 2018). Despite its obvious importance, there are comparatively few attempts to elucidate the nature of the SC computations, and the role of synaptic plasticity.
The role ascribed to the SC is closely related to the role assumed for motor cortex, particularly M1. One classic result is that M1 pyramidal neurons of macaques activate preferentially when the hand is moving in a particular direction. When the preferred directions of a large population of neurons are added as vectors, a population vector appears, which points close to the hand’s direction of motion (Georgopoulos et al., 1982; Georgopoulos et al., 1986). This launched the hypothesis that M1 represents kinematic, or other high-level parameters of the movement, which are transformed into movements in concert with the SC. This hypothesis mainly competes with the view that M1 represents muscle forces. Much research has been devoted to this issue (e.g. Kakei et al., 1999; Truccolo et al., 2008; Kalaska, 2009; Georgopoulos and Stefanis, 2007; Harrison and Murphy, 2012; Tanaka, 2016; Morrow and Miller, 2003; Todorov, 2000).
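The population-vector readout itself is easy to state computationally. The sketch below is illustrative only: the cosine tuning curves and cell count are invented for the example, not taken from the data or the model. It sums each cell's preferred-direction unit vector, weighted by its rate modulation above baseline:

```python
import numpy as np

def population_vector(preferred_dirs, rates, baselines):
    """Sum each cell's preferred-direction unit vector, weighted by its
    rate modulation above baseline (Georgopoulos-style readout)."""
    weights = rates - baselines                        # modulation per cell
    vecs = np.stack([np.cos(preferred_dirs), np.sin(preferred_dirs)], axis=1)
    return (weights[:, None] * vecs).sum(axis=0)       # 2D population vector

# Toy population: PDs spread uniformly, cosine tuning around a 45-degree movement
pds = np.linspace(0, 2 * np.pi, 100, endpoint=False)
movement = np.pi / 4
rates = 10 + 5 * np.cos(pds - movement)                # cosine tuning curves
pv = population_vector(pds, rates, baselines=np.full(100, 10.0))
angle = np.arctan2(pv[1], pv[0])                       # recovers the movement direction
```

With ideal cosine tuning and uniformly spread preferred directions, the recovered angle matches the movement direction exactly; real recordings only approximate this.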
Another important observation is that the preferred directions of motor neurons cluster around one main axis. As shown in Scott et al., 2001, this suggests that M1 is mainly concerned with dynamical aspects of the movement, rather than representing its kinematics.
A related observation is that the preferred directions in M1 neurons experience random drifts that overlap learned changes (Rokni et al., 2007; Padoa-Schioppa et al., 2004). This leads to the hypothesis that M1 is a redundant network that is constantly using feedback error signals to maintain its performance despite ongoing synaptic changes.
A different perspective for studying motor cortex is to focus on how it can produce movements, rather than describing its activity (Shenoy et al., 2013). One specific proposal is that motor cortex has a collection of pattern generators, and specific movements can be created by combining their activity (Shenoy et al., 2013; Sussillo et al., 2015). Experimental support for this hypothesis came through the surprising finding of rotational dynamics in motor cortex activity (Churchland et al., 2012), suggesting that oscillators with different frequencies are used to produce desired patterns. This raises the question of how the animal chooses its desired patterns of motion.
Selecting a given pattern of muscle activation requires transforming the coordinates of the task into the coordinates of the muscles.
The issue of coordinate transformation is central for motor control (Shadmehr and Wise, 2005; Schöner et al., 2018; Valero-Cuevas, 2009).
A promising candidate for motor primitives comes in the form of convergent force fields, which have been observed for the hindlimbs of frogs and rats (Giszter et al., 1993; Mussa-Ivaldi et al., 1994) and the forelimbs of monkeys (Yaron et al., 2020). In experiments where the limb is held at a particular location, local stimulation of the spinal cord will cause a force at the limb’s endpoint. The collection of these force vectors over all endpoint positions forms a force field, and these force fields have two important characteristics: (1) they have a unique fixed point, and (2) simultaneous stimulation of two spinal cord locations produces a force field that is the sum of the force fields from stimulating the two locations independently. It is argued that movement planning may be done in terms of force fields, since they can produce movements that are resistant to perturbations, and also permit a solution to the problem of coordinate transformation with redundant actuators (Mussa-Ivaldi and Bizzi, 2000).
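The superposition property lends itself to a compact numerical illustration. In the sketch below, each "stimulation site" is modeled as a simple spring-like field toward a fixed point — a deliberate simplification of the measured fields, with invented positions and stiffnesses. Summing two such fields yields another convergent field whose fixed point lies between the originals:

```python
import numpy as np

def convergent_field(points, fixed_point, stiffness=1.0):
    """Spring-like force field: at every endpoint position, the force points
    toward the field's fixed point (a crude convergent field)."""
    return stiffness * (fixed_point - points)

# Grid of hand positions (meters)
xs, ys = np.meshgrid(np.linspace(0, 0.6, 7), np.linspace(0, 0.6, 7))
points = np.stack([xs.ravel(), ys.ravel()], axis=1)

fA = convergent_field(points, np.array([0.2, 0.4]))   # stimulate site A
fB = convergent_field(points, np.array([0.4, 0.2]))   # stimulate site B
fAB = fA + fB                                         # co-stimulation: vector sum

# The summed field is again convergent, with a fixed point midway between
# the two originals ((0.3, 0.3) here) and twice the stiffness.
residual = np.abs(fAB - convergent_field(points, np.array([0.3, 0.3]), 2.0)).max()
```

For linear spring-like fields the superposition is exact; the experimental claim is that measured fields approximately share this additivity.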
The neural origin of synergies, and whether they are used by the motor system, is a matter of ongoing debate (Tresch and Jarc, 2009; de Rugy et al., 2013; Bizzi and Cheung, 2013). To us, it is of interest that single units found in the mouse (Levine et al., 2014) and monkey (Takei et al., 2017) spinal cord (sometimes called Motor Synergy Encoders, or MSEs) can reliably produce specific patterns of motoneuron activation.
Model concepts
We believe that it is impossible to understand the complex dynamical system in biological motor control without the help of computational modeling. Therefore, we set out to build a minimal model that could eventually control an autonomous agent, while still satisfying biological plausibility constraints.
Design principles and biological-plausibility constraints for neural network modeling have been proposed before (Pulvermüller et al., 2021; O’Reilly, 1998; Richards et al., 2019). Placing emphasis on the motor system, we compiled a set of characteristics that cover the majority of these constraints. Namely:
Spanning the whole sensorimotor loop.
Using only neural elements. Learning their connection strengths is part of the model.
Learning does not rely on a training dataset. It is instead done by synaptic elements using local information.
Learning arises from continuous-time interaction with a continuous-space environment.
There is a clear vision on how the model integrates with the rest of the brain in order to enact more general behavior.
Our aim is hierarchical control of homeostatic variables, with the spinal cord and motor cortex at the bottom of this hierarchy. At first glance, spinal plasticity poses a conundrum, because it changes the effect of corticospinal inputs. Cortex is playing a piano that keeps changing its tuning. A solution comes when we consider the corticospinal loop (e.g. the long-loop reflex) as a negative feedback control system, where the spinal cord activates the effectors to reduce an error. The role of cortex is to produce perceptual variables that are controllable, and can eventually improve homeostatic regulation. In this regard, our model is a variation of Perceptual Control Theory (Powers, 1973; Powers, 2005), but if the desired value of the controller is viewed as a prediction, then this approach resembles active inference models (Adams et al., 2013). Either way, the goal of the system is to reduce the difference between the desired and the perceived value of some variable.
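The core control idea can be reduced to a toy loop. The sketch below is a hypothetical minimal example (the variables, gain, and time constants are invented, not taken from the model): an effector integrates the error between a desired and a perceived value, and the perceived value lags behind the effector's command, so the loop drives perception toward the desired value.

```python
# Minimal negative-feedback sketch: the command integrates the error, and
# the perceived variable (e.g. a sensed muscle length) follows the command.
def simulate_loop(desired, steps=2000, dt=0.01, gain=1.0):
    perceived = 0.0          # hypothetical sensed variable
    command = 0.0            # hypothetical effector drive
    for _ in range(steps):
        error = desired - perceived
        command += gain * error * dt             # effector integrates the error
        perceived += (command - perceived) * dt  # plant lags behind the command
    return perceived

final = simulate_loop(desired=0.8)               # settles near the desired value
```

The loop behaves as a damped second-order system, so the perceived value converges to the desired one without needing to know the plant in detail — the property the model relies on.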
If cortex creates representations for perceptual variables, the sensorimotor loop must be configured so those variables can be controlled. This happens when the error in those variables activates the muscles in a way that brings the perceived value closer to the desired value. In other words, we must find the input-output structure of the feedback controller implicit in the long-loop reflex. We have found that this important problem can be solved by the differential Hebbian learning rules introduced in Verduzco-Flores et al., 2022. We favor the hypothesis that this learning takes place in the connections from motor cortex to interneurons and brainstem. Nevertheless, we show that all our results are valid if learning happens in the connections from sensory to motor cortex.
In the Results section we will describe our model, its variations, and how it can learn to reach. Next we will show that many phenomena described above are present in this model. These phenomena emerge from having a properly configured neural feedback controller with a sufficient degree of biological realism. This means that even if the synaptic weights of the connections are set by hand and are static, the phenomena still emerge, as long as the system is configured to reduce errors. In short, we show that a wealth of phenomena in motor control can be explained simply by feedback control in the sensorimotor loop, and that this feedback control can be configured in a flexible manner by the learning rules presented in Verduzco-Flores et al., 2022.
Results
A neural architecture for motor control
The model in this paper contains the main elements of the long-loop reflex, applied to the control of a planar arm using six muscles. The left panel of Figure 1 shows the architecture of the model, which contains 74 firing rate neurons organized in six populations. This architecture resembles a feedback controller that makes the activity in one neural population approach the activity in another population.
The six firing-rate neurons (called
represents a different cortical layer of the same somatosensory region as , where a ‘desired’ or ‘predicted’ activity has been caused by brain regions not represented in the model. Each firing rate neuron in has a corresponding unit in , and they represent the mean activity at different levels of the same microcolumn (Mountcastle, 1997). is a region (either in sensory or motor cortex) that conveys the difference between activities in and , which is the error signal to be minimized by negative feedback control.
Population represents sensory thalamus and dorsal parts of the spinal cord. It contains 18 units with logarithmic activation functions, each receiving an input from a muscle afferent. Each muscle provides proprioceptive feedback from models of the Ia, Ib, and II afferents. In rough terms, Ia afferents provide information about contraction velocity, Ib afferents signal the amount of tension in the muscle and tendons, and II afferents signal the muscle’s length.
Population represents motor cortex. Ascending inputs to arise from population , and use a variation of the
To represent positive and negative values, both and use a ‘dual representation’, where each error signal is represented by two units. Let be the error associated with the -th muscle. One of the two units representing
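A minimal sketch of such a dual code follows; the rest level and sigmoid slope are invented for the illustration and are not the model's parameters. Each error feeds two sigmoidal units with opposite signs, so their imbalance encodes the signed error while both remain active at balance:

```python
import math

def dual_units(error, rest=0.5, slope=4.0):
    """Hypothetical dual code: two sigmoidal units share one error signal;
    one is excited by positive error, its 'dual' by negative error."""
    sig = lambda x: 1.0 / (1.0 + math.exp(-slope * x))
    return sig(rest + error), sig(rest - error)

up, down = dual_units(0.3)    # positive error: the unit exceeds its dual
zu, zd = dual_units(0.0)      # zero error: balanced, but both still active
```

Note that at zero error both units sit at the same nonzero rate — consistent with the observation later in the paper that activity in these units does not disappear when the error is near zero.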
Plasticity mechanisms within the sensorimotor loop should specify which muscles contract in order to reduce an error signaled by . We suggest that this plasticity could take place in the spinal cord and/or motor cortex. To show that our learning mechanisms work regardless of where the learning takes place, we created two main configurations of the model. In the first configuration, called the ‘spinal learning’ model, a ‘spinal’ network transforms the outputs into muscle stimulation. learns to transform sensory errors into appropriate motor commands using a differential Hebbian learning rule (Verduzco-Flores et al., 2022). In this configuration, the error input to each unit comes from one of the activities. A second configuration, called the ‘cortical learning’ model, has ‘all-to-all’ connections from to using the differential Hebbian rule, whereas the connections from to use appropriately patterned static connections. Both configurations are basically the same model; the difference is that one configuration has our learning rule on the inputs to , whereas the other has it on the inputs to (Figure 1).
While analyzing our model we reproduced several experimental phenomena (described below). Interestingly, these phenomena did not arise because of the learning rules. To make this explicit, we created a third configuration of our model, called the ‘static network’. This configuration does not change the weight of any synaptic connection during the simulation. The initial weights were hand-set to approximate the optimal solution everywhere (see Methods). We will show that all emergent phenomena in the paper are also present in the static network.
We explain the idea behind the differential Hebbian rule as applied in the connections from to . The target population contains interneurons, whose activity vector we denote as . The input to each of these units is an -dimensional vector . Each unit in has an output , where is a positive sigmoidal function. The inputs are assumed to be errors, and to reduce them we want (1)
Assuming a monotonic relation between the motor commands and the errors, relation 1 entails that errors will trigger an action to cancel them, with some caveats considered in Verduzco-Flores et al., 2022. Synaptic weights akin to Equation 1 can be obtained using a learning rule that extracts correlations between the derivatives of
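To make the flavor of such a rule concrete, the sketch below implements one simple differential-Hebbian variant that correlates the derivative of the presynaptic error with the derivative of the postsynaptic output; it is an illustration of the idea, not the exact rule of Verduzco-Flores et al., 2022, and all names and values are invented:

```python
import numpy as np

def diff_hebb_update(w, e_prev, e_now, c_prev, c_now, dt, lr=0.1):
    """Weights grow when an increase in a unit's output coincides with a
    decrease in an input error, coupling outputs to the errors they reduce."""
    de = (e_now - e_prev) / dt        # derivative of presynaptic errors
    dc = (c_now - c_prev) / dt        # derivative of postsynaptic outputs
    return w - lr * np.outer(dc, de) * dt

w = np.zeros((2, 2))
# Toy episode: output 0 rises while error 1 falls -> w[0, 1] strengthens
w = diff_hebb_update(w,
                     e_prev=np.array([0.5, 0.8]), e_now=np.array([0.5, 0.6]),
                     c_prev=np.array([0.1, 0.3]), c_now=np.array([0.4, 0.3]),
                     dt=0.1)
```

The minus sign makes the update anti-correlate output growth with error growth, which is the property relation 1 asks for.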
is organized to capture the most basic motifs of spinal cord connectivity using a network where balance between excitation and inhibition is crucial (Berg et al., 2007; Berg et al., 2019; Goulding et al., 2014). Each of the six motoneurons stimulates one muscle, and is stimulated by one excitatory (), and one inhibitory () interneuron. and stimulate one another, resembling the classic Wilson-Cowan model (Cowan et al., 2016). The trios composed of , and neurons form a group that controls the activation of one muscle, with and receiving convergent inputs from . This resembles the premotor network model in Petersen et al., 2014. () trios are connected to other trios following the agonist-antagonist motif that is common in the spinal cord (Pierrot-Deseilligny and Burke, 2005). This means that units project to the units of agonists, and to the units of antagonists (Figure 1, right panel). When the agonist/antagonist relation is not strongly defined, muscles can be ‘partial’ agonists/antagonists, or unrelated.
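The sign pattern of this kind of motif can be written down explicitly. The sketch below wires one antagonist pair of (excitatory interneuron, inhibitory interneuron, motoneuron) trios using the classic reciprocal-inhibition motif; it illustrates the connectivity style rather than the model's exact wiring and weights, which are specified in the Methods:

```python
import numpy as np

# Per muscle m: excitatory interneuron E_m, inhibitory interneuron I_m,
# and motoneuron MN_m. Signs only; magnitudes are hypothetical.
units = ["E0", "I0", "MN0", "E1", "I1", "MN1"]
idx = {u: i for i, u in enumerate(units)}
W = np.zeros((6, 6))                     # W[post, pre]: sign of connection

for m, ant in ((0, 1), (1, 0)):          # muscles 0 and 1 are antagonists
    E, I, MN = idx[f"E{m}"], idx[f"I{m}"], idx[f"MN{m}"]
    W[I, E] = +1                         # E excites I ...
    W[E, I] = -1                         # ... I inhibits E (Wilson-Cowan-like pair)
    W[MN, E] = +1                        # E drives its own motoneuron
    W[MN, I] = -1                        # I inhibits its own motoneuron
    W[idx[f"I{ant}"], E] = +1            # reciprocal inhibition: E_m recruits
                                         # the antagonist's inhibitory unit
```

Each motoneuron thus receives exactly one excitatory and one inhibitory input from its own trio, while cross-trio projections implement the agonist/antagonist opposition.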
Connections from to (the ‘short-loop reflex’) use the input correlation learning rule, analogous to the connections from to .
Direct connections from to alpha motoneurons are not necessary for the model to reach, but they were introduced in new versions because in higher primates these connections are present for distal joints (Lemon, 2008). Considering that bidirectional plasticity has been observed in corticomotoneural connections (Nishimura et al., 2013), we chose to endow them with the differential Hebbian rule of Verduzco-Flores et al., 2022.
Because timing is essential to support the conclusions of this paper, every connection has a transmission delay, and all firing rate neurons are modeled with ordinary differential equations.
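A minimal example of what this entails computationally: the sketch below integrates two firing-rate units with forward Euler, where each connection reads the presynaptic rate from a fixed transmission delay in the past via a buffer of past activity. The time constants, delay, gain, and wiring are invented for the illustration:

```python
import numpy as np

def simulate_delayed_pair(delay=0.05, dt=0.001, T=2.0, tau=0.02, gain=3.0):
    """Two firing-rate units in a delayed loop: unit 1 is excited by unit 0,
    and unit 0 is inhibited by unit 1, each reading `delay`-old activity."""
    steps = int(T / dt)
    dsteps = int(delay / dt)
    r = np.zeros((steps, 2))
    r[0] = [0.5, 0.0]
    for t in range(1, steps):
        past = r[max(t - 1 - dsteps, 0)]           # delayed presynaptic rates
        drive = np.array([1.0 - gain * past[1],    # unit 0: tonic input minus inhibition
                          gain * past[0]])         # unit 1: excitation from unit 0
        r[t] = r[t - 1] + dt * (-r[t - 1] + np.clip(drive, 0, None)) / tau
    return r

rates = simulate_delayed_pair()
```

With sufficient gain and delay a loop like this oscillates, which is why the gain of the control loop (discussed below for the full model) has to be tuned near, but not past, the onset of oscillations.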
All the results in this paper apply to the three configurations described above (spinal learning, cortical learning, and static network). To emphasize the robustness and potential of the learning mechanisms, in the Appendix we introduce two variations of the spinal learning model (in the
Since most results apply to all configurations, and since results could depend on the random initial weights, we report simulation results using three means and three standard deviations , with the understanding that these three value pairs correspond respectively to the spinal learning, cortical learning, and static network models. The statistics come from 20 independent simulations with different initial conditions.
A reference section in the Appendix (the
For each configuration, a single simulation was used to produce all the representative plots in different sections of the paper.
The model can reach by matching perceived and desired sensory activity
Reaches are performed by specifying an pattern equal to the activity when the hand is at the target. The acquisition of these patterns is not in the scope of this paper (but see Verduzco-Flores et al., 2022).
We created a set of random targets by sampling uniformly from the space of joint angles. Using this to set a different pattern in every 40 s, we allowed the arm to move freely during 16 target presentations. To encourage exploratory movements we used noise and two additional units described in the Methods.
All model configurations were capable of reaching. To decide if reaching was learned in a trial we took the average distance between the hand and the target (the
The system learned to reach in 99 out of 100 trials (20 for each configuration). One simulation with the spinal learning model had an average error of 14 cm during the last 4 reaches of training. To assess the speed of learning we recorded the average number of target presentations required before the error became less than 10 cm for the first time. This average number of failed reaches before the first success was: .
Figure 2A shows the error through 16 successive reaches (640 s of in silico time) in a typical case for the spinal learning model. A supplementary video (Appendix 1—Video 1) shows the arm’s movements during this simulation. Figures similar to Figure 2 can be seen for all configurations as figure supplements (Figure 2—figure supplement 1) (Figure 2—figure supplement 2).
Figure 2.
Representative training phase of a simulation for the spinal learning model.
(A) Distance between the target and the hand through 640 s of simulation, corresponding to 16 reaches to different targets. The horizontal dotted line corresponds to 10 cm. The times when changes are indicated with a vertical, dotted yellow line. Notice that the horizontal time axis is the same for all panels of this figure. The average error can be seen to decrease through the first two reaches. (B) Desired versus actual hand coordinates through the training phase. The straight lines denote the desired X (green) and Y (red) coordinates of the hand. The noisy orange and blue lines show the actual coordinates of the hand. (C) Activity of units 0 and 1 in and . This panel shows that the desired values in the units (straight dotted lines) start to become tracked by the perceived values. (D) Activity of units 1, 2, and their duals. Notice that even when the error is close to zero the activity in the units does not disappear. (E) Activity of the trio for muscle 0. The intrinsic noise in the units causes ongoing activity. Moreover, the inhibitory activity (orange line) dominates the excitatory activity (blue line).
Figure 2—figure supplement 1.
Representative training phase of the simulation for the cortical learning configuration.
Panels are as in Figure 2.
Figure 2—figure supplement 2.
Representative training phase of the simulation for the static network.
Panels are as in Figure 2.
Figure 2—figure supplement 3.
Representative training phase of the simulation for the synergistic network.
Panels are as in Figure 2.
Figure 2—figure supplement 4.
Representative training phase of the simulation for the mixed errors network.
Panels are as in Figure 2.
In Figure 2A, the error increased each time a new target was presented (yellow vertical lines), but as learning continued it was consistently reduced below 10 cm.
Panel B also shows the effect of learning, as the hand’s Cartesian coordinates eventually track the target coordinates whenever they change. This is also reflected as the activity in becoming similar to the activity in (panel C).
Panels D and E of Figure 2 show the activity of a few units in population and population during the 640 s of this training phase. During the first few reaches, shows a large imbalance between the activity of units and their duals, reflecting larger errors. Eventually these activities balance out, leading to a more homogeneous activity that may increase when a new target appears. M1 activation patterns that produce no movement are said to lie in the output-null space.
In panel E, the noise in the units becomes evident. It can also be seen that inhibition dominates excitation (due to to connections), which promotes stability in the circuit.
We tested whether any of the novel elements in the model were superfluous. To this end, we removed each of the elements individually and checked if the model could still learn to reach. In conclusion, removing individual elements generally deteriorated performance, but the factor that proved essential for all configurations with plasticity was the differential Hebbian learning in the connections from to or from to . For details, see the Appendix section titled
Center-out reaching 1: The reach trajectories present traits of cerebellar ataxia
In order to compare our model with experimental data, after the training phase we began a standard center-out reaching task. Switching to this task merely consisted of presenting the targets in a different way, but for the sake of smoother trajectories we removed the noise from the units in or .
Figure 3A shows the eight peripheral targets around a hand rest position. Before reaching a peripheral target, a reach to the center target was performed, so the whole experiment was a single continuous simulation controlled by the pattern.
Figure 3.
Center-out reaching.
(A) The arm at its resting position, with hand coordinates (0.3, 0.3) meters, where a center target is located. Eight peripheral targets (cyan dots) were located on a circle around the center target, with a 10 cm radius. The muscle lines, connecting the muscle insertion points, are shown in red. The shoulder is at the origin, whereas the elbow has coordinates (0.3, 0). Shoulder insertion points remain fixed. (B-F) Hand trajectories for all reaches in the three configurations. The trajectory’s color indicates the target. Dotted lines show individual reaches, whereas thick lines indicate the average of the 6 reaches.
Figure 3—figure supplement 1.
Center-out reaching before the control loop gain was adjusted.
Panels are as in Figure 3, with the two extra configurations presented in Appendix 1. The numbers in panels B-F represent a particular configuration: Config. 1=spinal learning, Config. 2=cortical learning, Config. 3=static network, Config. 4=synergistic network, Config. 5=mixed errors network.
Figure 3—figure supplement 2.
Training phase of the simulation for the static network before gain was reduced.
Panels are as in Figure 2.
Figure 3—figure supplement 3.
Training phase of the simulation for the synergistic network before gain was reduced.
Panels are as in Figure 2.
Peripheral targets were selected at random, each appearing six times. This produced 48 reaches (without counting reaches to the center), each one lasting 5 s. Panels B through D of Figure 3 show the trajectories followed by the hand in the three configurations. During these 48 reaches the average distance between the hand and the target was centimeters.
Currently our system has neither cerebellum nor visual information. Lacking a ‘healthy’ model to make quantitative comparisons, we analyzed the trajectories and compared them to data from cerebellar patients.
For the sake of stability and simplicity, our system is configured to perform slow movements. Fast and slow reaches are different in cerebellar patients (Bastian et al., 1996). Slow reaches undershoot the target, follow longer hand paths, and show movement decomposition (joints appear to move one at a time). In Figure 3 the trajectories begin close to the 135 degree axis, indicating a slower response at the elbow joint. With the parameters used, the spinal learning and cortical learning models tend to undershoot the target, whereas in the static network the hand can oscillate around the target.
The traits of the trajectories can be affected by many hyperparameters in the model, but the dominant factor seems to be the gain in the control loop. Our model involves delays, activation latencies, momentum, and interaction torques. Unsurprisingly, increasing the gain leads to oscillations along with faster reaching. On the other hand, low gain leads to slow, stable reaching that often undershoots the target. Since we do not have a cerebellum to overcome this trade-off, the gain was the only hyperparameter that was manually adjusted for all configurations (see Methods). In particular, we adjusted the slope of the and units so the system was stable, but close to the onset of oscillations. Gain was allowed to be a bit larger in the static network so oscillations could be observed. The figure supplements for Figure 3 show more examples of configurations with higher gain (See
The shape of the trajectory also depends on the target. Different reach directions cause different interaction forces, and encounter different levels of viscoelastic resistance from the muscles.
Figure 4 reveals that the approach to the target is initially fast, but gradually slows down. Healthy subjects usually present a bell-shaped velocity profile, with some symmetry between acceleration and deceleration. This symmetry is lost with cerebellar ataxia (Becker et al., 1991; Gilman et al., 1976).
Figure 4.
Distance to target and reach velocity through time for the three configurations.
Thick lines show the average over 48 reaches (8 targets, 6 repetitions). Filled stripes show standard deviation. For the spinal and cortical learning configurations (left and center plots) the hand initially moves quickly to the target, but the direction is biased, so it needs to gradually correct the error from this initial fast approach; most of the variance in error and velocity appears when these corrections cause small-amplitude oscillations. In the case of the static network (right plots) oscillations are ongoing, leading to a large variance in velocity.
We are not aware of center-out reaching studies for cerebellar patients in the dark, but (Day et al., 1998) does examine reaching in these conditions. Summarizing its findings:
Movements were slow.
The endpoints had small variation, but they had constant errors.
Trajectories were longer and more circuitous, with most changes in direction during the last quarter.
Trajectories to the same target showed variations.
From Figures 3 and 4 we can observe constant endpoint errors when the gain is low, in the spinal and cortical learning models. Circuitous trajectories with a pronounced turn around the end of the third quarter are also observed. Individual trajectories can present variations. A higher gain, as in the static network on the right plots, can increase these variations, as illustrated in the figure supplements for Appendix 1.
Center-out reaching 2: Directional tuning and preferred directions
To find whether directional tuning could arise during learning, we analyzed the population activity for the 48 radial reaches described in the previous subsection.
For each of the 12 units in , Figure 5A shows the mean firing rate of the unit when reaching each of the 8 targets. The red arrows show the Preferred Direction (PD) vectors that arise from these distributions of firing rates. For the sake of exposition, Figure 5 shows data for the simpler case of one-to-one connectivity between and in the spinal learning model, but these results generalize to the case when each unit receives a linear combination of the activities (the ‘mixed errors’ variation presented in the
Figure 5.
Directional tuning of the units in for a simulation with the spinal learning model.
(A) Average firing rate per target, and preferred direction (see Methods) for each of the 12 units in . Each polar plot corresponds to a single unit, and each of the 8 purple wedges corresponds to one of the 8 targets. The length of a wedge indicates the mean firing rate when the hand was reaching the corresponding target. The red arrow indicates the direction and relative magnitude of the PD vector. The black arrow shows the predicted PD vector, in this case just the corresponding arrows from panel B. (B) For each muscle and target, a wedge shows the muscle’s length at rest position minus the length at the target, divided by the rest position length. The red arrow comes from the sum of the wedges taken as vectors, and represents the muscle’s direction of maximum contraction. Plots corresponding to antagonist muscles are connected by red lines. (C) Average activity of the 6 units indicating muscle tension. The black arrows come from the sum of wedges taken as vectors, showing the relation between muscle tension and preferred direction.
We found that units were significantly tuned to reach direction (, bootstrap test), with PD vectors of various lengths. The direction of the PD vectors is not mysterious. Each unit controls the length error of one muscle. Figure 5B shows that the required contraction length depends on both the target and the muscle. The PD vectors of units 0–5 point to the targets that require the most contraction of their muscle. Units 6–11 are the duals of 0–5, and their PD is in the opposite direction. Figure 5C shows that the PD may also be inferred from the muscle activity, reflected as average tension.
In the case when each unit receives a linear combination of errors, its PD can be predicted using a linear combination of the ‘directions of maximum contraction’ shown in Figure 5B, using the same weights as the inputs. When accounting for the length of the PD vectors, this can predict the PD angle with a coefficient of determination .
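Both the PD computation and its prediction are simple vector sums. The sketch below computes a PD vector from per-target mean rates, as in the polar plots of Figure 5, using an invented cosine-tuned unit as input (the target geometry matches the 8-target center-out layout, but the rates are made up):

```python
import numpy as np

def preferred_direction(mean_rates, target_angles):
    """PD vector: sum of unit vectors toward each target, each weighted by
    the unit's mean firing rate when reaching that target."""
    vx = np.sum(mean_rates * np.cos(target_angles))
    vy = np.sum(mean_rates * np.sin(target_angles))
    return np.array([vx, vy])

angles = np.arange(8) * np.pi / 4                  # 8 center-out targets
rates = 1.0 + 0.5 * np.cos(angles - np.pi / 2)     # unit tuned to 90 degrees
pd = preferred_direction(rates, angles)
pd_angle = np.degrees(np.arctan2(pd[1], pd[0]))
```

Predicting a unit's PD from a linear combination of the muscles' directions of maximum contraction amounts to the same weighted vector sum, with the input weights replacing the firing rates.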
As mentioned in the Introduction, the PDs of motor cortex neurons tend to align in particular directions (Scott et al., 2001). This is almost trivially true for this model, since the PD vectors are mainly produced by linear combinations of the vectors in Figure 5B.
Figure 6 shows the PD for all the units in a representative simulation for each of the configurations. In every simulation, the PD distribution showed significant bimodality (). The main axis of the PD distribution (see Methods) was degrees.
Figure 6.
Preferred direction vectors for the 12 units.
In all three plots the arrows denote the direction and magnitude of the preferred direction (PD) for an individual unit. The gray dotted lines shows the main axis of the distribution. The red dotted lines are a 45 degree rotation of the gray line, for comparison with Scott et al., 2001. It can be seen that all configurations display a strong bimodality, especially when considering the units with a larger PD vector. The axis where the PD vectors tend to aggregate is in roughly the same position for the three configurations.
To compare with Scott et al., 2001 we rotate this line by 45 degrees so the targets are in the same position relative to the shoulder (e.g. Lillicrap and Scott, 2013, Figure 1; Kurtzer et al., 2006, Figure 1). This places the average main axes in a range between 99 and 104 degrees, comparable to the 117 degrees in Scott et al., 2001.
The study in Lillicrap and Scott, 2013 suggested that a rudimentary spinal cord feedback system should be taken into account in order to understand this bimodal PD distribution.
The PD vectors are not stationary, but experience random fluctuations that become more pronounced in new environments (Rokni et al., 2007; Padoa-Schioppa et al., 2004). The brain is constantly remodeling itself, without losing the ability to perform its critical operations (Chambers and Rumpel, 2017). Since our model learns continuously, we tested the change in the PDs by running 40 additional center-out reaches (no intrinsic noise) after the previous experiment, once for each configuration.
To encourage changes we used 10 different targets instead of 8. After a single trial for each configuration the mean change in angle for the 12 PD vectors was small. Larger changes (around 7 degrees) could be observed in the ‘mixed errors’ variation of the model, presented in the Appendix.
The average distance between hand and target during the 40 reaches remained low, showing that the hand was still moving towards the targets, although with different errors due to their new locations.
Center-out reaching 3: Rotational dynamics
Using a dynamical systems perspective, Shenoy et al., 2013 consider that the muscle activity m(t) (a vector function of time) arises from the cortical activity vector r(t) after it is transformed by the downstream circuitry:
m(t) = G[r(t)]    (2)
The mapping G may consist of sophisticated controllers, but for the sake of simplicity it is considered static, omitting spinal cord plasticity. The cortical activity arises from a dynamical system:
ṙ(t) = h(r(t), u(t))    (3)
where u(t) represents inputs to motor cortex from other areas, and h is a function that describes how the state of the system evolves.
A difficulty associated with Equation 3 is explaining how r(t) generates a desired muscle pattern when h represents the dynamics of a recurrent neural network. One possibility is that M1 has intrinsic oscillators of various frequencies, which combine their outputs to shape the desired pattern. This prompted the search for oscillatory activity in M1 while macaques performed center-out reaching motions. A brief oscillation (on the order of 200 ms, or 5 Hz) was indeed found in the population activity (Churchland et al., 2012), and the model in Sussillo et al., 2015 was able to reproduce this result, although this was done in the open-loop version of Equations 2 and 3, where u(t) contains no afferent feedback (this is further commented on in the Supplemental Discussion).
Recently it was shown that oscillations in motor cortex can arise when considering the full sensorimotor loop, without the need for recurrent connections in motor cortex (Kalidindi et al., 2021). A natural question is whether our model can also reproduce the oscillations in Churchland et al., 2012 without requiring M1 oscillators or recurrent connections.
The analysis in Churchland et al., 2012 is centered around measuring the amount of rotation in the M1 population activity. The first step is to project the M1 activity vectors onto their first six principal components. These six components are then rotated so the evolution of the activity maximally resembles a pure rotation. These rotated components are called the ‘jPCA vectors’. The amount of variance in the M1 activity explained by the first two jPCA vectors is a measure of rotation. The Methods section provides more details of this procedure.
Considering that we have a low-dimensional, non-spiking, slow-reaching model, we can only expect to qualitatively replicate the essential result in Churchland et al., 2012, namely that most of the variance is contained in the first jPCA plane.
We replicated the jPCA analysis, with adjustments to account for the smaller number of neurons, the slower dynamics, and the fact that there is no delay period before the reach (see Methods). The result can be observed in Figure 7, where 8 trajectories are shown in each plot. Each trajectory is the average activity of the 12 units when reaching to one of the 8 targets, projected onto the jPCA plane. The signature of rotational structure in these plots is that most trajectories circulate in a counterclockwise direction. Quantitatively, the first jPCA plane (out of six) captures the largest share of the variance.
Figure 7.
Rotational dynamics in the M population in a representative simulation for all configurations.
Each plot shows the first two jPCA components during 0.25 s, for each of the 8 conditions/targets. Traces are colored according to the magnitude of their initial component, from smallest (green) to largest (red).
With this analysis we show that our model does not require intrinsic oscillations in motor cortex to produce rotational dynamics, in agreement with Kalidindi et al., 2021 and DeWolf et al., 2016.
The effect of changing the mass
Physical properties of the arm can change, not only as the arm grows, but also when tools or new environments come into play. As a quick test of whether the properties described in this paper are robust to moderate changes, we changed the mass of the arm and forearm from 1 to 0.8 kg and ran one simulation for each of the five configurations.
With a lighter arm the average errors during center-out reaching changed only moderately. The hand trajectories with a reduced mass can be seen in the top 3 plots of Figure 8. We observe that the spinal learning model slightly reduced its mean error, whereas the cortical learning model increased it. This can be understood by noticing that a reduction in mass is akin to an increase in gain. The spinal learning model with its original gain was below the threshold of oscillations at the endpoint, and a slight mass decrease did not change this. The cortical learning model with the original gain was already oscillating slightly, and the effective increase in gain amplified the oscillations.
Figure 8.
Hand trajectories with low mass (0.8 kg, top 3 plots) and high mass (1.2 kg, bottom 3 plots) for the 3 configurations.
Plots are as in Figure 3. The spinal learning model and the static network show qualitatively similar trajectories compared to those in Figure 3. In contrast, the cortical learning model began to display considerable endpoint oscillations for several targets after its mass was reduced. These oscillations persist after the mass has been increased.
In the same simulation, after the center-out reaching was completed, we once more modified the mass of the arm and forearm, from 0.8 to 1.2 kg, and then began the center-out reaching again. This time the center-out reaching errors remained in a similar range. The hand trajectories for this high-mass condition are in the bottom 3 plots of Figure 8. The spinal learning and cortical learning models retained their respectively improved and degraded performance, whereas the static network performed roughly the same for all mass conditions. A tentative explanation is that with reduced mass the synaptic learning rules tried to compensate for faster movements with weights that effectively increased the gain in the loop. After the mass was increased these weights did not immediately revert, leading to similar trajectories after the increase in mass.
The results of the paper still held after our mass manipulations. For all configurations, PD vectors could be predicted with a coefficient of determination between 0.74 and 0.92; all units were significantly tuned to direction; the main axis of the PD distribution ranged between 56 and 61 degrees; and the first jPCA plane captured between 33% and 58% of the variance.
Spinal stimulation produces convergent direction fields
Due to the viscoelastic properties of the muscles, the mechanical system without active muscle contraction has a fixed point of lowest potential energy at the arm’s rest position. Limited amounts of muscle contraction shift the position of that fixed point. This led us to ask whether this could produce convergent force fields, which, as discussed before, are candidate motor primitives and have been found experimentally.
To simulate local stimulation of an isolated spinal cord we removed all neuronal populations except the spinal ones, and applied inputs to individual pairs of units projecting to the same motoneuron. Doing this for different starting positions of the hand, and recording the initial direction of motion, produces a direction field.
The first two panels of Figure 9 show the result of stimulating individual E-I pairs, which indeed produces direction fields with different fixed points.
Figure 9.
Two sample direction fields and their linear addition.
(A) Direction Field (DF) from stimulation of the interneurons for muscle 0 (biarticular biceps). The approximate location of the fixed point is shown with a blue dot. (B) DF from stimulation of muscle 3 (biarticular triceps) interneurons. A red dot shows the fixed point. (C) Panels A and B overlapped. (D) In green, the DF from stimulating the interneurons for muscles 0 and 3 together. In purple, the sum of the DFs from panels A and B. Dots show the fixed points. The average angle between the green and purple vectors is 4 degrees.
We found that these direction fields add approximately linearly (Figure 9D). More precisely, let F_i∪j be the direction field from stimulating spinal locations i and j simultaneously, and let θ_i∪j(x, y) be the angle of F_i∪j at hand coordinates (x, y). Using similar definitions for the vector sum F_i+j = F_i + F_j, we say the direction fields add linearly if θ_i∪j(x, y) ≈ θ_i+j(x, y).
We define the mean angle difference between F_i∪j and F_i+j as
Δ_ij = (1/N) Σ_(x,y) |θ_i∪j(x, y) − θ_i+j(x, y)|    (4)
where N is the number of sample points. We found that this difference was small when averaged over the 15 possible pairs of stimulation locations.
Randomly choosing two possibly different pairs (i, j) and (k, l) for the stimulation locations leads to a mean angle difference of 37.6 degrees between the fields F_i∪j and F_k+l. A bootstrap test showed that these angles are significantly larger than in the previous case, where the compared fields come from the same pair of locations.
The resting field is defined as the direction field when no units are stimulated. Subtracting the resting field from the stimulated fields does not alter these results.
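The mean-angle-difference comparison (Equation 4) can be sketched numerically. This is a minimal sketch, not the paper's code: the function name and the representation of a direction field as an (N, 2) array of vectors at N sampled hand positions are our assumptions.

```python
import numpy as np

def mean_angle_diff(F_a, F_b):
    """Mean absolute angular difference (Eq. 4) between two direction
    fields, each an (N, 2) array of 2D vectors at N hand positions."""
    a = np.arctan2(F_a[:, 1], F_a[:, 0]) - np.arctan2(F_b[:, 1], F_b[:, 0])
    a = (a + np.pi) % (2 * np.pi) - np.pi   # wrap to the smallest rotation
    return np.degrees(np.mean(np.abs(a)))

# toy fields: if co-stimulation equals the vector sum of two fields,
# the mean angle difference is exactly zero
rng = np.random.default_rng(0)
Fi, Fj = rng.normal(size=(50, 2)), rng.normal(size=(50, 2))
```

Shuffling which fields are compared (as in the bootstrap test of the Methods) would then amount to calling `mean_angle_diff` on mismatched pairs.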
Recent macaque forelimb experiments (Yaron et al., 2020) show that the magnitude of the vectors in the co-stimulation fields is larger than expected from linear addition (supralinear summation). We found no evidence for this effect, suggesting that it depends on mechanisms beyond those present in our model.
Discussion
Summary of findings and predictions
We have presented a model of the long-loop reflex with one main assumption: negative feedback configured by two differential Hebbian learning rules. One novel rule sets the loop’s input-output structure, and the other (input correlation) promotes stability. We showed that this model can produce arm reaches by trying to perceive a given afferent pattern.
Our study made two main points:
Many experimental phenomena emerge from a feedback controller with minimally-complete musculoskeletal and neural models (emphasis is placed on the balance between excitation and inhibition).
Even if the feedback controller has multiple inputs and outputs, its input-output structure can be flexibly configured by a differential Hebbian learning rule, as long as errors are monotonic.
The first main point above was made using a feedback control network with no learning (called the static network in the Results). We showed that in this static network: (1) reaching trajectories are similar to models of cerebellar ataxia, (2) motor cortex units are tuned to preferred directions, (3) those preferred directions follow a bimodal distribution, (4) motor cortex units present rotational dynamics, (5) reaching is still possible when mass is altered, and (6) spinal stimulation produces convergent direction fields.
The second main point was made using two separate models, both using the same differential Hebbian learning rules, but applied at different locations. The spinal learning model presents the hypothesis that the spinal cord learns to adaptively configure the input-output structure of the feedback controller. The cortical learning model posits that configuring this structure could instead be a function of motor cortex; this would not disrupt our central claims. These two models should not be considered as incompatible hypotheses. Different elements performing overlapping functions are common in biological systems (Edelman and Gally, 2001).
Two variations of the spinal learning model in the Appendix show that this learning mechanism is quite flexible, opening the door to certain types of synergies, and to more complex errors (that still maintain the constraint of monotonicity).
We list some properties of the model, and possible implications:
Basic arm reaching happens through negative feedback, trying to perceive a target value set in cortex. Learning the input-output structure of the feedback controller may require spinal cord plasticity.
Cerebellar patients should not be able to adapt to tasks that require fast reactions, as negative feedback alone cannot compensate for delays in the system (Sanguineti et al., 2003). On the other hand, they should be able to learn tasks that require remapping afferent inputs to movements. One example is Richter et al., 2004, where cerebellar patients learned to move in a novel dynamic environment, but their movements were less precise than those of controls.
The shape of reaches is dominated by mechanical and viscoelastic properties of the arm and muscles.
Unfamiliar viscous forces as in Richter et al., 2004 should predictably alter the trajectory (Figure 3) for cerebellar patients, who should not be able to adapt unless they move slowly and are explicitly compensating.
Preferred Directions (PDs) in motor cortex happen because muscles need to contract more when reaching in certain directions.
The PD distribution should align with the directions in which the muscles need to contract to reduce the error. These directions depend on the error being encoded. If the error is not related to reaching (e.g. related to haptic feedback), a different PD distribution may arise after overtraining.
Drift in the PD vectors comes from the ongoing adaptation, and it should not disrupt performance.
The oscillations intrinsic to delayed feedback control after the onset of a target are sufficient to explain the quasi-oscillations observed in motor cortex (Churchland et al., 2012; Kalidindi et al., 2021).
Convergent force fields happen naturally in musculoskeletal systems when there is balance in the stimulation between agonists and antagonists. Linear addition of force fields is a result of forces/torques adding linearly.
Since our relatively simple model reproduces these phenomena, we believe it constitutes a good null hypothesis for them. But beyond explaining experimental observations, this model makes inroads into the hard problem of how the central nervous system (CNS) can generate effective control signals, recently dubbed the ‘supraspinal pattern formation’ problem (Bizzi and Ajemian, 2020). From our perspective, the CNS does not need to generate precise activation patterns for muscles and synergies; it needs to figure out which perceptions need to change. It is subcortical structures that learn the movement details. The key to make such a model work is the differential Hebbian learning framework in Verduzco-Flores et al., 2022, which handles the final credit assignment problem.
We chose not to include a model of the cerebellum at this stage. Our model reflects the brain structure of an infant who can make clumsy reaching movements. At birth the cerebellum is incomplete and presumably not functional. It requires structured input from the spinal cord and cortex to establish correct synaptic connections during postnatal development, and will contribute to smooth reaching movements at a later age.
Encompassing function, learning, and experimental phenomena in a single simple model is a promising start towards a more integrated computational neuroscience. We consider that such models have the potential to steer complex large-scale models so they can also achieve learning and functionality from scratch.
Methods
Simulations were run in the Draculab simulator (Verduzco-Flores and De Schutter, 2019). All the parameters from the equations in this paper are presented in the Appendix. Parameters not shown can be obtained from Python dictionaries in the source code. This code can be downloaded from: https://gitlab.com/sergio.verduzco/public_materials/-/tree/master/adaptive_plasticity.
Unit equations
With the exception of a few populations described below, the activity u_i of each unit evolved according to firing-rate dynamics with a sigmoidal activation function:
τ u̇_i(t) = σ(I_i(t)) − u_i(t)    (5)
σ(x) = 1 / (1 + exp(−β(x − η)))    (6)
where τ is a time constant, β is the slope of the sigmoidal function, η is its threshold, and I_i is the sum of delayed inputs times their synaptic weights.
Units in the interneuron populations (in the spinal learning model) or in motor cortex (in the cortical learning model) had an additional noise term, which turned Equation 5 into the Langevin equation:
τ du_i = [σ(I_i(t)) − u_i(t)] dt + A dW(t)    (7)
where W is a Wiener process with unit variance, and A is a parameter that controls the noise amplitude. This equation was solved using the Euler-Maruyama method. All other unit equations were integrated using the forward Euler method. The equations for the plant and the muscles were integrated with SciPy’s (https://scipy.org/) explicit Runge-Kutta 5(4) method.
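As a concrete illustration, a noisy unit of this kind (Equation 7) can be integrated with the Euler-Maruyama scheme roughly as follows. This is a sketch: the sigmoid slope, threshold, time constant, and noise amplitude below are placeholders, not the model's fitted parameters.

```python
import numpy as np

def sigmoid(x, slope=4.0, thresh=0.5):
    return 1.0 / (1.0 + np.exp(-slope * (x - thresh)))

def euler_maruyama(I, tau=0.02, A=0.1, dt=1e-3, T=1.0, u0=0.0, seed=0):
    """Integrate tau*du = (sigmoid(I) - u)*dt + A*dW with Euler-Maruyama,
    for a constant input I. Returns the sampled trajectory of u."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    u = np.empty(n + 1)
    u[0] = u0
    for k in range(n):
        drift = (sigmoid(I) - u[k]) / tau
        # deterministic drift step plus a sqrt(dt)-scaled Wiener increment
        u[k + 1] = u[k] + drift * dt + (A / tau) * np.sqrt(dt) * rng.normal()
    return u
```

With A = 0 this reduces to forward Euler and u settles at sigmoid(I); with A > 0 the trace fluctuates around that value, which is the source of the low-amplitude exploratory oscillations described later.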
Units in one of the populations use a rectified-logarithm activation function, leading to the following dynamics for their activity:
τ u̇_i(t) = log(1 + [I_i(t) − η]₊) − u_i(t)    (8)
where τ is a time constant, I_i is the scaled sum of inputs, η is a threshold, and [·]₊ is the "positive part" function.
Learning rules
The learning rule for the connections onto the spinal units in the spinal learning model was first described in Verduzco-Flores et al., 2022:
ẇ_ij(t) = u̇_i(t) [ë_j(t − Δt) − ⟨ë(t − Δt)⟩]    (9)
In this equation, e_j(t) represents the activity of the j-th presynaptic (error) unit at time t, and ë_j is its second derivative. Angle brackets denote averages, so that ⟨ë(t)⟩ = (1/N) Σ_k ë_k(t), where N is the number of presynaptic units. u̇_i is the derivative of the activity of the postsynaptic unit, and Δt is a time delay ensuring that the rule captures the proper temporal causality. In the Supplementary Discussion of the Appendix we elaborate on how such a learning rule could be present in the spinal cord.
The learning rule in Equation 9 was also fitted with soft weight-bounding to prevent connections from changing sign, and multiplicative normalization was used to control the magnitude of the weights by ensuring two requirements: (1) all weights from projections of the same unit should add to a constant, and (2) all weights ending at the same unit should add to a constant. With this, the learning rule adopted the form:
ẇ_ij = α w_ij e_ij [ω / Σ_k w_kj + ω / Σ_k w_ik]    (10)
In this equation α is a constant learning rate, e_ij is the right-hand side expression of Equation 9, and ω is a scalar parameter; its value is divided by the sum of outgoing weights from the j-th presynaptic unit in the first term, and by the sum of incoming weights on the i-th postsynaptic unit in the second.
The synapses in the connections carrying the Ia and Ib afferent signals used the input correlation rule (Porr and Wörgötter, 2006):
ẇ = η w u ė    (11)
where u is the scaled sum of inputs from the error population, η is the learning rate, e is the scaled sum of inputs from the afferents, and ė is its derivative. Unlike the original input correlation rule, this rule uses soft weight bounding (the w factor) to avoid weights changing sign. Moreover, the sum of the weights was kept close to a constant value. In practice this meant dividing each individual weight by the sum of weights of the corresponding connections, and multiplying by that constant at each update. In addition, weight clipping was used to keep individual weights below a maximum value.
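A discrete-time sketch of this modified input-correlation update (soft weight bounding, sum normalization, and clipping) is shown below. The function name, the explicit-Euler step form, and the parameter values are our assumptions, not the simulator's implementation.

```python
import numpy as np

def ico_update(w, u, e_dot, eta=0.05, w_sum=1.0, w_max=0.5):
    """One discrete step of the modified input-correlation rule:
    dw ~ eta * w * u * e_dot (the w factor gives soft weight bounding),
    then rescale so the weights sum to w_sum, then clip at w_max."""
    w = w + eta * w * u * e_dot          # soft-bounded ICO step
    w = w * (w_sum / w.sum())            # keep the total weight near w_sum
    return np.clip(w, 0.0, w_max)        # cap individual weights

w = np.array([0.4, 0.3, 0.3])
w = ico_update(w, u=np.array([1.0, 0.2, 0.1]),
               e_dot=np.array([0.5, -0.1, 0.0]))
```

Because the update is proportional to w itself, weights that start positive stay positive, which is the point of the soft bounding.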
The learning rule in the cortical learning model was the same, but the presynaptic units were in , and the postsynaptic units in .
Exploratory mechanism
Without any additional mechanisms the model risked getting stuck in a fixed arm position before it could learn. We included two mechanisms that permit exploration. We describe them as applied to the spinal learning model and its two variations; the description also applies to the cortical learning model, with the cortical units (instead of the spinal units) receiving the noise and the extra connections.
The first exploratory mechanism consists of intrinsic noise in the interneurons, which causes low-amplitude oscillations in the arm. We have observed that intrinsic oscillations in these units are also effective in allowing learning (data not shown), but the option of intrinsic noise permits the use of simple sigmoidal units, and contributes to the discussion regarding the role of noise in neural computation.
The second mechanism for exploration is an additional unit that acts as a leaky integrator of the total activity in the error population, reflecting the total error. If this leaky integral crossed a threshold, the unit sent a signal to the interneurons, causing adaptation. The adaptation consisted of an inhibitory current that grew depending on the accumulated previous activity.
To model this, the interneurons received an extra adaptation input. When the signal from the integrating unit was larger than 0.8, the adaptation current was set to the square of a low-pass filtered version of the unit’s activity:
(12)
If the signal was smaller than 0.8, or the adaptation current became larger than 0.2, the current decayed towards zero:
(13)
With this mechanism, if the arm got stuck then error would accumulate, leading to adaptation in the spinal interneurons. This would cause the most active interneurons to receive the most inhibition, shifting the ‘dominant’ activities and producing larger-amplitude exploratory oscillations.
When a new target is presented, the integrating unit must reset its own activity to a low value. Given our requirement to fully implement the controller using neural elements, we needed a way to detect changes in the target. A dedicated unit can detect these changes using synapses that react to the derivative of the activity in the target-encoding units, and this unit was connected to the integrator in order to reset its activity.
More precisely, when the inputs from the change-detecting unit were larger than 0.1, the activity of the integrator decayed:
(14)
Otherwise it had these dynamics:
(15)
(16)
As before, σ is a sigmoidal function, and I is the scaled sum of the remaining inputs. When I is smaller than a threshold the activity actually decreases, as the error is deemed small enough. When I is larger the activity increases, but the rate of increase is modulated by a low-pass filtered version of the activity and a constant parameter.
The change-detecting unit was a standard sigmoidal unit receiving inputs from the target-encoding population, with each synaptic weight obeying a differential equation that makes the unit respond to changes in its inputs:
(17)
Plant, muscles, afferents
The planar arm was modeled as a compound double pendulum, where both the arm and forearm were cylinders with 1 kg of mass. No gravity was present, and a moderate amount of viscous friction (coefficient 3) was added at each joint. The derivation and validation of the double pendulum’s equations can be consulted in a Jupyter notebook included with Draculab’s source code (in the tests folder).
The muscles used a standard Hill-type model, as described in Shadmehr and Wise, 2005 (p. 99). The muscle’s tension T obeys:
Ṫ = (K_SE / b) [ K_PE (x − x*) + b ẋ − (1 + K_PE / K_SE) T + g u ]    (18)
where u is the input, g an input gain, K_PE the parallel elasticity constant, K_SE the series elasticity constant, b the damping constant for the parallel element, x the length of the muscle, and x* its total resting length, the sum of the resting length of the series element and the resting length of the parallel element. All resting lengths were calculated from the steady state when the hand was located at coordinates (0.3, 0.3).
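The tension dynamics of Equation 18 can be integrated with a simple explicit scheme. The equation form below is our reading of the Shadmehr and Wise formulation, and every parameter value is a placeholder rather than a fitted model value.

```python
import numpy as np

def hill_tension(x_fn, u_fn, K_SE=100.0, K_PE=50.0, b=5.0, g=1.0,
                 x_rest=0.0, dt=1e-4, T=1.0):
    """Integrate dT/dt = (K_SE/b) * [K_PE*(x - x_rest) + b*dx/dt
    - (1 + K_PE/K_SE)*T + g*u] with forward Euler. x_fn and u_fn give
    muscle length and input as functions of time (defined for t >= -dt)."""
    n = int(T / dt)
    Ten = 0.0
    for k in range(n):
        t = k * dt
        x = x_fn(t)
        dx = (x_fn(t + dt) - x_fn(t - dt)) / (2 * dt)  # numeric length velocity
        dT = (K_SE / b) * (K_PE * (x - x_rest) + b * dx
                           - (1 + K_PE / K_SE) * Ten + g * u_fn(t))
        Ten += dT * dt
    return Ten
```

For a constant stretch and no input, the tension settles at the steady state K_PE·Δx / (1 + K_PE/K_SE), which is a quick sanity check on the sign conventions.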
We created a model of the Ia and II afferents using simple structural elements. For each muscle the model includes one dynamic nuclear bag fiber and one static bag fiber. Both fibers use the same tension equation as the muscle, but with different parameters. For the static bag fiber:
(19)
The dynamic bag fiber uses the same equation, with the superscript s replaced by d. No inputs were applied to the static or dynamic bag fibers, so the input term was removed from these equations. The resting lengths of the static and dynamic bag fibers were those of their corresponding muscles times constant factors.
The Ia afferent output is proportional to a linear combination of the lengths of the serial elements in the dynamic and static bag fibers. The II output has two components: one proportional to the length of the serial element, and one approximately proportional to the length of the parallel element, both in the static bag fiber. In practice this was implemented through the following equations:
(20)
(21)
Here, gain factors scale the Ia and II outputs, and two constants determine the fraction of each output that comes from the serial element.
The model of the Golgi tendon organ producing the Ib outputs was taken from Lin and Crago, 2002. First, a rectified tension was obtained:
(22)
where a gain factor scales the muscle tension; the Ib output was then obtained from this rectified tension:
(23)
Static connections
In all cases, the connections into the loop used one-to-one connectivity with the units driven by the II afferents, whereas the connections onward used all-to-all projections from the units driven by the Ia and Ib afferents. Projections to the motoneuron layer used one-to-one excitatory connections to the first 6 units, and inhibitory projections to the next six units; the dual projections used the opposite signs.
Connections between these populations were one-to-one, so the i-th unit in one only sent a projection to the i-th unit in the other. A variation of this connectivity is presented in the Appendix.
We now explain how we adjusted the synaptic weights of the static network. To understand the projections to the interneurons and the alpha motoneurons it is useful to remember that each interneuron trio is associated with one muscle, and each error unit also controls the error of a single muscle. This error indicates that the muscle is longer than desired. Thus, the error unit associated with muscle m sent excitatory projections to the interneurons associated with muscle m, and to those of the antagonists of m. Additionally, weaker projections were sent to the interneurons of muscle m’s agonists. Notice that only excitatory connections were used.
The reverse logic was used to set the afferent connections. If muscle m is tensing or elongating, this can predict an increase in the error for its antagonists, which is the kind of signal that the input correlation rule is meant to detect. Therefore, the Ib afferent (signaling tension) of muscle m sent an excitatory signal to the error unit associated with muscle m, and to the units associated with m’s antagonists. Moreover, this afferent also sent an excitatory projection to the dual of the unit associated with muscle m. Connections from Ia afferents (roughly signaling elongation speed) followed the same pattern, but with slightly smaller connection strengths.
Rotational dynamics
We explain the method used to project the activity of the motor population onto the jPCA plane. For all units we considered the activity during a 0.5 s sample beginning 50 ms after target onset. Unlike Churchland et al., 2012, we did not apply PCA preprocessing, since we only have 12 units. Let r_ijk(t) be the activity at time t of the i-th unit when reaching to the j-th target for the k-th repetition. Averaging over repeated reaches to the same target, and subtracting the grand average over both targets and repetitions, yields the normalized average trace per condition. Let n stand for the number of units, T for the number of time points, and C for the number of targets. Following Churchland et al., 2012, we unroll the normalized traces into a matrix X (with TC rows and n columns), together with the matrix Ẋ of the corresponding derivatives, so we may represent the data through the matrix M that provides the least-squares solution to the problem Ẋ = XM. This solution comes from the equation M = (XᵀX)⁻¹XᵀẊ. Furthermore, this matrix can be decomposed into symmetric and anti-symmetric components, M = M_symm + M_skew, with M_skew = (M − Mᵀ)/2. The jPCA plane comes from the complex conjugate eigenvalues of M_skew with the largest magnitude.
In practice, our source code follows the detailed explanation provided in the Supplementary Information of Churchland et al., 2012, which reformulates this matrix problem as a vector problem.
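Our reading of this matrix-form procedure can be sketched as follows. The function and variable names are ours, and the toy data is a pure planar rotation, not model output; the real analysis follows the vector reformulation in the supplementary material of Churchland et al., 2012.

```python
import numpy as np

def jpca_plane(X, dX):
    """Fit dX ~ X @ M by least squares, take the skew-symmetric part of M,
    and return the plane spanned by the real and imaginary parts of the
    eigenvector with the largest-magnitude imaginary eigenvalue."""
    M, *_ = np.linalg.lstsq(X, dX, rcond=None)
    M_skew = (M - M.T) / 2.0
    evals, evecs = np.linalg.eig(M_skew)        # purely imaginary eigenvalues
    order = np.argsort(-np.abs(evals.imag))
    v = evecs[:, order[0]]                       # complex eigenvector
    plane = np.stack([v.real, v.imag])           # two real basis vectors
    plane /= np.linalg.norm(plane, axis=1, keepdims=True)
    return plane

# toy check: a pure rotation in the first two of six dimensions should be
# captured almost entirely by the recovered plane
t = np.linspace(0, 2 * np.pi, 200)
X = np.zeros((200, 6))
X[:, 0], X[:, 1] = np.cos(t), np.sin(t)
dX = np.gradient(X, t, axis=0)
plane = jpca_plane(X, dX)
var_captured = np.sum((X @ plane.T) ** 2) / np.sum(X ** 2)
```

The fraction `var_captured` plays the role of the "variance in the first jPCA plane" reported in the Results.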
Parameter search
We kept all parameter values in a range where they still made biological sense. Parameter values that were not constrained by biological data were adjusted using a genetic algorithm and particle swarm optimization (PSO). We used a separate optimization run for each configuration, consisting of roughly 30 iterations of the genetic and PSO algorithms, with population sizes of 90 and 45 individuals, respectively. After this we manually adjusted the gain of the control loop by increasing or decreasing the slope of the sigmoidal units in selected populations. This is further described in the Appendix.
The parameters used can affect the results in the paper. We chose parameters that minimized either the error during the second half of the learning phase, or the error during center-out reaching. Both of these measures are agnostic to the other results.
Preferred direction vectors
Next we describe how PD vectors were obtained.
Let f_ij denote the firing rate of the i-th unit when reaching for the j-th target, averaged over 4 s and across reaches to the same target. We created a function that mapped the X,Y coordinates of each target to its corresponding firing rate. Next we approximated the gradient of this function; the PD vector of each unit points in the direction in which its firing rate grows fastest.
In order to predict the PD vectors, we first obtained for each muscle the ‘direction of maximum contraction’, described verbally in panel B of Figure 5. More formally, let l_kj denote the length of the k-th muscle when the hand is at target j, and let l_k^c denote its length when the hand is at the center location. Let û_j denote the unit vector with base at the center location, pointing in the direction of the j-th target. The direction of maximum length change for the k-th muscle comes from the following vector sum:
d_k = Σ_j c_kj û_j    (24)
where c_kj = l_k^c − l_kj.
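Equation 24 amounts to a contraction-weighted sum of target directions, which can be sketched directly. The function name and the toy lengths below are illustrative assumptions.

```python
import numpy as np

def max_contraction_direction(lengths, center_length, target_dirs):
    """Eq. 24 sketch: weight each unit target-direction vector by how much
    the muscle shortens there (center length minus length at that target)
    and sum. lengths: (n_targets,) muscle length at each target;
    target_dirs: (n_targets, 2) unit vectors from center to each target."""
    c = center_length - lengths            # contraction for each target
    return (c[:, None] * target_dirs).sum(axis=0)

# toy example: a muscle that shortens for the rightward target and
# lengthens for the leftward one yields a rightward-pointing vector
dirs = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
lens = np.array([0.9, 1.0, 1.1, 1.0])     # hypothetical lengths, center = 1.0
d = max_contraction_direction(lens, 1.0, dirs)
```

The predicted PD of a unit (Equation 25) is then just a weighted sum of these d_k vectors using the unit's input weights.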
For the i-th unit, the predicted PD vector comes from a linear combination of the d_k vectors, using the same weights as the unit’s inputs:
p_i = Σ_k w_ik d_k    (25)
To obtain the main axis of the PD distribution, the k-th PD vector was written in polar form (r_k, θ_k), with θ_k ∈ (−π, π]. We reflected vectors in the lower half-plane using the rule: θ′_k = θ_k + π if θ_k < 0, and θ′_k = θ_k otherwise. The angle of the main axis was the angle of the average PD vector using these modified angles.
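The main-axis computation described above can be sketched as follows; the function name is ours.

```python
import numpy as np

def main_axis_angle(pd_vectors):
    """Main axis of a set of PD vectors: reflect vectors in the lower
    half-plane by adding pi to their angle, then take the angle of the
    magnitude-weighted average of the reflected vectors, in degrees."""
    r = np.linalg.norm(pd_vectors, axis=1)
    theta = np.arctan2(pd_vectors[:, 1], pd_vectors[:, 0])
    theta = np.where(theta < 0, theta + np.pi, theta)   # reflect lower half
    mean_vec = (r[:, None] * np.stack([np.cos(theta), np.sin(theta)],
                                      axis=1)).sum(axis=0)
    return np.degrees(np.arctan2(mean_vec[1], mean_vec[0]))

# two opposite vectors define an axis: 30 and 210 degrees give a
# 30-degree main axis
pds = np.array([[np.cos(np.radians(30)), np.sin(np.radians(30))],
                [np.cos(np.radians(210)), np.sin(np.radians(210))]])
angle = main_axis_angle(pds)
```

The reflection step is what makes vectors pointing in opposite directions reinforce, rather than cancel, each other.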
Statistical tests
To find whether units were significantly tuned to reach direction we used a bootstrap procedure. For each unit we recomputed the length of its PD vector 10,000 times with the identity of the target for each reach randomly shuffled. We considered a unit significantly tuned when the length of its true PD vector was longer than 99.9% of these random samples.
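A minimal version of this shuffle test is sketched below. For the sketch we compute the PD as a rate-weighted sum of target directions, which is a simplification of the gradient-based PD described above; all names and the toy tuning curves are our assumptions.

```python
import numpy as np

def pd_vector(rates, target_dirs):
    """Simplified PD: rate-weighted sum of unit target-direction vectors."""
    return (rates[:, None] * target_dirs).sum(axis=0)

def is_tuned(rates, target_dirs, n_boot=10_000, alpha=0.001, seed=0):
    """Bootstrap test: shuffle which rate goes with which target and ask
    whether the true PD length beats the (1 - alpha) null percentile."""
    rng = np.random.default_rng(seed)
    true_len = np.linalg.norm(pd_vector(rates, target_dirs))
    null = np.array([np.linalg.norm(pd_vector(rng.permutation(rates),
                                              target_dirs))
                     for _ in range(n_boot)])
    return np.mean(null < true_len) >= 1 - alpha

angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
tuned_rates = 1.0 + np.cos(angles)    # cosine-tuned toy unit
flat_rates = np.ones(8)               # untuned toy unit
```

With only 8 targets the permutation null is coarse, so in a toy setting a relaxed alpha gives a more robust demonstration than the 99.9% criterion used with the real data.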
To obtain the coefficient of determination for the predicted PD angles, let θ_i denote the angle of the true PD for the i-th unit, and θ̂_i the angle of its predicted PD. We obtained residuals for the angles as the angle of the smallest rotation that turns one angle into the other. Each residual was then scaled by the norm of its corresponding PD vector, to account for the fact that these were not homogeneous. Denoting these scaled residuals by r̃_i, the residual sum of squares is SS_res = Σ_i r̃_i². The total sum of squares is SS_tot = Σ_i (θ_i − θ̄)², where θ̄ is the mean of the true angles. The coefficient of determination comes from the usual formula R² = 1 − SS_res / SS_tot.
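This angular R² can be sketched compactly; the angle wrapping is the only subtle step. The function name is ours.

```python
import numpy as np

def angle_r2(theta_true, theta_pred, norms):
    """Coefficient of determination for predicted PD angles, with residuals
    wrapped to the smallest rotation and scaled by each PD vector's norm."""
    res = (theta_true - theta_pred + np.pi) % (2 * np.pi) - np.pi
    res = res * norms                      # weight residuals by PD length
    ss_res = np.sum(res ** 2)
    ss_tot = np.sum((theta_true - theta_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

The modulo wrap makes a prediction off by a full turn count as a perfect prediction, which matches the "smallest rotation" definition of the residual.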
To assess bimodality of the PD distribution we used a version of the Rayleigh statistic adapted to bimodal distributions whose two modes are oriented 180 degrees from each other, introduced in Lillicrap and Scott, 2013. This test consists of finding a modified Rayleigh statistic in which the PD angles are doubled before computing the length of the mean resultant vector:
R = (1/n) |Σ_k exp(2iθ_k)|    (26)
where the θ_k are the angles of the PDs. A bootstrap procedure is then used, in which this statistic is computed 100,000 times with angles sampled from the uniform distribution on the (−π, π] interval. The PD distribution was deemed significantly bimodal if its R value was larger than 99.9% of the random values.
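The angle-doubling statistic of Equation 26 and its bootstrap can be sketched as follows; the function names are ours, and the toy data is perfectly axial by construction.

```python
import numpy as np

def bimodal_rayleigh(angles):
    """Rayleigh statistic on doubled angles: 1 for perfectly axial
    (180-degree bimodal) data, near 0 for uniformly scattered angles."""
    return np.abs(np.exp(2j * np.asarray(angles)).mean())

def is_bimodal(angles, n_boot=100_000, alpha=0.001, seed=0):
    """Bootstrap: compare the statistic against uniform random angles."""
    rng = np.random.default_rng(seed)
    r_true = bimodal_rayleigh(angles)
    null = np.array([bimodal_rayleigh(rng.uniform(-np.pi, np.pi,
                                                  len(angles)))
                     for _ in range(n_boot)])
    return np.mean(null < r_true) >= 1 - alpha

# 12 toy PDs split between one direction and its opposite
axial = np.concatenate([np.full(6, 0.4), np.full(6, 0.4 + np.pi)])
```

Doubling the angles maps two modes 180 degrees apart onto a single mode, so the ordinary Rayleigh machinery applies.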
We used a bootstrap test to find whether there was statistical significance to the linear addition of direction fields. To make this independent of the individual pair of locations stimulated, we obtained the direction fields for all 15 possible pairs of locations, and for each pair calculated the mean angle difference between and as described in the main text. We next obtained the mean of these 15 average angle deviations, to obtain a global average angle deviation .
We then repeated this procedure 400 times with the identities of the stimulation sites shuffled, obtaining 400 global average angle deviations. We declared statistical significance if the true global average angle deviation was smaller than 99% of the shuffled values.
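The steps above can be sketched as follows. The data layout and names are illustrative: `single[s]` is the direction field evoked by stimulating site `s` alone, sampled at the same workspace points as `combined[(a, b)]`, the field evoked by stimulating `a` and `b` together:

```python
import numpy as np
from itertools import combinations

def angle_between(u, v):
    # Unsigned angle between paired 2-D vectors, rows of u and v.
    cosang = np.sum(u * v, axis=-1) / (np.linalg.norm(u, axis=-1)
                                       * np.linalg.norm(v, axis=-1))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

def global_deviation(single, combined):
    # Mean angle between each jointly evoked field and the sum of the two
    # individually evoked fields, averaged over all pairs.
    devs = [np.mean(angle_between(f_ab, single[a] + single[b]))
            for (a, b), f_ab in combined.items()]
    return np.mean(devs)

def linearity_pvalue(single, combined, n_shuffle=400, seed=0):
    # Shuffle the identities of the stimulation sites; the fraction of
    # shuffles with a deviation at or below the true one acts as a p-value.
    rng = np.random.default_rng(seed)
    d_true = global_deviation(single, combined)
    sites = list(single)
    count = 0
    for _ in range(n_shuffle):
        perm = rng.permutation(sites)
        shuffled = dict(zip(sites, (single[s] for s in perm)))
        count += global_deviation(shuffled, combined) <= d_true
    return count / n_shuffle
```

With the 99% criterion of the text, linearity is declared significant when the returned fraction is below 0.01.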
© 2022, Verduzco-Flores and De Schutter. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
How dynamic interactions between nervous system regions in mammals perform online motor control remains an unsolved problem. In this paper, we show that feedback control is a simple, yet powerful way to understand the neural dynamics of sensorimotor control. We make our case using a minimal model comprising spinal cord, sensory and motor cortex, coupled by plastic long-range connections. Starting from scratch, the model learns to perform reaching movements in several directions with a planar arm actuated by 6 muscles. The model satisfies biological plausibility constraints: neural implementation, transmission delays, local synaptic learning, and continuous online learning. Using differential Hebbian plasticity, the model goes from motor babbling to reaching arbitrary targets in less than 10 min of in silico time. Moreover, independently of the learning mechanism, properly configured feedback control has many emergent properties: neural populations in motor cortex show directional tuning and oscillatory dynamics, the spinal cord creates convergent force fields that add linearly, and movements are ataxic (as in a motor system without a cerebellum).