Uncertainty is critical in the measure of information and in assessing the accuracy of predictions. It is determined by probability P, being maximal at P = 0.5 and decreasing at higher and lower probabilities. Using distinct stimuli to indicate the probability of reward, we found that the phasic activation of dopamine neurons varied monotonically across the full range of probabilities, supporting past claims that this response codes the discrepancy between predicted and actual reward. In contrast, a previously unobserved response covaried with uncertainty and consisted of a gradual increase in activity until the potential time of reward. The coding of uncertainty suggests a possible role for dopamine signals in attention-based learning and risk-taking behavior.
The brain continuously makes predictions and compares outcomes (or inputs) with those predictions (1-4). Predictions are fundamentally concerned with the probability that an event will occur within a specified time period. It is only through a rich representation of probabilities that an animal can infer the structure of its environment and form associations between correlated events (4-7). Substantial evidence indicates that dopamine neurons of the primate ventral midbrain code errors in the prediction of reward (8-10). In the simplified case in which reward magnitude and timing are held constant, prediction error is the discrepancy between the probability P with which reward is predicted and the actual outcome (reward or no reward). Thus, if dopamine neurons code reward prediction error, their activation after reward should decline monotonically as the probability of reward increases. However, in varying probability across its full range (P = 0 to 1), a fundamentally distinct parameter is introduced. Uncertainty is maximal at P = 0.5 but absent at the two extremes (P = 0 and 1) and is critical in assessing the accuracy of a prediction. We examined the influence of reward probability and uncertainty on the activity of primate dopamine neurons.
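The two quantities contrasted above can be illustrated numerically. The following is only a minimal sketch, not the analysis used in the study: it assumes reward magnitude is normalized to 1, takes prediction error as the outcome minus P, and uses the variance and Shannon entropy of a binary outcome as two possible measures of uncertainty. Those two measures and all function names are illustrative assumptions; the text here states only that uncertainty is maximal at P = 0.5 and absent at P = 0 and 1.

```python
import math

# The five reward probabilities named in the text.
PROBS = [0.0, 0.25, 0.5, 0.75, 1.0]


def prediction_error(p, rewarded):
    """Outcome (1 = reward, 0 = no reward) minus predicted probability p.

    Assumes reward magnitude is normalized to 1 (illustrative choice).
    """
    return (1.0 if rewarded else 0.0) - p


def variance(p):
    """Variance of a binary (Bernoulli) outcome; one candidate uncertainty measure."""
    return p * (1.0 - p)


def entropy_bits(p):
    """Shannon entropy of a binary outcome in bits; another candidate measure."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


if __name__ == "__main__":
    print("P     error_if_rewarded  variance  entropy_bits")
    for p in PROBS:
        print(f"{p:<5} {prediction_error(p, True):<18} "
              f"{variance(p):<9} {entropy_bits(p):.3f}")
```

Under these assumptions, the error after a delivered reward falls steadily from 1 at P = 0 to 0 at P = 1, whereas both uncertainty measures rise to a peak at P = 0.5 and return to zero at the extremes. This is why varying P across its full range can separate the two signals.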
Two monkeys were conditioned in a Pavlovian procedure with distinct visual stimuli indicating the probability (P = 0, 0.25, 0.5, 0.75, and 1.0) of liquid reward being delivered after a 2-s delay (11). Anticipatory licking responses during the interval between stimulus and reward increased with the probability of reward (Fig. 1), indicating that the animals discriminated the stimuli behaviorally. However, at none of the intermediate...