PERSPECTIVES
Several recent studies have encouraged a revision of our views on the nature of the phasic dopamine reward response. The studies demonstrate distinct subcomponents of the phasic dopamine response [33], provide an alternative explanation for activations in response to aversive stimuli [25–27] and document strong sensitivity to some unrewarded stimuli [34]. Here, I outline and evaluate the evidence for a more elaborate view of the phasic dopamine reward prediction-error signal, which evolves from an initial response that unselectively detects any potential reward (including stimuli that turn out to be aversive or neutral) to a subsequent main component that codes the by now well-identified reward value. Furthermore, I suggest that the reward prediction-error response should be specifically considered to be a utility prediction-error signal [35].
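As an illustration only, the two-component view can be sketched in a few lines of Python. The sketch assumes that the initial component scales with a stimulus's physical salience regardless of its value, and that the main component is the difference between received and predicted utility. The power-law utility function, the exponent `rho`, the salience input and all function names are hypothetical choices made for this example, not quantities taken from the studies discussed.

```python
def utility(reward_magnitude, rho=0.5):
    """Hypothetical concave utility of a non-negative reward magnitude."""
    return reward_magnitude ** rho


def prediction_error(received, predicted):
    """Prediction error: what was received minus what was predicted."""
    return received - predicted


def two_component_response(salience, received_utility, predicted_utility):
    """Return (initial, main) components for one stimulus.

    initial: unselective detection, assumed here to depend only on
             physical salience and not on value.
    main:    utility prediction error, computed once the stimulus
             has been identified and valued.
    """
    initial = salience
    main = prediction_error(received_utility, predicted_utility)
    return initial, main


# A reward larger than predicted, a fully predicted reward and an omitted
# reward, all signalled by stimuli of the same physical salience.
better = two_component_response(0.8, utility(1.0), utility(0.25))
as_predicted = two_component_response(0.8, utility(0.25), utility(0.25))
omitted = two_component_response(0.8, utility(0.0), utility(0.25))
```

In this sketch the initial component is identical for all three stimuli, whereas the main component is positive, zero or negative depending on how the received utility compares with the prediction.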
Processing of reward components
Reward components. Rewards consist of distinct sensory and value components (FIG. 1). Their neuronal processing takes time and engages sequential mechanisms, which becomes particularly evident when the rewards consist of more-complex objects and stimuli. Rewards first impinge on the body through their physical sensory impact. They draw attention through their physical salience, which facilitates initial detection. The specific identity of rewards derives from their physical parameters, such as size, form, colour and position, which engage subsequent sensory and cognitive processes. Comparison with known objects determines their novelty, which draws attention through novelty salience and surprise salience. During and after their identification, valuation takes place. Value is the essential feature that distinguishes rewards from other objects and stimuli; it can be estimated from behavioural preferences that are elicited in choices. Value draws attention because it provides motivational salience. The various forms of salience (physical, novelty, surprise and motivational) induce stimulus-driven attention, which selects information and modulates neuronal processing [36–40]. Thus, neuronal reward processing evolves in time from unselective sensory detection to the more demanding and crucial stages of identification and valuation. These processes lead to internal
OPINION
Dopamine reward prediction-error signalling: a two-component response
Wolfram Schultz
Abstract | Environmental stimuli and objects, including rewards, are often processed sequentially in the brain. Recent work suggests that the phasic dopamine reward prediction-error response follows a similar sequential pattern. An initial brief, unselective and highly sensitive increase in activity unspecifically detects a wide range of environmental stimuli, then quickly evolves into the main response component,...