J Neurosci. 2003 Oct 29;23(30):9913-23.
doi: 10.1523/JNEUROSCI.23-30-09913.2003.

Correlated coding of motivation and outcome of decision by dopamine neurons

Takemasa Satoh et al. J Neurosci. 2003.

Abstract

We recorded the activity of midbrain dopamine neurons in an instrumental conditioning task in which monkeys made a series of behavioral decisions on the basis of distinct reward expectations. Dopamine neurons responded to the first visual cue that appeared in each trial [conditioned stimulus (CS)], with which monkeys initiated a trial for decision while expecting a trial-specific reward probability and volume. The magnitude of neuronal responses to the CS was approximately proportional to reward expectations but with considerable discrepancy. In contrast, CS responses appear to represent motivational properties, because their magnitude in trials with identical reward expectation had a significant negative correlation with the reaction times of the animal after the CS. Dopamine neurons also responded to reinforcers that occurred after behavioral decisions, and the responses precisely encoded positive and negative reward expectation errors (REEs). The gain of coding REEs by spike frequency increased during the learning of act-outcome contingencies through a few months of task training, whereas coding of motivational properties remained consistent during the learning. We found that the magnitude of CS responses was positively correlated with that of responses to reinforcers, suggesting a modulation of the effectiveness of REEs as a teaching signal by motivation. For instance, the rate of learning could be faster when animals are motivated, whereas it could be slower when they are less motivated, even at identical REEs. Therefore, the dual correlated coding of motivation and REEs suggested the involvement of the dopamine system both in reinforcement, in more elaborate ways than currently proposed, and in motivational function in reward-based decision-making and learning.
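The suggestion in the abstract that motivation modulates the effectiveness of REEs as a teaching signal can be illustrated with a toy value update. This is only a sketch under stated assumptions, not the authors' model: the delta-rule form, the function names, the `base_rate` value, and the multiplicative `motivation_gain` are all hypothetical choices made for illustration.

```python
def update_value(value, reward, motivation_gain, base_rate=0.1):
    """One delta-rule step (assumed form, not taken from the paper).

    The reward expectation error (REE) is the obtained reward minus the
    current expectation; the hypothetical motivation_gain scales how
    strongly that error changes the expectation.
    """
    ree = reward - value
    return value + base_rate * motivation_gain * ree


def trials_to_learn(motivation_gain, reward=1.0, target=0.9, base_rate=0.1):
    """Count the updates needed for the expectation to reach target * reward."""
    value = 0.0
    trials = 0
    while value < target * reward:
        value = update_value(value, reward, motivation_gain, base_rate)
        trials += 1
    return trials


if __name__ == "__main__":
    # Identical rewards, and therefore identical REEs at equal expectation,
    # but different assumed motivation levels.
    print("high motivation:", trials_to_learn(motivation_gain=2.0))
    print("low motivation:", trials_to_learn(motivation_gain=0.5))
```

Under these assumptions, the same REE produces a larger change in expectation when the gain is high, so the high-motivation case reaches the target in fewer trials.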

Figures

Figure 1.
Behavioral task, trial types, and percentage of correctness at each trial type. A, Illustration of sensorimotor events that appeared during a single trial (see details in Materials and Methods). B, Two epochs (trial-and-error and repetition epochs) and five trial types (N1, N2, N3, R1, R2) in a block of trials classified on the basis of correct and incorrect button choices. C, Correct choice rate over the 7 month study as a function of trial type in monkey DN. The results are expressed as means and SD of all trials during which all DA neuron activity was recorded.
Figure 2.
Task performance in the partially learned and fully learned stages. A, Correct choice rate against the trial types in partially learned (1–36 d) and fully learned (37–215 d) stages in monkey DN. B, Same as A but in the partially learned (1–15 d) and fully learned (16–95 d) stages for monkey SK. C, Average RTs for the start LED at each trial type in monkey DN. Error bars indicate SD. D, Same as C but for monkey SK. E, Superimposed traces of orofacial muscle activity during three incorrect trial types (N1, N2, N3) (left) and average traces during five correct (N1, N2, N3, R1, R2) and three incorrect (N1, N2, N3) trial types (right) in monkey DN. BEEP indicates the onset of the beep sound after the animal's choices.
Figure 3.
Electrophysiological and histological identification of DA neurons. A, Left, Superimposed traces of extracellularly recorded action potentials of DA neurons (SNc) and non-DA neurons (SNr). The two vertical lines and the horizontal interrupted line indicate how the duration of the action potential was measured. Right, Histograms of the duration of recorded action potentials. B, Histological reconstruction of the recording sites of DA neurons (filled circles) and non-DA neurons (blue lines) along electrode tracks in and around the SNc. Stars indicate locations of electrolytic microlesion marks. Scale bar, 2 mm. C, A Nissl-stained section at the level of the SN is shown (scale bar, 1 mm) (top), and part (interrupted circle) of the neighboring TH-stained section is shown at higher magnification (scale bar, 100 μm) (bottom). White arrows, TH-immunoreactive neurons; M, part of a lesion mark.
Figure 4.
Response of DA neurons to the start LED (CS). A, Activity of a single DA neuron recorded in the SNc of monkey DN before and after the CS in the five trial types. Impulse discharges that occurred during the individual trial types are represented separately as rasters and histograms. The activity is centered at the onset of the CS (vertical interrupted line). The trials in the raster display were reordered on the basis of the time interval between onset of the CS and depression of the start button. The time point of the button press in each trial is marked on the raster. B, Population response histograms of 52 DA neurons to the CS in monkey DN. C, Average increase in the discharge rate of 52 DA neurons during the fixed time window indicated by the shaded areas in each histogram in B, relative to the discharge rate over the 500 msec period just preceding the onset of the CS. The results are shown as means ± SE in monkey DN. On the response histogram are superimposed curves of reward expectations, as a probability (open squares) and a product of probability and volume of reward (filled circles) (see Results for explanation). The scale of the reward expectation on the ordinate on the right side is for the product of probability and volume of reward. D, Same as C but for 56 DA neurons in monkey SK. The bin width of the histograms was 15 msec.
Figure 5.
Relationship of response magnitudes of DA neurons to briskness of behavioral responses to the CS. A, Population response histograms of the 56 DA neurons in monkey SK to the CS during R2 trials. Histograms are separated on the basis of the trials with short, middle, and long RTs to CS. The number in parentheses indicates the number of trials involved in each histogram. B, Correlation of the magnitude of neural responses to the CS in 52 DA neurons in monkey DN to RTs to depress the start button after the CS. The correlations are plotted separately in N1, N2, R1, and R2 trials. The results of N3 trials are not plotted because of the very small number of trials. The trials were classified into five groups on the basis of the RTs, and the mean and SEM of DA neuron responses in these groups of trials are plotted. C, Same as B but for monkey SK on the basis of the RTs of trials during recording of 56 DA neurons. Because the RTs in monkey SK were shorter than those in monkey DN by an average of ∼80 msec, the ranges of RTs in three groups of trials in monkey SK were shifted to shorter RTs from those in monkey DN. D, Correlation of the magnitude of responses to the CS with RTs in each trial in monkey DN. The correlation analysis was performed on 854 trials from 27 neurons showing significant responses to the CS. E, Correlation between average CS responses of single neurons and average RTs in monkey SK (56 neurons).
Figure 6.
Responses of DA neurons to reinforcers after the animal's choices at each task trial. A, Activity of a representative DA neuron at correct and incorrect choices in the five trial types. The displays are centered at the onset of the reinforcers (vertical interrupted lines). The trials in the raster display were reordered according to the time interval between the GO signal and onset of the reinforcers, and the time point of the GO signal in each trial is marked on the raster display. RELEASE indicates the time point at which the monkey released the start button to depress one of the target buttons. B, Population response histograms of 52 DA neurons in monkey DN during correct and incorrect choices in the five trial types. The number in parentheses indicates the number of trials used to obtain the population response. C, The histogram of responses in monkey DN. The responses are shown as mean and SEM (vertical bar above or below each column) of the increase (correct trials) or decrease (incorrect trials) in the discharge rate during fixed time windows indicated by the shaded area in each histogram in B, relative to the discharge rate during the 500 msec period just preceding the onset of the CS. On the response histogram are superimposed positive and negative REEs (filled circles) derived from product of probability and volume of reward at each trial type (see Materials and Methods). D, Same as C but for monkey SK. Because incorrect trials rarely occurred during the repetition epoch, the neuronal responses and REEs for R1 and R2 trials were either combined and plotted as a single-trial type in monkey DN or not shown in monkey SK.
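The caption above describes REEs derived from the product of probability and volume of reward. A minimal sketch of one way to compute such errors follows, assuming the REE is the delivered reward volume minus the expected volume; the function names and the per-trial-type probabilities and volumes are hypothetical placeholders, not values from the paper.

```python
def reward_expectation(probability, volume):
    """Expected reward as the product of reward probability and reward volume."""
    return probability * volume


def positive_ree(probability, volume):
    """REE on a correct (rewarded) trial: delivered volume minus expectation."""
    return volume - reward_expectation(probability, volume)


def negative_ree(probability, volume):
    """REE on an incorrect (unrewarded) trial: zero reward minus expectation."""
    return 0.0 - reward_expectation(probability, volume)


# Hypothetical (probability, volume) pairs per trial type, for illustration only.
HYPOTHETICAL_TRIALS = {
    "N1": (0.25, 1.0),
    "N2": (0.5, 1.0),
    "N3": (0.75, 1.0),
}

if __name__ == "__main__":
    for name, (p, v) in HYPOTHETICAL_TRIALS.items():
        print(name, positive_ree(p, v), negative_ree(p, v))
```

With these placeholder values, the positive REE shrinks and the negative REE grows in magnitude as the assumed reward probability rises, which is the qualitative pattern the superimposed REE curves are meant to convey.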
Figure 7.
Relationship between responses to the CS and responses to the high-tone beep (positive reinforcer). A, Scatter plot showing a positive correlation between the response to the CS and the response to the positive reinforcer in N2 trials in monkey DN (r = 0.234; slope = 0.125). B, Same as A but for monkey SK (r = 0.524; slope = 0.551).
Figure 8.
Responses of DA neurons at the early partially learned stage and the later fully learned stage. A, Scatter plot of the average responses of DA neurons (mean and SEM) and RTs to depress the start button after the CS. The plots were made for all trials independent of trial type. Trials were divided into five groups on the basis of the RTs. Regression lines are superimposed. B, Histograms of the responses of the DA neurons to the reinforcers after the animal's choices in the partially learned stage and fully learned stage in the five trial types. The values in the incorrect R1 and R2 trials are combined in monkey DN and are not plotted in monkey SK because of the very small number of trials. REEs (mean and SEM) are superimposed on the histograms. The response histograms and REEs are normalized to have the same value at the maximum REE.
