J Neurosci. 2003 Nov 12;23(32):10402-10.
doi: 10.1523/JNEUROSCI.23-32-10402.2003.

Coding of predicted reward omission by dopamine neurons in a conditioned inhibition paradigm

Philippe N Tobler et al. J Neurosci.

Abstract

Animals learn not only about stimuli that predict reward but also about those that signal the omission of an expected reward. We used a conditioned inhibition paradigm derived from animal learning theory to train a discrimination between a visual stimulus that predicted reward (conditioned excitor) and a second stimulus that predicted the omission of reward (conditioned inhibitor). Performing the discrimination required attention to both the conditioned excitor and the inhibitor; however, dopamine neurons showed very different responses to the two classes of stimuli. Conditioned inhibitors elicited considerable depressions in 48 of 69 neurons (median of 35% below baseline) and minor activations in 29 of 69 neurons (69% above baseline), whereas reward-predicting excitors induced pure activations in all 69 neurons tested (242% above baseline), thereby demonstrating that the neurons discriminated between conditioned stimuli predicting reward versus nonreward. The discriminative responses to stimuli with differential reward-predicting but common attentional functions indicate differential neural coding of reward prediction and attention. The neuronal responses appear to reflect reward prediction errors, thus suggesting an extension of the correspondence between learning theory and activity of single dopamine neurons to the prediction of nonreward.
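The correspondence with learning theory mentioned in the abstract can be illustrated with a textbook Rescorla-Wagner update, in which a signed prediction error (reward received minus summed prediction) changes the associative strength of every stimulus present on a trial. This is a minimal sketch, not the authors' analysis: the learning rate `alpha`, the number of passes `n_epochs`, the reward magnitude of 1.0, and the fixed trial order are all assumptions chosen for illustration.

```python
# Rescorla-Wagner sketch of conditioned inhibition (illustrative only).
# alpha, n_epochs, reward magnitude, and trial order are assumptions,
# not values taken from the paper.

def train(trials, alpha=0.1, n_epochs=500):
    """Return associative strengths after n_epochs passes over the trial list."""
    v = {}  # associative strength per stimulus; unseen stimuli start at 0
    for _ in range(n_epochs):
        for stimuli, reward in trials:
            prediction = sum(v.get(s, 0.0) for s in stimuli)
            error = reward - prediction  # signed prediction error
            for s in stimuli:
                v[s] = v.get(s, 0.0) + alpha * error
    return v

trials = [
    (("A",), 1.0),      # A+  : excitor alone, rewarded
    (("A", "X"), 0.0),  # AX- : excitor plus inhibitor, reward omitted
    (("B",), 0.0),      # B-  : control stimulus, never rewarded
    (("B", "Y"), 0.0),  # BY- : control compound, never rewarded
]

v = train(trials)
for name in ("A", "X", "B", "Y"):
    print(name, round(v[name], 3))
```

Under these assumptions the strength of excitor A approaches +1 and that of inhibitor X approaches -1, so the summed prediction on AX- trials approaches zero, while B and Y stay at zero. The negative strength of X is the model's counterpart of a predicted reward omission.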

Figures

Figure 1.
Behavior in the Pavlovian conditioned inhibition paradigm. a, Licking in the six standard trial types after learning: A+, B- pretrained stimuli (licking before and after reward occurrence to A but not to B); AX-, BY- compound stimuli (no licking); X-, Y- test (no licking). Horizontal lines indicate periods of licking in each trial; consecutive trials are shown from top to bottom. All six trial types alternated semirandomly and were separated for display. In this example, average anticipatory licking duration per A+ trial was 596 msec, similar to the overall average of 534 msec shown in Figure 2a. b, Summation test, consisting of presenting established inhibitor X- in compound with separately trained reward predictor C+. Note absence of licking in CX- trials (behavioral inhibition). Occasional CX- and CY- test trials were interspersed with C+ trials. c, Learning retardation test. Acquisition of licking was slower in rewarded X+ trials than in rewarded Y+ trials. The figure shows examples from different animals. The number of licking phases after the reward varied between animals.
Figure 2.
Acquisition of conditioned inhibition. Left, A+, AX-, and BY- trials. No licking occurred in B- trials (data not shown). Middle, CX- and CY- trials. Right, X+ and Y+ trials. The learning curves show licking during 1.5 sec stimulus duration. The unsmoothed data were averaged over four trials in each of the two animals.
Figure 3.
Responses of dopamine neurons in the conditioned inhibition paradigm. a, Response of one dopamine neuron in the six trial types. Note depressions after AX- and X-, and lack of activation by conditioned inhibitor X-. b, A case of neuronal activation by stimulus X-, which is smaller compared with A and followed by depression. c, Averaged population histograms of all 69 neurons tested with stimulus X-. In a and b, dots denote neuronal impulses, referenced in time to onset of stimuli. Each line of dots shows one trial, the original sequence being from top to bottom in each panel. Histograms correspond to sums of raster dots. Bin width = 10 msec. For c, histograms from each neuron normalized for trial number were added, and the resulting sum was divided by the number of neurons. In a-c, all six trial types alternated semirandomly and were separated for display.
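The population histogram described for panel c (per-neuron histograms normalized for trial number, summed, then divided by the number of neurons) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name `population_histogram`, the nested-list data layout, and the toy spike times are hypothetical; only the 10 msec bin width and the order of normalization come from the caption.

```python
# Sketch of the averaging described in the caption (hypothetical data layout).
# neurons: list of neurons; each neuron is a list of trials;
# each trial is a list of spike times in msec relative to stimulus onset.

def population_histogram(neurons, t_start, t_stop, bin_width=10):
    """Average of per-neuron, per-trial-normalized spike count histograms."""
    n_bins = int((t_stop - t_start) // bin_width)
    population = [0.0] * n_bins
    for trials in neurons:
        counts = [0] * n_bins
        for spikes in trials:
            for t in spikes:
                idx = int((t - t_start) // bin_width)
                if 0 <= idx < n_bins:  # ignore spikes outside the window
                    counts[idx] += 1
        # Normalize this neuron's histogram by its number of trials, then add.
        for i in range(n_bins):
            population[i] += counts[i] / len(trials)
    # Divide the summed histogram by the number of neurons.
    return [p / len(neurons) for p in population]

# Toy data (made up): two neurons with different trial counts.
neurons = [
    [[5, 15], [12]],  # neuron 1, two trials
    [[3, 7]],         # neuron 2, one trial
]
hist = population_histogram(neurons, t_start=0, t_stop=30)
print(hist)  # [1.25, 0.5, 0.0] spikes per bin per trial, averaged over neurons
```

Normalizing by trial number before summing keeps a neuron recorded over many trials from dominating the population average.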
Figure 4.
Positions of dopamine neurons tested with the conditioned inhibitor (+ denotes significantly depressed neurons). Cells from two animals are superimposed on coronal sections of one animal (reconstructions from cresyl violet-stained sections). SNpc, Substantia nigra pars compacta; SNpr, substantia nigra pars reticulata; Ant 8.0-10.0, levels anterior to the interaural stereotaxic line.
Figure 5.
Extinction of reward-predicting stimulus removes small dopamine activation to conditioned inhibitor. a, Extinction training (top to bottom). Licking and neuronal responses to stimulus A- subsided gradually. Bin width = 10 msec. b, Average population responses in 23 dopamine neurons to A-, AX-, and X- were abolished after extinction of A. Responses to free reward were preserved in the same neurons (bottom). Trial types with conditioned stimuli alternated semirandomly and were separated for display. Control stimuli B-, BY-, and Y- continued to not elicit any responses (data not shown). c, Preserved conditioned behavioral inhibition after extinction of reward-predicting stimulus, as evidenced by adding the established conditioned inhibitor X- to the established reward-predicting stimulus C+ (summation test). Animals licked in initial CY- trials, indicating reward prediction but not yet conditioned inhibition (bottom).
Figure 6.
Dopamine coding of prediction errors in conditioned inhibition paradigm. a, Transfer of dopamine depression during inhibitory conditioning in AX- trials from the habitual reward time (arrow) to the conditioned compound stimulus AX- (bottom). The activation to AX- was lower compared with A+. Horizontal lines in rasters indicate periods of licking. b, Summation test reveals that established inhibitor X- functions independently from its original partner of compound training. A pretrained reward-predicting stimulus C+ activated dopamine neurons (top). When X- was added to this pretrained reward predictor, dopamine neurons were depressed at the time of conditioned stimuli but not at the time of expected nonreward (middle). In initial CY- control trials (bottom), dopamine neurons were depressed at time of omitted reward (arrow). c, Surprising reward delivery in a learning retardation test reveals neural coding of stronger reward prediction error after conditioned inhibitor X compared with neutral stimulus Y. d, Absence of activation of dopamine neurons at usual time of reward in trials with inhibitory stimulus X- and neutral stimulus Y-. Bin width = 10 msec.
Figure 7.
Schematic of observed results versus hypothetical attention coding by phasic dopamine responses in the conditioned inhibition paradigm. Stimulus A+ predicted reward and attracted attention. Compound presentation of stimuli A+ and X- led to reward omission and induced behavioral inhibition. Stimulus X- was the conditioned inhibitor that predicted reward omission and attracted attention. Thus stimuli A+ and X- had differential reward prediction but commonly attracted attention. With attentional coding they should be expected to elicit similar activating responses (middle), whereas with reward coding opposing responses were expected and observed (left). Dopamine neurons responded to stimulus A+, compound AX-, and stimulus X- in a manner compatible with reward coding (left) but not attentional coding (middle). Control stimuli B- and Y- were not associated with reward or specific attention and did not induce neuronal responses (right). Note that attentional theories of learning suggest that unidirectional but not bidirectional prediction errors drive learning through attention and would predict activations at the time of unexpected reward omission in AX- trials before learning.
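The contrast drawn in the caption between bidirectional (signed) and unidirectional (unsigned) prediction errors can be made concrete with a few lines of arithmetic. This is a hedged sketch: the reward and prediction values of 0 and 1 are idealized assumptions, and the unsigned form is written in the style of Pearce-Hall attentional models rather than taken from the paper.

```python
# Signed versus unsigned prediction errors for three idealized events.
# Reward and prediction values (0 or 1) are assumptions for illustration.

def signed_error(reward, prediction):
    """Bidirectional error: positive for surprise reward, negative for omission."""
    return reward - prediction

def unsigned_error(reward, prediction):
    """Unidirectional error (Pearce-Hall style): magnitude of surprise only."""
    return abs(reward - prediction)

events = {
    "unexpected reward": (1.0, 0.0),
    "fully predicted reward": (1.0, 1.0),
    # Early AX- trial: A still predicts reward, X has no value yet.
    "unexpected omission": (0.0, 1.0),
}

results = {
    name: (signed_error(r, p), unsigned_error(r, p))
    for name, (r, p) in events.items()
}
for name, (signed, unsigned) in results.items():
    print(f"{name}: signed={signed:+.1f}, unsigned={unsigned:.1f}")
```

The two schemes agree for unexpected and fully predicted reward but diverge at unexpected omission, where the signed error is negative and the unsigned error is positive. That divergence is why an attentional account would predict an activation, rather than a depression, at the time of omitted reward in early AX- trials.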
