Abstract
Comodulation masking release (CMR) enhances the detection of signals embedded in wideband, amplitude-modulated maskers. At least part of the CMR is attributable to across-frequency processing; however, the relative contribution of different stages in the auditory system to across-frequency processing is unknown. We have measured the responses of single units from one of the earliest stages in the ascending auditory pathway, the ventral cochlear nucleus, where across-frequency processing may take place. A sinusoidally amplitude-modulated tone at the best frequency of each unit was used as a masker. A pure tone signal was added in the dips of the masker modulation (reference condition). Flanking components (FCs) were then added at frequencies remote from the unit best frequency. The FCs were pure tones amplitude modulated either in phase (comodulated) or out of phase (codeviant) with the on-frequency component. Psychophysically, this CMR paradigm reduces within-channel cues while producing an advantage of ∼10 dB for the comodulated condition in comparison with the reference condition. Some of the recorded units showed responses consistent with perceptual CMR. The addition of the comodulated FCs produced a strong reduction in the response to the masker modulation, making the signal more salient in the poststimulus time histograms. A decision statistic based on d′ showed that threshold was reached at lower signal levels for the comodulated condition than for reference or codeviant conditions. The neurons that exhibited such a behavior were mainly transient chopper or primary-like units. The results obtained from a subpopulation of transient chopper units are consistent with a possible circuit in the cochlear nucleus consisting of a wideband inhibitor contacting a narrowband cell. A computational model was used to confirm the feasibility of such a circuit.
Keywords: chopper unit, onset unit, lateral inhibition, cochlear nucleus, multipolar cell, wideband inhibitor
Comodulation masking release (CMR) enables the detection of an otherwise masked signal by the addition of coherently amplitude-modulated energy above and/or below the signal frequency (Hall et al., 1984) (for review, see Hall et al., 1995). For human listeners, CMR can occur when energy is added in frequency regions remote from the signal, thus exciting distinct tonotopic channels (Moore et al., 1990; Cohen, 1991). Such a combination of information across frequencies could be a powerful survival strategy in the natural world, where many environmental sounds contain coherent low-frequency amplitude modulations (Richards and Wiley, 1980; Klump, 1996; Nelken et al., 1999). A process akin to CMR may therefore prove beneficial to animals in detecting calls or discrete events in noisy backgrounds. In support of this idea, both starlings (Klump and Langemann, 1995; Langemann and Klump, 2001) and gerbils (Klump et al., 2001) can exhibit a large behavioral CMR.
There are different hypotheses to explain the across-frequency component of CMR. The dip-listening hypothesis assumes that the off-frequency representation of the masker envelope cues the listeners as to when to “listen” to have a more favorable signal-to-noise ratio (Buus, 1985). Alternatively, an equalization-cancellation process could reveal the presence of the signal by subtraction of the envelope present in remote frequency channels from the masker channel (Buus, 1985). Some authors have also proposed that CMR relies on multiple cues (Hall and Grose, 1988; Fantini et al., 1993) and may involve high-level auditory grouping strategies (Grose and Hall, 1993).
The physiological substrate for CMR is unknown; however, several studies have looked at various aspects of the phenomenon. At the level of the auditory nerve, single fibers can demonstrate a release from masking when the masker envelope is strongly modulated (Mott et al., 1990). These results are similar to the psychophysical results of Carlyon et al. (1989), who showed a large difference in signal detectability between modulated and unmodulated maskers; this effect, however, persisted for narrowband maskers whose energy fell within a critical band. Therefore, this was probably not an across-frequency CMR.
Using a single band of noise as a masker, recordings from single units in the cat's primary auditory cortex have shown a masking release when the noise band was broad and coherently amplitude-modulated (Nelken et al., 1999). In this study, the detection cue was the disruption of the envelope-following response of the neuron by the introduction of the signal. Although there is a similarity between modulated broadband noise and environmental sounds, it is not clear how much of the masking release is attributable to across-channel processing and how much is attributable to within-channel processing (Carlyon et al., 1989; Verhey et al., 1999). A masking release has also been observed from multiunit clusters in the forebrain of the starling when using discrete, narrow bands of noise as maskers (Nieder and Klump, 2001). They reported some clusters showing substantial CMR (up to 17 dB) although, intriguingly, the positioning of the flanking bands in the inhibitory sidebands of each recording site was not necessary for obtaining the effect.
In the present study, we have recorded the responses from single units at one of the earliest stages in the central auditory pathway in which across-frequency processing could occur, the ventral cochlear nucleus. The stimuli were chosen to reduce within-channel cues while still producing a CMR, in humans, of ∼10 dB (Grose and Hall, 1989; Moore et al., 1990; Delahaye, 1999). Single units classified as transient choppers, primary-like, or low best frequency could show discharge patterns compatible with a CMR. Onset units were more likely to respond well to the modulation but poorly to the signal. A model of a simple neural circuit that could underlie such responses is shown to account for these data.
MATERIALS AND METHODS
Physiology. The data reported in this paper were recorded from pigmented guinea pigs weighing between 333 and 442 gm. Animals were anesthetized with urethane (1.5 gm/kg, i.p.), and supplementary analgesia was provided by either Operidine (1 mg/kg, i.m.) or Hypnorm (1 mg/kg, i.m.). All animals were given atropine sulfate (0.06 mg/kg, s.c.) as a premedication. Additional doses of urethane and the analgesic were given when required.
The surgical preparation and stimulus presentation took place in a sound-attenuated chamber (Industrial Acoustics Company). All animals were tracheotomized, and core temperature was maintained at 38°C with a heating blanket. After placement in the stereotaxic apparatus, a midline incision of the scalp was made, and the skin was retracted laterally. The temporalis muscle on the left-hand side of the skull was removed, and the bulla was exposed. The method of stereotaxic positioning follows that previously reported (Winter and Palmer, 1990a,b). No histological verification of recording position was undertaken, but for the following reasons we are confident that all the units reported in this paper were recorded from the ventral division of the cochlear nucleus: the stereotaxic coordinates were identical to those used in previous studies in the ventral and anteroventral cochlear nucleus (Winter and Palmer, 1990a,b, 1995), and electrode tracks sometimes coursed their way through the dorsal cochlear nucleus (DCN) before entering the ventral division. Although data were recorded from units in the DCN, as judged by their stereotaxic position and physiological response type (Stabler et al., 1996), we have excluded them from the present data set.
The compound action potential (CAP) was monitored with the use of a silver-coated wire placed on the round window of the cochlea. The signal was filtered and amplified (10,000×). The CAP threshold was determined visually (10 msec tone pip, 1 msec rise–fall time, 10 sec⁻¹) for selected frequencies at intervals during the experiment. If thresholds had deteriorated by >10 dB and were not recoverable (for example, by removal of fluid from the bulla), the animal was killed by an anesthetic overdose of sodium pentobarbital (given intraperitoneally).
Complex stimuli. The stimuli were similar to the ones used in psychophysical studies (Grose and Hall, 1989; Moore et al., 1990; Gralla, 1991; Delahaye, 1999). The on-frequency component (OFC) masker was a pure tone, 100% sinusoidally amplitude-modulated (SAM) at a rate of 10 Hz. The carrier frequency was chosen to be equal to the best frequency (BF) of each unit. Five modulation cycles were presented, giving a 500 msec total duration. The level of the OFC masker before modulation was set between 30 and 40 dB above the pure tone threshold of the unit. The signal consisted of three successive 50 msec tone pips presented in the last three dips of the OFC modulation. The first OFC dip was left without a signal to facilitate the visual interpretation of the physiological data. The tone pips were added in phase to the OFC, thus always provoking an increase in amplitude. They had a 20 msec, cos² rise–fall time. The signal level was varied across a broad range. Signal level is reported here as a signal-to-component ratio (S/C), defined as the ratio of the signal maximum amplitude to the amplitude of the OFC before modulation, expressed in decibels. Levels were varied from no signal up to +20 dB S/C. The recordings involving only the signal and the OFC are referred to as the “reference” condition (Fig. 1, RF).
In the comodulated (CM) condition, FCs were added to the OFC plus signal compound. The FCs were SAM pure tones modulated in phase with the OFC, with the same level as the OFC. The number and frequency spacing of the FCs were chosen according to the unit BF. For medium BFs (between ∼0.6 and 2 kHz), three FCs above and three FCs below the OFC were used, as in the psychophysical studies (Delahaye, 1999). A linear spacing of 100 or 200 Hz was used between components. One or two gaps were left between the OFC and the first proximal FCs, i.e., the frequency distance between the OFC and the nearest FCs was, respectively, twice or three times the spacing between FCs (Fig. 1, CM). For lower best frequencies, the FCs below the signal frequency that would have had a frequency <100 Hz were omitted, and some were replaced by additional FCs above the OFC. For higher BFs, a logarithmic spacing between FCs was used to compensate for the broadening of peripheral auditory filters. The spacing was 0.25 octave, with the distance between the OFC and the proximal FCs equal to 0.5 octave (one gap).
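The placement rule for the flanking components can be illustrated with a short sketch. The function names are hypothetical, and the use of three FCs per side in the logarithmic case is an assumption (the text states the count only for medium BFs). The sketch does not handle the low-BF case, in which components below 100 Hz are dropped or moved above the OFC.

```python
def flanker_frequencies(bf_hz, spacing_hz, gaps=1, n_per_side=3):
    """Linear-spacing FC frequencies (Hz) for a medium-BF unit.

    With `gaps` empty positions next to the OFC, the nearest FC sits
    (gaps + 1) * spacing_hz away from the best frequency.
    """
    offset = (gaps + 1) * spacing_hz
    below = [bf_hz - offset - k * spacing_hz for k in range(n_per_side)]
    above = [bf_hz + offset + k * spacing_hz for k in range(n_per_side)]
    return sorted(below) + above


def log_flanker_frequencies(bf_hz, spacing_oct=0.25, first_oct=0.5,
                            n_per_side=3):
    """Octave-spaced FC frequencies (Hz) for a high-BF unit.

    n_per_side=3 is an assumption; the text gives only the spacing.
    """
    steps = [first_oct + k * spacing_oct for k in range(n_per_side)]
    below = [bf_hz * 2.0 ** (-s) for s in steps]
    above = [bf_hz * 2.0 ** s for s in steps]
    return sorted(below) + above


# Unit with BF = 1.1 kHz, 200 Hz spacing, one gap.
print(flanker_frequencies(1100, 200, gaps=1))
```

For a 1.1 kHz unit with 200 Hz spacing and one gap, this rule yields the flanker set reported for the transient chopper unit in the Results (0.3, 0.5, 0.7, 1.5, 1.7, and 1.9 kHz).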
In the third, codeviant (CD) condition, the number and position of FCs were identical to the comodulated condition, but they were amplitude-modulated 180° out of phase with the OFC (Fig. 1, CD). This condition yields higher psychophysical thresholds in humans than the reference condition (+10 dB), presumably because of across-channel masking if the spacing between bands is wide enough (Moore et al., 1990; Delahaye, 1999).
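The three stimulus conditions can be synthesized along the following lines. This is a minimal sketch, not the software used in the experiments. The sampling rate, the starting phase of the envelope (chosen so that dips fall at 100, 200, 300, and 400 msec), and the centering of each tone pip on a dip are assumptions; the function names are hypothetical.

```python
import numpy as np

FS = 50_000   # sampling rate in Hz (assumed; not stated in the text)
FM = 10.0     # modulation rate in Hz
DUR = 0.5     # five modulation cycles, in seconds


def sam_tone(freq_hz, t, antiphase=False):
    """100% SAM tone with unit amplitude before modulation.

    The in-phase envelope is 1 - cos(2*pi*FM*t), so dips occur at
    multiples of 100 msec; the antiphase envelope peaks at those times.
    """
    sign = 1.0 if antiphase else -1.0
    envelope = 1.0 + sign * np.cos(2 * np.pi * FM * t)
    return envelope * np.sin(2 * np.pi * freq_hz * t)


def tone_pip_gate(t, centre_s, dur_s=0.050, ramp_s=0.020):
    """cos^2-ramped gate of total duration dur_s centred on centre_s."""
    x = t - (centre_s - dur_s / 2)
    gate = np.zeros_like(t)
    rise = (x >= 0) & (x < ramp_s)
    flat = (x >= ramp_s) & (x <= dur_s - ramp_s)
    fall = (x > dur_s - ramp_s) & (x <= dur_s)
    gate[rise] = np.sin(0.5 * np.pi * x[rise] / ramp_s) ** 2
    gate[flat] = 1.0
    gate[fall] = np.cos(0.5 * np.pi * (x[fall] - (dur_s - ramp_s)) / ramp_s) ** 2
    return gate


def make_stimulus(bf_hz, sc_db=None, fc_hz=(), condition="RF"):
    """Return (t, waveform) for the RF, CM, or CD condition.

    sc_db is the signal-to-component ratio in dB; None means no signal.
    """
    t = np.arange(int(DUR * FS)) / FS
    x = sam_tone(bf_hz, t)                      # on-frequency component
    if sc_db is not None:
        amp = 10.0 ** (sc_db / 20.0)            # re. OFC amplitude before modulation
        gate = sum(tone_pip_gate(t, c) for c in (0.2, 0.3, 0.4))
        x = x + amp * gate * np.sin(2 * np.pi * bf_hz * t)  # in phase with OFC
    if condition in ("CM", "CD"):
        for f in fc_hz:
            x = x + sam_tone(f, t, antiphase=(condition == "CD"))
    return t, x


t, x_cm = make_stimulus(1100.0, sc_db=0.0,
                        fc_hz=(300, 500, 700, 1500, 1700, 1900),
                        condition="CM")
```

Because the tone pips share the carrier phase of the OFC, adding the signal can only increase the amplitude in the dips, as described above.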
After digital-to-analog conversion, the stimuli were low-pass filtered at the Nyquist frequency (Stanford Research Systems SR640) and attenuated (Tucker Davis Technology PA4). The stimuli were equalized (phonics graphic equalizer, model EQ 3600; Apple Sound) to compensate for the speaker and coupler frequency response before being fed into a Rotel RB971 power amplifier and a programmable end attenuator (0–75 dB in 5 dB steps). The signal was presented over a speaker (Radio Shack tweeter assembled by Mike Ravicz, Massachusetts Institute of Technology, Cambridge, MA) mounted in a coupler designed for the ear of a guinea pig. The stimuli were acoustically monitored with a Bruel & Kjaer 4134 microphone attached to a calibrated 1 mm probe tube.
Analyses. Recordings were made using tungsten-in-glass microelectrodes (Merrill and Ainsworth, 1972). Electrodes were advanced by an electronic microdrive (650 W; David Kopf Instruments, Tujunga, CA) through the intact cerebellum in the sagittal plane at an angle of 45°. A wideband noise stimulus was used to locate the surface of the cochlear nucleus and to search for single units.
After isolation of a single unit, estimates of BF and threshold were obtained using audiovisual criteria. The spontaneous discharge was measured over a 10 sec period. Single units were classified by their peristimulus time histogram (PSTH) shape in response to suprathreshold BF tone bursts, their interspike interval, and discharge regularity. We used the coefficient of variation (CV) of the discharge regularity, as defined by Young et al. (1988), to classify a unit as primary-like (CV > 0.5), sustained chopper (CV < 0.35), or transient chopper (CV > 0.35). To identify a unit as an onset unit we have used the classification scheme of Winter and Palmer (1995). PSTHs were generated in response to 250 short tone bursts (50 msec) at the BF of the unit. Rise–fall time was 1 msec (cos² gate), and the repetition rate was 4 sec⁻¹. Spikes were timed with 1 μsec resolution (TDT ET1), and typically sound levels of 20 and 40 or 50 dB suprathreshold were used.
Modeling. The computational model was assembled from existing modules that have been published and evaluated elsewhere (Meddis et al., 1990; Hewitt and Meddis, 1993). The input to the system is a time-varying waveform that represents the acoustic stimulus. This is processed by a bank of linear, gammatone, bandpass filters that represent the frequency-selective response of the basilar membrane. The filterbank consists of 10 channels equally spaced on a log scale covering an interval from two octaves below to one octave above BF. All filters <1 kHz have a bandwidth of 200 Hz, whereas those above have a bandwidth of BF/5. The filters were implemented as a fourth-order cascade of first-order gammatone filters evaluated as digital IIR filters.
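The channel layout of the filterbank can be sketched as follows. The sketch computes only center frequencies and bandwidths, not the gammatone filters themselves. Two points are assumptions: the end points of the frequency range are taken to be included, and the "BF/5" bandwidth is read as one-fifth of each filter's own center frequency. The function name is hypothetical.

```python
import numpy as np


def filterbank_channels(bf_hz, n_channels=10):
    """Return (center frequencies, bandwidths) in Hz for the model filterbank.

    Channels are log-spaced from two octaves below to one octave above BF,
    end points included (assumed). Filters below 1 kHz get a 200 Hz
    bandwidth; the others get center frequency / 5 (assumed reading).
    """
    cfs = bf_hz * 2.0 ** np.linspace(-2.0, 1.0, n_channels)
    bws = np.where(cfs < 1000.0, 200.0, cfs / 5.0)
    return cfs, bws


cfs, bws = filterbank_channels(1100.0)
for cf, bw in zip(cfs, bws):
    print(f"{cf:8.1f} Hz   bandwidth {bw:6.1f} Hz")
```

With 10 channels spanning three octaves under these assumptions, the spacing is one-third octave, so one channel (the seventh) falls at BF.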
The output of each filter is passed to a model of a single inner hair cell (IHC) and IHC-auditory nerve (AN) synaptic response representing all IHCs in that channel (Meddis et al., 1990). This produces a stream of values representing the probability of an action potential in any AN fiber innervating the hair cell. A random number generator is used to convert the probability to the number of fibers firing in that epoch. This AN activity is used as input to the computational neurons. Each channel feeds 20 different fibers to its target neurons.
Two populations of neurons were modeled. The first population consists of 50 neurons, each with a wide receptive field [wideband inhibitor (WBI)]. The second population consists of 50 neurons with a narrow receptive field [narrowband (NB)]. All neurons have the same BF that is equal to the target signal frequency. The NB neurons receive input only from AN fibers in the BF channel. The WBI neurons receive equally weighted input from all AN fibers in all 10 channels. This is consistent with the narrow and broad receptive fields observed in the guinea pig for chop-T or onset units, respectively, as published elsewhere (Winter and Palmer, 1990). Each AN spike is represented as a current pulse one epoch (1/10,000 sec) in width. The pulses are low-pass filtered (first-order IIR filter) to simulate dendritic effects. The time constant of the NB unit is set to 5 msec, and that of the WBI unit is set to 1 msec. The height of the current pulse is 3 nA for inputs to the NB unit and 0.3 nA for inputs to the WBI unit. The NB neurons also receive inhibitory input from the WBI neurons: WBI unit spikes contribute a −1 nA current pulse to the operation of NB units. A 2 msec synaptic delay is introduced in the NB–WBI pathway. The individual neurons are modeled using point neurons (MacGregor, 1987) whose parameters are given in Table 1.
Table 1.
Parameter | Symbol | NB | WBI |
---|---|---|---|
Resting potential (mV) | E0 | −60 | −60 |
Membrane time constant (msec) | τm | 2 | 1 |
Membrane resistance (MΩ) | Ri | 33 | 33 |
Potassium equilibrium (mV) | Ek | −10 | −10 |
Potassium boost (nS) | B | 20 | 40 |
Potassium time constant (msec) | τGk | 2.5 | 1 |
Threshold resting (mV) | Th0 | 5.3 | 10 |
Threshold boost (mV) | C | 0 | 10 |
Threshold time constant (msec) | τTh | 20 | 11 |
The model was implemented as a Visual Basic for Applications program attached to a Microsoft Excel spreadsheet. It was evaluated at a rate of 10 kHz. Stimuli were chosen to replicate the conditions used in the experiment for unit 250010, shown in Figures 2 and 6A.
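The input stage of an NB neuron, as described in the Modeling section, can be sketched in a few lines. This is an illustration in Python rather than the original Visual Basic program, and it stops short of the point-neuron equations. Several details are assumptions: the low-pass filter has unity DC gain, the inhibitory pulses are delayed but not filtered, and the function names are hypothetical.

```python
import numpy as np

FS = 10_000            # model update rate in Hz (one epoch = 0.1 msec)
DT_MS = 1000.0 / FS


def lowpass(x, tau_ms):
    """First-order IIR low-pass filter (unity DC gain assumed)."""
    alpha = 1.0 - np.exp(-DT_MS / tau_ms)
    y = np.zeros(len(x))
    state = 0.0
    for n, xn in enumerate(x):
        state += alpha * (xn - state)
        y[n] = state
    return y


def nb_input_current(an_spike_counts, wbi_spikes, delay_ms=2.0):
    """Net input current (nA) to an NB neuron, one value per epoch.

    an_spike_counts: number of BF-channel AN spikes in each epoch.
    wbi_spikes: 0/1 spike train of the wideband inhibitor, same length.
    """
    # Excitation: 3 nA pulses, one epoch wide, smoothed with tau = 5 msec.
    exc = lowpass(3.0 * np.asarray(an_spike_counts, float), tau_ms=5.0)
    # Inhibition: -1 nA pulses arriving after the 2 msec synaptic delay
    # (left unfiltered here, which is an assumption).
    delay = int(round(delay_ms / DT_MS))
    wbi = np.asarray(wbi_spikes, float)
    inh = np.zeros(len(exc))
    inh[delay:] = -1.0 * wbi[:len(wbi) - delay]
    return exc + inh


an = np.zeros(100)
an[0] = 1                   # one AN spike in the first epoch
wbi = np.zeros(100)
wbi[10] = 1                 # one WBI spike 1 msec later
current = nb_input_current(an, wbi)
```

In this example the inhibitory pulse reaches the NB neuron at epoch 30, that is, 2 msec after the WBI spike at epoch 10.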
RESULTS
Physiological responses of single units
The response of a transient chopper (chop-T) unit to the three stimulus conditions is shown in Figure 2. This unit was chosen because it displays many characteristics that are consistent with a physiological CMR. The BF of this unit was 1.1 kHz. The flanking components were set at 0.3, 0.5, 0.7, 1.5, 1.7, and 1.9 kHz for the CM and CD conditions (200 Hz spacing, one gap). The temporal position of the signal is indicated by the dotted lines on each plot. The number of spikes elicited by each stimulus condition is indicated by the number in the top left corner of each plot. The signal-to-component ratio is indicated on the right-hand side of the figure. When the signal is absent (bottom row), there is a clear representation of the on-frequency modulated masker in the reference condition (RF, 2059 spikes). In the CM condition, there are considerably fewer spikes (1279), although the modulation is more pronounced in the raw waveform (Fig. 1). In the CD condition, the number of spikes elicited by the on-frequency masker is intermediate between the RF and CM conditions. These are common findings in units that show a CMR (see below). When the signal is added in the RF condition, the gaps in the poststimulus time histogram (PSTH) begin to fill in with increasing signal level until there is little or no modulation remaining in the response at +10 dB S/C. This is in contrast to the response in the CM condition, in which the presence of the signal in the PSTH starts to dominate the response at low signal-to-component levels. Immediately after the response to the signal, a reduction in the response to the modulation is also present in the PSTH at high signal levels. The response to the signal is almost completely absent in the CD condition, up to the highest signal level.
A similar response can be observed in Figure 3 for a low-BF unit. The BF was 0.2 kHz, and this precluded the classification of this unit into the chopper or primary-like class. For this unit, the flanking components were all positioned above the BF at 0.6, 0.8, 1.0, 1.2, and 1.4 kHz. The reduction of the response to the modulation in the CM condition is even more pronounced than in the previous example.
A completely different type of response is seen in Figure 4, which shows the output of a unit classified as an onset unit with a BF of 0.8 kHz. The flanking components were positioned at 0.4, 0.5, 0.6, 1.0, 1.1, and 1.2 kHz. There were few spikes elicited in response to the RF condition when the signal was absent. In contrast to the previous two units, the addition of the flanking components in the CM condition increased the response to the OFC masker modulation. An increase in response of a similar magnitude is seen in the CD condition because of the anti-phasic modulation of FCs. Only at the highest signal level is there any indication of a response to the signal.
Statistical analyses
In this section we introduce a quantitative method of analyzing the PSTHs shown in Figures 2-4. The method is not intended to put forward hypotheses about the processing that takes place at higher stages of the auditory pathways, but rather to describe the information present in the discharge rates at the level of the ventral cochlear nucleus (VCN). Psychophysically, CMR is measured by a detection task in which a no-signal interval and a given signal-to-component interval are compared within each condition separately (RF, CM, or CD). Accordingly, signal detection theory was used to estimate the detectability of the signal from the physiological PSTHs. Each PSTH was divided into 20 msec bins, and the mean and SD of the number of spikes falling within each bin were calculated. The bins represent successive, independent looks at the signal. For each bin, d′ was calculated between the no-signal condition and the signal-to-component condition using Equation 1. The formula takes into account the fact that the variances between bins could be unequal (Macmillan and Creelman, 1991).
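Equation 1: $$d'_i = \frac{\bar{S}_i - \overline{NS}_i}{\sqrt{\tfrac{1}{2}\left(\sigma_{S,i}^{2} + \sigma_{NS,i}^{2}\right)}}$$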
with i the bin number, NS the number of spikes in the no-signal interval, S the number of spikes in the signal interval. An illustration of Equation 1 applied to the data of Figure 2 is shown in Figure 5. Large values of d′ are located where the response to the signal is greatest. To produce a single measure of detectability for each signal–no-signal pair, we then calculate the cumulative d′, which is defined in Equation 2. The cumulative d′ represents the optimal combination of all the independent looks.
Equation 2: $$d'_{\mathrm{cum}} = \sqrt{\sum_{i} \left(d'_i\right)^{2}}$$
This analysis method is similar to the one used by Mott et al. (1990) to estimate thresholds from auditory nerve recordings, except that they constrained the observation looks to be centered on the signal. The two methods would actually give essentially the same results (Fig. 5), but the method chosen here does not require a priori knowledge about the temporal position of the signal.
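The analysis described in this section can be sketched as follows. The exact forms used here are assumptions based on the verbal description: the per-bin d′ is taken as the difference of mean spike counts divided by the root mean of the two variances (the usual unequal-variance form), and the cumulative d′ as the root sum of squares over bins. Bins in which both variances are zero are assigned d′ = 0, which is also an assumption. The function names are hypothetical.

```python
import numpy as np


def bin_spikes(spike_times_ms, duration_ms=500.0, bin_ms=20.0):
    """Spike counts of one stimulus presentation in successive 20 msec bins."""
    edges = np.arange(0.0, duration_ms + bin_ms, bin_ms)
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    return counts


def dprime_per_bin(counts_s, counts_ns):
    """Per-bin d' between signal and no-signal presentations.

    Both inputs have shape (n_presentations, n_bins), with at least two
    presentations so that a sample variance exists.
    """
    counts_s = np.asarray(counts_s, float)
    counts_ns = np.asarray(counts_ns, float)
    mean_diff = counts_s.mean(axis=0) - counts_ns.mean(axis=0)
    rms_sd = np.sqrt(0.5 * (counts_s.var(axis=0, ddof=1)
                            + counts_ns.var(axis=0, ddof=1)))
    d = np.zeros_like(mean_diff)
    ok = rms_sd > 0                      # skip bins with no variability
    d[ok] = mean_diff[ok] / rms_sd[ok]
    return d


def cumulative_dprime(d):
    """Combine independent looks as the root sum of squared d' values."""
    return float(np.sqrt(np.sum(np.square(d))))


# Toy example: two presentations, two bins; only the first bin differs.
no_signal = [[0, 2], [2, 2]]
signal = [[4, 2], [6, 2]]
d = dprime_per_bin(signal, no_signal)
total = cumulative_dprime(d)
```

Because every bin contributes to the sum, the cumulative measure requires no a priori knowledge of where the signal falls in time.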
The results of this analysis are shown in Figure 6 for the three units shown in Figures 2-4. It can be seen in Figure 6A (chop-T unit) that the cumulative d′ is greater for the CM stimulus than it is for the RF or CD stimuli at S/C ratios above −5 dB. Equivalently, a particular d′ would be reached at lower signal-to-component ratios for the CM condition than for the RF or CD conditions. Because d′ represents signal detectability, this unit can be said to exhibit a physiological CMR. Note that the number of levels in this figure is greater than that shown in Figure 2. The reduced number of levels shown in Figure 2 was for clarity only.
A similar result is shown for the low-BF unit in Figure 6B. At all signal-to-component ratios the response to the CM condition is greater than the response to the other conditions. Again this unit could be exhibiting a CMR. In contrast, the response of the onset unit shown in Figure 6C shows that the detectability of the signal in the RF condition is greater than in the CM condition.
Population analyses
The d′ analysis was performed for all (n = 60) units for which a complete set of results was available. The presence of a CMR can be defined as a lower signal level in the CM condition compared with the RF condition, to reach a given d′ value that would correspond to threshold. This estimate has to be indirect with the present data because we used a constant stimulus method (sampling of fixed S/C levels) and not an adaptive procedure. Also, because of the variety in unit types, the individual units are not homogeneous in the range of d′ values they exhibit. The threshold difference was thus estimated by computing the level required for the CM condition to reach the d′ obtained at 0 dB S/C in the RF condition (linear interpolation between data points). Some units had to be discarded from the analysis (see Table 3) because the target d′ value was not intercepted in the CM condition. Results are presented in Table 2, broken down by unit type. Chop-T units display a consistent CMR (median and interquartile range above 0 dB); note, however, that not all chop-T units produced a CMR. Onset units consistently fail to show a CMR. The spread is larger for primary-like and low-BF units, with a small tendency to show positive CMR. A sign test of the median was performed to estimate whether the CMR values as measured by this method were significantly different from zero. Using a significance level of p < 0.05, only chop-T units reach significance (p < 0.039). The whole population just fails to show CMR (p < 0.070).
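The threshold-difference estimate described above can be illustrated with a short sketch. It assumes that the S/C levels are sorted in ascending order, that they bracket 0 dB, and that the first crossing of the target d′ (searching from the lowest level) is the one of interest; none of these details is specified in the text. The function name is hypothetical.

```python
import numpy as np


def cmr_estimate_db(levels_db, d_rf, d_cm):
    """CMR in dB: 0 dB minus the CM level that reaches the RF d' at 0 dB S/C.

    Returns None when the CM curve never intercepts the target d',
    in which case the unit would be discarded from the analysis.
    """
    levels = np.asarray(levels_db, float)
    d_rf = np.asarray(d_rf, float)
    d_cm = np.asarray(d_cm, float)
    target = np.interp(0.0, levels, d_rf)        # RF d' at 0 dB S/C
    for k in range(len(levels) - 1):
        lo, hi = d_cm[k], d_cm[k + 1]
        if lo != hi and (lo - target) * (hi - target) <= 0:
            # Linear interpolation between the two bracketing data points.
            frac = (target - lo) / (hi - lo)
            level_cm = levels[k] + frac * (levels[k + 1] - levels[k])
            return 0.0 - level_cm
    return None


levels = [-20, -10, 0, 10]
print(cmr_estimate_db(levels, d_rf=[0, 1, 2, 4], d_cm=[1, 3, 5, 6]))
```

A positive value means that the CM condition reaches the criterion at a lower signal level than the RF condition, which corresponds to a masking release.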
Table 3.
Unit type | Primary-like | Chop-T | Onset | Low-BF | Others |
---|---|---|---|---|---|
Total | 22 | 13 | 9 | 14 | 2 |
CMR | 9 | 7 | 1 | 4 | 0 |
Percentage | 41% | 54% | 11% | 29% | 0% |
Criterion for CMR: d′(CM) > d′(RF) > d′(CD) at 0 dB S/C and −10 dB S/C.
Table 2.
Unit type | Primary-like | Chop-T | Onset | Low-BF | All |
---|---|---|---|---|---|
Total | 17 | 10 | 7 | 12 | 49 |
Median (dB) | +2.4 | +3.2 | −2.3 | +1.2 | +1.4 |
Interquartile (dB) | [−1.5, +8.2] | [+0.5, +5.7] | [−2.9, −0.4] | [−2.3, +4.8] | [−2, +5.5] |
Another method to define CMR is as a detection advantage of the CM condition over the RF condition and as a detection impairment for the CD condition over the RF condition. A comparison of signal detectability at 0 dB S/C is presented in Figure 7, where the d′ of the CM and CD conditions are plotted relative to the d′ in the RF condition. Taken as a whole, the population of units shows a detection impairment for the CD condition. No clear trend is visible for the CM condition, which indicates that not all units in the VCN display a CMR-like behavior. When broken down by unit type, the analysis closely parallels the results found in Table 2: chop-T units show a detection advantage, onset units show a detection impairment, and only a small trend is present for the other classes of units. A sign test of the median was performed for this measure and again, only chop-T units reach significance for true CMR (CM–RF; p < 0.023). Note, however, that all units except those classified as onset show a highly significant masking release between the codeviant and comodulated cases (CM–CD; p ≤ 0.002). Onset units do not show such a masking release (CM–CD; p < 0.18), but our total population of units, taken together, does show a significant effect (p < 0.001). Such a CM–CD masking release has also been observed by Nieder and Klump (2001) in the auditory forebrain of the starling. However, they did not observe the across-frequency CM–RF masking release as demonstrated in this study.
To further summarize the results, a unit was said to exhibit CMR at a given signal level if (1) the d′ for the CM condition was higher than that for the RF condition and (2) the d′ for the RF condition was higher than that for the CD condition. We computed the number of units that passed the d′ conditions for both the −10 dB S/C and 0 dB S/C levels (four tests overall). Note that the unit shown in Figure 2 failed this last, conservative test, although we consider it to display a CMR-like behavior, for the reasons explained above. A summary of the analysis is provided in Table 3. Chop-T units are the most likely to show CMR, followed by primary-like and low-BF units. Onset units very rarely exhibit CMR. All but one of the units that exhibited CMR, as measured by this latter analysis, also showed at least a 10% decrease in spike count when the FCs were added (RF to CM comparison).
As the stimuli were changed to accommodate the BF of each unit, a summary of the spectral properties of the stimuli is shown. The frequency distance between the flanking components on either side of the signal was compared with the width of the auditory filter at the signal frequency, for each individual data point. Auditory filter width was estimated according to the equivalent rectangular bandwidth (ERB) provided by Evans (2001) and corresponds to the equation ERB(CF) = 0.29 × CF^0.56, where CF is in kilohertz. The quality factor Q10 dB was also estimated by the relationship Q10 dB(CF) = 1.8 × ERB(CF). As can be seen from Figure 8, all experiments were performed with a spectral gap larger than the auditory filter ERB. Most units that show a CMR according to Table 3 (solid symbols) were actually responding to stimuli with a gap greater than the auditory filter Q10 dB.
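As a worked illustration of this comparison, the sketch below evaluates the two relationships for the 1.1 kHz unit of Figure 2. Two readings are assumptions: the spectral gap is taken as the distance between the nearest flanking components on either side of the signal (0.7 and 1.5 kHz), and the Q10 dB relationship is treated as a bandwidth in kilohertz, as the comparison with the gap implies. The function names are hypothetical.

```python
def erb_khz(cf_khz):
    """Equivalent rectangular bandwidth in kHz, ERB(CF) = 0.29 * CF**0.56."""
    return 0.29 * cf_khz ** 0.56


def q10_width_khz(cf_khz):
    """Width given by the text's Q10 dB relationship, 1.8 * ERB(CF), in kHz."""
    return 1.8 * erb_khz(cf_khz)


bf = 1.1                      # BF of the chop-T unit, in kHz
gap = 1.5 - 0.7               # nearest FC above minus nearest FC below, in kHz
print(f"ERB       = {erb_khz(bf):.3f} kHz")
print(f"1.8 x ERB = {q10_width_khz(bf):.3f} kHz")
print(f"gap       = {gap:.3f} kHz")
```

Under these assumptions the ERB at 1.1 kHz is about 0.31 kHz and the wider estimate about 0.55 kHz, both smaller than the 0.8 kHz gap used for this unit.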
Hypothesized neural circuit
In this section of the results we propose a simple circuit within the VCN that is sufficient to encapsulate many of the observations that we have made regarding CMR. This circuit consists of two neuron types within the cochlear nucleus: a wideband inhibitor and a narrowband unit. The circuit is shown schematically in Figure 9. Both cell types receive excitatory input from type I auditory nerve fibers, the main difference between the unit types being the wide frequency range over which the wideband inhibitor is able to sum inputs. In contrast, the narrowband unit only receives input around its BF (1.1 kHz). The wideband inhibitor then synapses with the narrowband unit.
Such a circuit qualitatively explains the shape of the PSTHs observed in response to CMR stimuli. The wideband unit mainly responds to the modulation and increases its discharge rate when the FCs are added because they fall within its receptive field (Fig. 4). It provides fast-acting, short-duration inhibition to the narrow band unit, thus reducing the response to the modulation in the CM condition (Fig. 2). In the CD condition, the maximal inhibition coincides with the signal and thus suppresses its representation up to high signal-to-component ratio.
The circuit has been implemented as a computational neural model to quantitatively evaluate its predictions (see Materials and Methods for details). The results of the modeled narrowband unit in response to the same stimuli as used in the physiological recordings are shown in Figure 10. The format of Figure 10 is the same as that for Figure 2. The similarities between the model output and the response of the chop-T unit in Figure 2 are clear. In the CM condition (middle column) the response to the modulation is reduced, and the presence of the signal at high signal-to-component ratios is apparent in the PSTH. In both the model results and the experimental results the CD condition does not give a good representation of the signal in the PSTH. A d′ analysis has been performed on the simulated spike trains using the same method as for the physiological data. It is presented in Figure 11A. The simulated d′ reproduces the main features observed in the experimental data (Fig. 6A). Signal detectability is better in the CM condition, followed by RF and CD. The properties of the receptive fields of the neurons in the model were critical to the effect. When applied to the wideband inhibitor (Fig. 11B), the d′ analysis displayed an anti-CMR behavior, consistent with the onset response pattern (Fig. 6C). One way to estimate the influence of within-channel effects on the d′ analysis method is to disconnect the inhibitory pathway in the model. In this case (Fig. 11C), the responses to CM and RF were very similar, and no CMR was observed.
DISCUSSION
We have recorded responses of single units in the ventral cochlear nucleus of the anesthetized guinea pig to look for physiological correlates of comodulation masking release. Using a stimulus paradigm that is similar to several human psychophysical studies, we have shown that some single units classified as chop-T, primary-like, or low-BF may respond less to an on-frequency, modulated masker if comodulated flanking components are added in remote frequency regions. This demonstrates that across-frequency processing is already apparent at the level of the VCN. Signal detectability, as estimated by a d′ analysis, is improved in the comodulated case for some of these units. They may thus be said to exhibit a physiological CMR. Most units classified as onset failed to exhibit a CMR (eight of nine); however, they do show across-frequency processing in the sense that they display enhanced responses to broadband modulation. Analysis across the whole population of units from which we recorded does not show an average CMR, but this is in keeping with the variety of cell types found in the VCN (Lorente de Nó, 1981) and with the distinct signal processing roles hypothesized for distinct subpopulations of units.
Using a computational model, we have demonstrated that a simple neural circuit consisting of the inhibition of a narrowband unit by a wideband inhibitor was able to replicate many of our findings. The anatomical basis of the model is supported by the observation of Ferragamo et al. (1998), who found that stellate-D cells provide inhibitory input to stellate-T cells in brain slices of the mouse cochlear nucleus. Additional support for this hypothesis comes from labeling of an onset unit in the guinea pig cochlear nucleus that was shown to have extensive axonal arborizations throughout the ventral and dorsal cochlear nucleus (Arnott et al., 2001). It has been argued that the stellate-D cells in the mouse cochlear nucleus correspond to giant multipolar cells, as recorded in the cat (Oertel et al., 1990). Like the giant multipolar cells, stellate-D cells have a dorsally projecting axon and are thought to be inhibitory (Smith and Rhode, 1989). Previous studies have implicated stellate-D units as wideband inhibitors, and several authors have suggested that these cells may play a role in shaping the responses of type IV cells in the dorsal cochlear nucleus (Nelken and Young, 1994; Winter and Palmer, 1995). If stellate-D cells and giant multipolar cells are indeed one and the same, then one would expect them to give an onset-chopper (On-C) type of PSTH (Smith and Rhode, 1989); however, it is currently unresolved as to whether the onset-chopper response is the only response type from these cells. Several authors have failed to draw a clear distinction between On-C and onset with a low level of sustained activity (On-L) response types (Godfrey et al., 1975; Jiang et al., 1996; Evans and Zhao, 1998), and it is possible that the On-C and On-L response types are in fact a continuum of response, both from the giant multipolar cell type.
Stellate-T cells correspond to multipolar cells in the VCN (Oertel et al., 1990), and both sustained chopper (chop-S) and transient chopper PSTH types have been associated with this cell type (Rhode et al., 1983; Smith and Rhode, 1989; Smith et al., 1993). We have not recorded from any units classified as chop-S in this study, partly because we were deliberately sampling from the rostral AVCN where chop-T units are more prevalent (at least in the guinea pig; I. M. Winter, unpublished observation). However, chop-T units are often characterized by non-monotonic input–output functions and are thus more likely to receive inhibitory input (Blackburn and Sachs, 1990, 1992; Winter and Palmer, 1990a). In this study we hypothesize that this inhibition, provided by wideband units, is involved in CMR. Non-monotonic input–output functions are less prevalent in chop-S units, which are often characterized by sigmoidally saturating input–output functions (Blackburn and Sachs, 1989, 1990; Winter and Palmer, 1990a).
There are other possible interpretations of the results presented in this paper. The reduction of the response to the modulation may have been the result of two-tone suppression at the level of the basilar membrane. In psychophysical studies, this explanation has been described as unlikely because of the symmetry of the CMR effect (Hall et al., 1984). Indeed, for several units we compared the addition of flanking components above or below BF and observed little difference between the two conditions. However, we feel it is premature at present to dismiss completely a role for two-tone suppression.
An additional factor in the CMR effect could be a release from forward masking. It has been suggested that the increased recovery from previous stimulation that is observed for many unit types in the VCN is attributable to the recurrent inhibition between the superior olivary complex and the cochlear nucleus (Shore et al., 1991). If the recurrent inhibition was itself inhibited by a broadband unit responding to the modulation, then a release from masking could be observed (Delahaye, 1999). McFadden and Wright (1987) have reported a perceptual CMR-like effect in a forward masking situation. This explanation could be more appropriate for the responses observed from primary-like units, where inhibition from a wideband inhibitor has yet to be demonstrated. Note that in the guinea pig, Winter and Palmer (1990) reported that as many as 25% of prepotential for primary-like units were characterized with inhibition.
Comparison with human psychophysics
The physiological CMR, as estimated by the d′ analysis, is in broad agreement with psychophysical data obtained with similar stimuli (Moore et al., 1990; Delahaye, 1999). The CM advantage is observed at signal-to-component levels corresponding to the psychophysical signal threshold (−15 dB S/C for the RF condition, for an OFC at 50 dB SL) (Delahaye, 1999). However, we have not attempted to make a quantitative correlation between our results and the perceptual ones for several reasons. First, our data were obtained by repeated measurements on single neurons, whereas perceptual performance is likely to be based on a population analysis. In combining the information of neuron ensembles, the determinant of CMR might be either the neuron or neurons providing the best signal detectability (the lower envelope principle) or some kind of gross average (pooling) (Parker and Newsome, 1998). Second, there might be interspecies differences in the magnitude of CMR, i.e., a difference between the amount of CMR in humans and guinea pigs. Even in studies using similar paradigms in the same species, a difference between the psychophysical and average physiological masking release is found (Langemann and Klump, 2001; Nieder and Klump, 2001). Third, the present recordings have been made at an early processing level, and the d′ values we obtained are always high. It should be noted, however, that these d′ values represent the best theoretical performance at this stage and do not take into account higher stages at which information may be processed suboptimally. In the d′ statistic, any positive or negative difference between discharge rates improves detection, whereas only a subset of cues might be effective to perceptually detect a signal.
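The difference between the lower envelope principle and pooling can be illustrated with a short sketch. It is not an analysis performed in this study: the d′ values are hypothetical, the criterion of d′ = 1 is an assumption, and the root-sum-of-squares rule additionally assumes statistically independent neurons.

```python
import math


def lower_envelope(dprimes):
    """Population performance set by the single most sensitive neuron."""
    return max(dprimes)


def pooled_mean(dprimes):
    """A gross average across the recorded neurons."""
    return sum(dprimes) / len(dprimes)


def pooled_independent(dprimes):
    """Root-sum-of-squares combination, assuming independent neurons."""
    return math.sqrt(sum(d * d for d in dprimes))


def population_threshold(levels, dprime_by_unit, combine, criterion=1.0):
    """Lowest level at which the combined d' reaches the criterion.

    `levels` must be in ascending order; `dprime_by_unit[u][i]` is the
    d' of unit u at `levels[i]`.
    """
    for i, level in enumerate(levels):
        if combine([unit[i] for unit in dprime_by_unit]) >= criterion:
            return level
    return None


# Hypothetical d' values for three units at four signal levels (dB S/C).
levels = [-25, -20, -15, -10]
units = [
    [0.2, 0.6, 1.2, 2.0],  # a sensitive unit
    [0.1, 0.2, 0.4, 0.9],
    [0.0, 0.1, 0.3, 0.5],  # an insensitive unit
]
for rule in (lower_envelope, pooled_mean, pooled_independent):
    print(rule.__name__, population_threshold(levels, units, rule))
```

With these illustrative numbers the lower envelope rule reaches criterion at a lower level than the plain average, which is one reason why single-unit thresholds and perceptual thresholds need not agree quantitatively.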
The simple neural circuit proposed in Figure 9 would be consistent, at least qualitatively, with many psychophysical observations on CMR. Such a circuit would yield similar enhancement for both band-widening and band-combining experiments (Hall et al., 1984). Although the band-widening paradigm probably relies, in part, on within-channel cues (Carlyon et al., 1989; Verhey et al., 1999), the across-frequency component of CMR in band-combining experiments is substantial (∼10 dB) (Cohen and Schubert, 1987; Grose and Hall, 1989; Moore et al., 1990), it persists over a 3 octave frequency separation range (Cohen, 1991), and it cannot be predicted by single-channel models (Verhey et al., 1999). The circuit could provide a basis for such an across-frequency component. The circuit also suggests a unified explanation for both CMR and across-channel masking (ACM) observed in CD conditions (Moore et al., 1990) because inhibition occurs on a moment-to-moment basis and thus depends on the phase of the FCs. Grose and Hall (1989) and Moore et al. (1990) have shown, respectively, that CMR increases with modulation depth and that ACM requires modulation. In our circuit, the wideband inhibitor crucial to the CMR and ACM effects is an onset-type unit that would respond well to modulated sounds, but not to steady-state ones. Hall et al. (1990) have shown that CD components proximal to the signal could disrupt CMR; it is likely that they would also disrupt the onset envelope-following response. CMR can also be obtained when using dichotic presentation (Schooneveldt and Moore, 1987), but this does not preclude a role for the VCN, because it has been suggested (Joris and Smith, 1998) that the units identified as wideband inhibitors may project to the contralateral cochlear nucleus.
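The phase dependence of the inhibition can be illustrated with a toy rate sketch. This is not the computational model used in this study: it replaces spiking units with instantaneous rates, uses raised-cosine envelopes, and every parameter (inhibitory gain, inhibitor threshold, number of flanking components, signal amplitude) is hypothetical. A narrowband unit is excited by the on-frequency channel, and a wideband unit sums all channels and subtracts from the narrowband response whenever its summed drive exceeds a threshold.

```python
import math


def narrowband_response(signal_amp, n_flankers, flanker_phase,
                        gain=0.3, theta=1.5, n_steps=64):
    """Rate of the narrowband unit over one modulation period (toy model)."""
    rates = []
    for k in range(n_steps):
        phase = 2.0 * math.pi * k / n_steps
        masker = 0.5 * (1.0 + math.cos(phase))                # OFC envelope, peak at phase 0
        signal = signal_amp * 0.5 * (1.0 - math.cos(phase))   # signal placed in the masker dips
        flanker = 0.5 * (1.0 + math.cos(phase + flanker_phase))
        on_frequency = masker + signal
        # The wideband unit sums the on-frequency channel and all flanking channels.
        wideband_drive = on_frequency + n_flankers * flanker
        inhibition = gain * max(0.0, wideband_drive - theta)
        rates.append(max(0.0, on_frequency - inhibition))
    return rates


def signal_salience(n_flankers, flanker_phase, signal_amp=0.3):
    """Signal-evoked change in rate relative to the masker-alone response."""
    with_signal = narrowband_response(signal_amp, n_flankers, flanker_phase)
    masker_only = narrowband_response(0.0, n_flankers, flanker_phase)
    increment = sum(abs(a - b) for a, b in zip(with_signal, masker_only))
    return increment / (sum(masker_only) + 1e-9)


reference = signal_salience(0, 0.0)         # RF: no flanking components
comodulated = signal_salience(6, 0.0)       # CM: flankers in phase with the OFC
codeviant = signal_salience(6, math.pi)     # CD: flankers out of phase with the OFC
print(comodulated > reference > codeviant)
```

In this sketch the comodulated flankers drive inhibition at the masker peaks and spare the dips where the signal sits, whereas the codeviant flankers drive inhibition in the dips, so the same mechanism yields a CMR-like advantage in one case and an ACM-like deficit in the other.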
In summary, our data support a possible physiological implementation for an equalization–cancellation model of CMR: peripheral compression and the properties of the onset unit provide equalization, and inhibitory projections provide cancellation.
Finally, it should be noted that we do not suggest that CMR is attributable entirely to the VCN circuit proposed above. However, the circuit proposed here provides a simple solution by which early across-frequency processing could be achieved within the auditory system in a way that is beneficial to the detection of signals embedded in broadband, comodulated noise.
Footnotes
This work was supported by the Wellcome Trust. D.P. is currently supported by the Centre National de la Recherche Scientifique. We thank Jesko Verhey and two anonymous reviewers for helpful comments on this manuscript.
Correspondence should be addressed to Daniel Pressnitzer, Institut de Recherche et Coordination Acoustique/Musique–Centre National de la Recherche Scientifique, Unité Mixte Recherche 9912, 1 place Stravinsky, 75004 Paris, France. E-mail: Daniel.Pressnitzer@ircam.fr.
REFERENCES
- 1.Arnott RH, Wallace MN, Palmer AR. Innervation of the ventral and dorsal cochlear nuclei by an onset cell in the anteroventral cochlear nucleus. Br J Audiol. 2001;35:121.
- 2.Blackburn CC, Sachs MB. Classification of unit types in the anteroventral cochlear nucleus: PST histograms and regularity analysis. J Neurophysiol. 1989;62:1303–1329. doi: 10.1152/jn.1989.62.6.1303.
- 3.Blackburn CC, Sachs MB. The representations of the steady-state vowel /e/ in the discharge patterns of cat anteroventral cochlear nucleus neurons. J Neurophysiol. 1990;63:1191–1212. doi: 10.1152/jn.1990.63.5.1191.
- 4.Blackburn CC, Sachs MB. Effects of off-BF tones on responses of chopper units in ventral cochlear nucleus. I. Regularity and temporal adaptation patterns. J Neurophysiol. 1992;68:124–143. doi: 10.1152/jn.1992.68.1.124.
- 5.Buus S. Release from masking caused by envelope fluctuations. J Acoust Soc Am. 1985;78:1958–1965. doi: 10.1121/1.392652.
- 6.Carlyon RP, Buus S, Florentine M. Comodulation masking release for three types of modulator as a function of modulation rate. Hear Res. 1989;42:37–46. doi: 10.1016/0378-5955(89)90116-0.
- 7.Cohen MF. Comodulation masking release over a three octave range. J Acoust Soc Am. 1991;90:1381–1384. doi: 10.1121/1.401929.
- 8.Cohen MF, Schubert ED. Influence of place synchrony on the detection of a sinusoid. J Acoust Soc Am. 1987;81:452–458. doi: 10.1121/1.394910.
- 9.Delahaye R. PhD thesis. University of Essex; 1999. Across-channel effects on masked signal thresholds.
- 10.Evans EF. Latest comparisons between physiological and behavioural frequency selectivity. In: Breebaart DJ, Houtsma AJM, Kohlrausch A, Prijs VF, Schoonhoven R, editors. Physiological and psychophysical bases of auditory function. Shaker; Maastricht, The Netherlands: 2001. pp. 382–387.
- 11.Evans EF, Zhao W. Proceedings of the Twenty-First Midwinter Research Meeting of the Association for Research in Otolaryngology. St. Petersburg, FL, February; 1998. Integration and coincidence mechanisms in onset units in guinea pig ventral cochlear nucleus. p. 115.
- 12.Fantini DA, Moore BCJ, Schooneveldt GP. Comodulation masking release as a function of type of signal, gated or continuous masking, monaural or dichotic presentation of flanking bands, and center frequency. J Acoust Soc Am. 1993;93:2106–2115. doi: 10.1121/1.406697.
- 13.Ferragamo MJ, Golding NL, Oertel D. Synaptic inputs to stellate cells in the ventral cochlear nucleus. J Neurophysiol. 1998;79:51–63. doi: 10.1152/jn.1998.79.1.51.
- 14.Godfrey DA, Kiang NYS, Norris BE. Single unit activity in the posteroventral cochlear nucleus of the cat. J Comp Neurol. 1975;162:247–268. doi: 10.1002/cne.901620206.
- 15.Gralla G. PhD dissertation. Technical University Munich; 1991. Wahrnehmungskriterien bei Mithörschwellenmessungen und deren Simulation in Rechnermodellen.
- 16.Grose JH, Hall JW. Comodulation masking release using SAM tonal complex maskers: effects of modulation depth and signal position. J Acoust Soc Am. 1989;85:1276–1284. doi: 10.1121/1.397458.
- 17.Grose JH, Hall JW. Comodulation masking release: Is comodulation sufficient? J Acoust Soc Am. 1993;93:2896–2902. doi: 10.1121/1.405809.
- 18.Hall JW, Grose JH. Comodulation masking release: Evidence for multiple cues. J Acoust Soc Am. 1988;84:1669–1675. doi: 10.1121/1.397182.
- 19.Hall JW, Haggard MP, Fernandes MA. Detection in noise by spectro-temporal pattern analysis. J Acoust Soc Am. 1984;76:50–56. doi: 10.1121/1.391005.
- 20.Hall JW, Grose JH, Haggard MP. Effects of flanking band proximity, number and modulation pattern on comodulation masking release. J Acoust Soc Am. 1990;87:269–283. doi: 10.1121/1.399294.
- 21.Hall JW, Grose JH, Mendoza L. Across-channel processes in masking. In: Moore BCJ, editor. Hearing. Academic; San Diego: 1995. pp. 243–266.
- 22.Hewitt MJ, Meddis R. Regularity of cochlear nucleus stellate cells: a computational modeling study. J Acoust Soc Am. 1993;93:3390–3399. doi: 10.1121/1.405694.
- 23.Jiang D, Palmer AR, Winter IM. The frequency extent of two tone facilitation in onset units in the ventral cochlear nucleus. J Neurophysiol. 1996;75:380–396. doi: 10.1152/jn.1996.75.1.380.
- 24.Joris PX, Smith PH. Temporal and binaural properties in dorsal cochlear nucleus and its output tract. J Neurosci. 1998;18:10157–10170. doi: 10.1523/JNEUROSCI.18-23-10157.1998.
- 25.Klump GM. Bird communication in the noisy world. In: Kroodsma DE, Miller EH, editors. Ecology and evolution of acoustic communication in birds. Comstock; Ithaca, NY: 1996. pp. 321–338.
- 26.Klump GM, Langemann U. Comodulation masking release in a songbird. Hear Res. 1995;87:157–164. doi: 10.1016/0378-5955(95)00087-k.
- 27.Klump GM, Kittel M, Wagner E. Abstracts of the Twenty-Fourth Midwinter Research Meeting of the Association for Research in Otolaryngology. St. Petersburg, FL, February; 2001. Comodulation masking release in the Mongolian gerbil. p. 84.
- 28.Langemann U, Klump GM. Signal detection in amplitude-modulated maskers. I. Behavioural auditory thresholds in a songbird. Eur J Neurosci. 2001;13:1025–1032. doi: 10.1046/j.0953-816x.2001.01464.x.
- 29.Lorente de Nó R. The primary acoustic nuclei. Raven; New York: 1981.
- 30.MacGregor RJ. Neural and brain modelling. Academic; San Diego: 1987.
- 31.Macmillan NA, Creelman CD. Detection theory: a user's guide. Cambridge UP; Cambridge, UK: 1991.
- 32.McFadden D, Wright BA. Comodulation masking release in a forward-masking paradigm. J Acoust Soc Am. 1987;82:1615–1620. doi: 10.1121/1.395152.
- 33.Meddis R, Hewitt MJ, Shackleton TM. Implementation details of a computational model of the inner hair cell/auditory nerve synapse. J Acoust Soc Am. 1990;87:1813–1818.
- 34.Merrill EG, Ainsworth A. Glass coated platinum tipped tungsten microelectrodes. Med Biol Eng. 1972;10:662–672. doi: 10.1007/BF02476084.
- 35.Moore BCJ, Glasberg BR, Schooneveldt GP. Across-channel masking and comodulation masking release. J Acoust Soc Am. 1990;87:1683–1694. doi: 10.1121/1.399416.
- 36.Mott JB, McDonald LP, Sinex DG. Neural correlates of psychophysical release from masking. J Acoust Soc Am. 1990;88:2682–2691. doi: 10.1121/1.399987.
- 37.Nelken I, Young ED. Two separate inhibitory mechanisms shape the responses of dorsal cochlear nucleus type IV units to narrowband and broadband noise. J Neurophysiol. 1994;71:2446–2462. doi: 10.1152/jn.1994.71.6.2446.
- 38.Nelken I, Rotman Y, Yosef OB. Responses of auditory-cortex neurons to structural features of natural sounds. Nature. 1999;397:154–157. doi: 10.1038/16456.
- 39.Nieder A, Klump GM. Signal detection in amplitude-modulated maskers. II. Processing in the songbird's auditory forebrain. Eur J Neurosci. 2001;13:1033–1044. doi: 10.1046/j.0953-816x.2001.01465.x.
- 40.Oertel D, Wu SH, Garb MW, Dizack C. Morphology and physiology of cells in slice preparations of the posteroventral cochlear nucleus of mice. J Comp Neurol. 1990;295:136–154. doi: 10.1002/cne.902950112.
- 41.Parker AJ, Newsome WT. Sense and the single neuron: probing the physiology of perception. Annu Rev Neurosci. 1998;21:227–277. doi: 10.1146/annurev.neuro.21.1.227.
- 42.Rhode WS, Smith PH, Oertel D. Physiological response properties of cells labelled intracellularly with horseradish peroxidase in the ventral cochlear nucleus. J Comp Neurol. 1983;213:426–447. doi: 10.1002/cne.902130407.
- 43.Richards DG, Wiley RH. Reverberations and amplitude fluctuations in the propagation of sound in a forest: implication for animal communication. Phys Rev Lett. 1980;73:814–817.
- 44.Schooneveldt GP, Moore BCJ. Comodulation masking release (CMR): effect of signal frequency, flanking band frequency, masker bandwidth, flanking-band level, and monotic versus dichotic presentation of the flanking band. J Acoust Soc Am. 1987;82:1944–1956. doi: 10.1121/1.395639.
- 45.Shore SE, Helfert RH, Bledsoe SC, Altschuler RA, Godfrey DA. Descending projections to the dorsal and ventral divisions of the cochlear nucleus in the guinea pig. Hear Res. 1991;52:255–268. doi: 10.1016/0378-5955(91)90205-n.
- 46.Smith PH, Rhode WS. Structural and functional properties distinguish two types of multipolar cells in the ventral cochlear nucleus. J Comp Neurol. 1989;282:595–616. doi: 10.1002/cne.902820410.
- 47.Smith PH, Joris PX, Banks MI, Yin TCT. Responses of cochlear nucleus cells and projections of their axons. In: Merchan MA, Juiz JM, Godfrey DA, Mugnaini E, editors. The mammalian cochlear nuclei: organization and function. Plenum; New York: 1993. pp. 349–360.
- 48.Stabler SE, Palmer AR, Winter IM. Temporal and mean rate discharge patterns of single units in the dorsal cochlear nucleus of the anaesthetised guinea pig. J Neurophysiol. 1996;76:1667–1688. doi: 10.1152/jn.1996.76.3.1667.
- 49.Verhey JL, Dau T, Kollmeier B. Within-channel cues in comodulation masking release (CMR): experiments and model predictions using a modulation filter bank model. J Acoust Soc Am. 1999;106:2733–2745. doi: 10.1121/1.428101.
- 50.Winter IM, Palmer AR. Responses of single units in the anteroventral cochlear nucleus of the guinea pig. Hear Res. 1990a;44:161–178. doi: 10.1016/0378-5955(90)90078-4.
- 51.Winter IM, Palmer AR. Temporal responses of primarylike cochlear nucleus units to the steady-state vowel /i/. J Acoust Soc Am. 1990b;88:1437–1441. doi: 10.1121/1.399720.
- 52.Winter IM, Palmer AR. Level dependence of cochlear nucleus onset unit responses and facilitation by second tones or broadband noise. J Neurophysiol. 1995;73:141–159. doi: 10.1152/jn.1995.73.1.141.
- 53.Young ED, Robert JM, Shofner WP. Regularity and latency of units in ventral cochlear nucleus: implications for unit classification and generation of response properties. J Neurophysiol. 1988;60:1–29. doi: 10.1152/jn.1988.60.1.1.