Representation of CV-sounds in cat primary auditory cortex: intensity dependence

doi:10.1016/S0167-6393(02)00096-1

Speech Communication

Volume 41, Issue 1, August 2003, Pages 93-106

https://doi.org/10.1016/S0167-6393(02)00096-1

Abstract

The level-dependent representation of simple speech sounds in cat primary auditory cortex (AI) is explored in naive cats and in animals that have been exposed to these sounds in behavioral detection and discrimination tasks. Population analyses of multiple unit responses in the form of post-stimulus time histograms (PSTHs), neurograms, and spatial distribution were made for synthetic consonant–vowel sounds across AI. The temporal profile of cortical responses was robust across neurons, characterized by brief phasic responses at the onset of consonantal burst and voicing. The spectral profile of the sounds, i.e., the formant structure, was only weakly expressed in the response magnitude across characteristic frequency. The spatial response distribution across AI was discontinuous, and consisted of several patches of activation. Intensity-dependence in the spatial activity distribution was more strongly expressed than in population PSTHs and neurograms. Differences attributable to behavioral training were observed for rate-encoding and temporal encoding of speech sounds.
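To illustrate the PSTH measure named in the abstract, here is a minimal Python sketch of how a post-stimulus time histogram can be computed from spike times. It is not the authors' analysis code: the function name `psth`, the 10 ms bin width, and the example spike times are all hypothetical, chosen only to show the binning and trial averaging.

```python
from typing import List, Sequence


def psth(trials: Sequence[Sequence[float]], duration_ms: float, bin_ms: float) -> List[float]:
    """Return the mean firing rate (spikes/s) in each time bin, averaged over trials.

    trials: one sequence of spike times per stimulus presentation,
            in ms relative to stimulus onset (hypothetical input format).
    """
    if not trials:
        raise ValueError("need at least one trial")
    n_trials = len(trials)
    n_bins = int(round(duration_ms / bin_ms))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            # Ignore spikes outside the analysis window.
            if 0.0 <= t < duration_ms:
                counts[min(int(t // bin_ms), n_bins - 1)] += 1
    # Convert summed counts to spikes per second per trial.
    return [c * 1000.0 / (bin_ms * n_trials) for c in counts]


# Hypothetical spike times (ms after stimulus onset) for three presentations.
example_trials = [[12.0, 14.5, 80.2], [11.3, 79.9], [13.1, 81.0, 82.4]]
rates = psth(example_trials, duration_ms=250.0, bin_ms=10.0)
```

A population PSTH in this sense would pool or average such histograms across recording sites. Arranging per-site histograms by characteristic frequency would give a neurogram-like display.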

Introduction

In the analysis of neural representations of speech-like sounds and animal vocalizations, the spectral and temporal domains of the acoustic signal have emerged as the basic stimulus features represented at various stages of the auditory nervous system. Spectral and temporal characteristics of vowel sounds are also manifest in the firing rate and temporal response pattern of cat auditory nerve fibers (e.g., Delgutte and Kiang, 1984; Sachs and Young, 1979; Sinex and Geisler, 1983), cochlear nucleus neurons of cats (e.g., Blackburn and Sachs, 1990; Wang and Sachs, 1994) and in the inferior colliculus (Chen et al., 1996). In the primary auditory cortex (AI) in marmoset monkeys, spectro-temporal characteristics of species-specific vocalizations are integrated temporally and spectrally and represented by the synchronization of neural activity from spatially dispersed cortical cell assemblies (Wang et al., 1995; Nagarajan et al., 2002).

In the study of communication calls, the behavioral relevance of the vocalization has been proposed as a factor that shapes cortical representations of stimulus features, because auditory environment and experience exert significant influences (Merzenich et al., 1988, 1990; Wang, 2000; Wang and Kadia, 2001). A perceptual feature of speech syllables, voice-onset time (VOT), can be represented by time-locked activity of neurons in AI of monkeys (Steinschneider et al., 1982, 1990, 1992, 1994, 2000) and cats (Eggermont, 1995). In these experimental studies, however, animals were not trained on or exposed to the speech stimuli, so the stimuli had no behavioral salience. Thus, the findings from such studies reflect the ability of subsets of AI neurons to respond to changes in the acoustics of the speech stimuli without specific adaptation of the neural circuitry to the sounds. Adding a behavioral component to physiological experiments that examine the neural representations of speech stimuli gives the sounds behavioral salience and may reveal specific adaptations to trained stimuli.

Therefore, the first objective of this study was to determine the distributed representation of speech syllables across many cells in AI, and the second was to explore how behavioral training modifies the distributed and cumulative cortical representation of speech sounds.

Section snippets

Speech stimuli

Four consonant–vowel (CV) stimuli, /be/, /pe/, /ke/, and /ko/, were synthesized using a Klatt-model speech synthesizer (SenSyn). Each CV stimulus was 250 ms in duration. The fundamental frequency declined linearly from 120 to 100 Hz. The beginning and endpoints of the three formant frequencies for each CV stimulus were defined by:

/be/: F1(350–550 Hz), F2(1400–1700 Hz), F3(2100–2500 Hz)
/pe/: F1(NV–550 Hz), F2(1400–1700 Hz), F3(2100–2500 Hz)

where NV is the absence of voicing; there was no F1 at
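The stimulus parameters stated above can be written down compactly. The following is a minimal Python sketch, not the SenSyn synthesizer: it encodes only the linear decline of the fundamental frequency and the formant start and end points given for /be/ and /pe/. The names `f0_hz` and `FORMANT_ENDPOINTS_HZ` are hypothetical, and `None` is used here as an assumed marker for NV (no voicing).

```python
from typing import Dict, Optional, Tuple


def f0_hz(t_ms: float, duration_ms: float = 250.0,
          start_hz: float = 120.0, end_hz: float = 100.0) -> float:
    """Fundamental frequency at time t_ms, declining linearly over the stimulus."""
    if not 0.0 <= t_ms <= duration_ms:
        raise ValueError("t_ms must lie within the stimulus duration")
    return start_hz + (end_hz - start_hz) * (t_ms / duration_ms)


# Formant (start, end) frequencies in Hz as listed in the text.
# None marks NV (absence of voicing) at the start of F1 for /pe/.
FORMANT_ENDPOINTS_HZ: Dict[str, Dict[str, Tuple[Optional[float], float]]] = {
    "/be/": {"F1": (350.0, 550.0), "F2": (1400.0, 1700.0), "F3": (2100.0, 2500.0)},
    "/pe/": {"F1": (None, 550.0), "F2": (1400.0, 1700.0), "F3": (2100.0, 2500.0)},
}
```

Only the end points of the formants are given in the text, so the sketch deliberately does not interpolate formant trajectories between them.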

Results

Electrophysiological data from seven adult cats were available. Data from three naive and four speech-trained cats were compared to examine whether the behavioral relevance of selected speech sounds affected relevant parts of their cortical representation. Recordings sampled the low-frequency region of AI evenly between 0.5 and 4 kHz. The number of recording locations per animal varied between 80 and 145. Here we report the level-dependent cortical responses to the CV sounds /be/ and /pe/,

Discussion

These experiments had two goals: to use the distributed nature of receptive field properties of AI neurons as a tool for studying speech syllable representation, and to assess the effect of behavioral training on such representation. The multi-unit mapping approach was necessary to acquire an adequate cell sample per animal, and it also allowed response strength to be compared relative to cell position in cortical space for each stimulus presentation condition. Satisfying this

Acknowledgements

We thank Dr. Ben Bonham for assistance during some experiments and Dr. Jeffery Winer for many comments on the manuscript. Supported by grants NINDS 34835, NIDCD 02260, NSF REC 97203398, the Coleman Fund and Hearing Research Inc.

References (37)

D.D Gehr et al. Neuronal responses in cat primary auditory cortex to natural and altered species-specific calls. Hearing Research (2000)
P Heil et al. Topographic representation of tone intensity along the iso-frequency axis of cat primary auditory cortex. Hearing Research (1994)
P.K Kuhl. Learning and representation in speech and language. Current Opinion in Neurobiology (1994)
D.P Phillips et al. Neurons in the cat’s primary auditory cortex distinguished by their responses to tones and wide-spectrum noise. Hearing Research (1985)
M Steinschneider et al. Speech evoked activity in the auditory radiations and cortex of the awake monkey. Brain Research (1982)
M Steinschneider et al. Tonotopic features of speech-evoked activity in primate auditory cortex. Brain Research (1990)
M Steinschneider et al. Speech-evoked activity in primary auditory cortex: effects of voice onset time. Electroencephalography and Clinical Neurophysiology (1994)
E Ahissar et al. Dependence of cortical plasticity on correlated activity of single neurons and on behavioral context. Science (1992)
C.C Blackburn et al. The representations of the steady-state vowel sound /e/ in the discharge patterns of cat anteroventral cochlear nucleus neurons. Journal of Neurophysiology (1990)
M Brosch et al. Time course of forward masking tuning curves in cat primary auditory cortex. Journal of Neurophysiology (1997)
M Brosch et al. Sequence selectivity of neurons in cat primary auditory cortex. Cerebral Cortex (2000)
J.F Brugge et al. Responses of neurons in auditory cortex of the macaque monkey to monaural and binaural stimulation. Journal of Neurophysiology (1973)
M.B Calford et al. Monaural inhibition in cat auditory cortex. Journal of Neurophysiology (1995)
G.D Chen et al. Responses of single neurons in the chinchilla inferior colliculus to consonant–vowel syllables differing in voice onset time. Auditory Neuroscience (1996)
B Delgutte et al. Speech coding in the auditory nerve: I. Vowel-like sounds. Journal of the Acoustical Society of America (1984)
J.J Eggermont. Representation of a voice onset time continuum in primary auditory cortex of the cat. Journal of the Acoustical Society of America (1995)
N Kowalski et al. Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. Journal of Neurophysiology (1996)
P.K Kuhl. Discrimination of speech by nonhuman animals: basic auditory sensitivities conducive to the perception of speech-sound categories. Journal of the Acoustical Society of America (1981)
