Elsevier

Speech Communication

Volume 41, Issue 1, August 2003, Pages 93-106

Representation of CV-sounds in cat primary auditory cortex: intensity dependence

https://doi.org/10.1016/S0167-6393(02)00096-1

Abstract

The level-dependent representation of simple speech sounds in cat primary auditory cortex (AI) is explored in naive cats and in animals that have been exposed to these sounds in behavioral detection and discrimination tasks. Population analyses of multiple unit responses in the form of post-stimulus time histograms (PSTHs), neurograms, and spatial distribution were made for synthetic consonant–vowel sounds across AI. The temporal profile of cortical responses was robust across neurons, characterized by brief phasic responses at the onset of consonantal burst and voicing. The spectral profile of the sounds, i.e., the formant structure, was only weakly expressed in the response magnitude across characteristic frequency. The spatial response distribution across AI was discontinuous, and consisted of several patches of activation. Intensity-dependence in the spatial activity distribution was more strongly expressed than in population PSTHs and neurograms. Differences attributable to behavioral training were observed for rate-encoding and temporal encoding of speech sounds.

Introduction

In the analysis of neural representations of speech-like sounds and animal vocalizations, the spectral and temporal domains of the acoustic signal have emerged as the basic stimulus features represented at various stages of the auditory nervous system. Spectral and temporal characteristics of vowel sounds are also manifest in the firing rate and temporal response pattern of cat auditory nerve fibers (e.g., Delgutte and Kiang, 1984; Sachs and Young, 1979; Sinex and Geisler, 1983), cochlear nucleus neurons of cats (e.g., Blackburn and Sachs, 1990; Wang and Sachs, 1994) and in the inferior colliculus (Chen et al., 1996). In the primary auditory cortex (AI) in marmoset monkeys, spectro-temporal characteristics of species-specific vocalizations are integrated temporally and spectrally and represented by the synchronization of neural activity from spatially dispersed cortical cell assemblies (Wang et al., 1995; Nagarajan et al., 2002).

In the study of communication calls, the behavioral relevance of the vocalization has been proposed as a factor shaping neural representations of stimulus features in the cortex, owing to significant influences from auditory environment and experience (Merzenich et al., 1988, 1990; Wang, 2000; Wang and Kadia, 2001). A perceptual feature of speech syllables, voice-onset time (VOT), can be represented by time-locked activity of neurons in AI of monkeys (Steinschneider et al., 1982, 1990, 1992, 1994, 2000) and cats (Eggermont, 1995). In these experimental studies, however, animals were not trained on or exposed to the speech stimuli, so the sounds carried no behavioral salience. Thus, the findings from such studies reflect the ability of subsets of AI neurons to respond to changes in the acoustics of the speech stimuli without specific adaptation of the neural circuitry to the sounds. Adding a behavioral component to physiological experiments that examine the neural representations of speech stimuli lends the sounds behavioral salience and may reveal specific adaptations to trained stimuli.

Therefore, the first objective of this study was to determine the distributed representation of speech syllables across many cells in AI, and the second was to explore how behavioral training modifies the distributed and cumulative cortical representation of speech sounds.

Speech stimuli

Four consonant–vowel (CV) stimuli, /be/, /pe/, /ke/, and /ko/, were synthesized using a Klatt-model speech synthesizer (SenSyn). Each CV stimulus was 250 ms in duration. The fundamental frequency declined linearly from 120 to 100 Hz. The beginning and endpoints of the three formant frequencies for each CV stimulus were defined by:

  • /be/: F1(350–550 Hz), F2(1400–1700 Hz), F3(2100–2500 Hz)

  • /pe/: F1(NV–550 Hz), F2(1400–1700 Hz), F3(2100–2500 Hz)


where NV is the absence of voicing; there was no F1 at
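The stimulus parameters listed above can be illustrated with a minimal sketch. It takes the 250 ms duration, the linear 120 to 100 Hz fundamental-frequency decline, and the /be/ formant start and end points from the text. The helper names (`linear_track`, `be_parameters`) are hypothetical. Treating each formant as moving linearly across the whole syllable is an assumption made purely for illustration: the text gives only the beginning and endpoints, not the transition time course used in the synthesizer.

```python
# Illustrative sketch only; not the authors' synthesis procedure.
DURATION_MS = 250.0                      # stimulus duration stated in the text
F0_START_HZ, F0_END_HZ = 120.0, 100.0    # linear F0 decline stated in the text

# Start and end formant frequencies for /be/, as listed in the text.
BE_FORMANTS_HZ = {
    "F1": (350.0, 550.0),
    "F2": (1400.0, 1700.0),
    "F3": (2100.0, 2500.0),
}


def linear_track(start_hz, end_hz, t_ms, duration_ms=DURATION_MS):
    """Linearly interpolate a frequency between its start and end values."""
    if not 0.0 <= t_ms <= duration_ms:
        raise ValueError("t_ms must lie within the stimulus duration")
    frac = t_ms / duration_ms
    return start_hz + (end_hz - start_hz) * frac


def be_parameters(t_ms):
    """Return F0 and F1-F3 (Hz) for /be/ at time t_ms.

    ASSUMPTION: formants change linearly over the full duration; the
    source specifies only their beginning and endpoints.
    """
    params = {"F0": linear_track(F0_START_HZ, F0_END_HZ, t_ms)}
    for name, (start_hz, end_hz) in BE_FORMANTS_HZ.items():
        params[name] = linear_track(start_hz, end_hz, t_ms)
    return params


if __name__ == "__main__":
    # Print the parameter values at stimulus onset, midpoint, and offset.
    for t in (0.0, 125.0, 250.0):
        print(t, be_parameters(t))
```

Under these assumptions the fundamental frequency passes through 110 Hz at the 125 ms midpoint. The voiceless /pe/ stimulus would need separate handling of the interval in which F1 is absent, which this sketch does not attempt.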

Results

Electrophysiological data from seven adult cats were available. Data from three naive and four speech-trained cats were compared to examine whether the behavioral relevance of selected speech sounds affected relevant parts of their cortical representation. Recordings sampled the low-frequency region of AI evenly between 0.5 and 4 kHz. The number of recording locations per animal varied between 80 and 145. Here we report the level-dependent cortical responses to the CV sounds /be/ and /pe/,

Discussion

These experiments had two goals: to use the distributed nature of receptive field properties of AI neurons as a tool for studying speech syllable representation, and to assess the effect of behavioral training on such representation. The multi-unit mapping approach was not only necessary to acquire an adequate cell sample per animal, but it also allowed the comparison of response strength relative to cell position in cortical space for each stimulus presentation condition. Satisfying this

Acknowledgements

We thank Dr. Ben Bonham for assistance during some experiments and Dr. Jeffery Winer for many comments on the manuscript. Supported by grants NINDS 34835, NIDCD 02260, NSF REC 97203398, the Coleman Fund and Hearing Research Inc.

References (37)

  • M. Brosch et al. Sequence selectivity of neurons in cat primary auditory cortex. Cerebral Cortex (2000)
  • J.F. Brugge et al. Responses of neurons in auditory cortex of the macaque monkey to monaural and binaural stimulation. Journal of Neurophysiology (1973)
  • M.B. Calford et al. Monaural inhibition in cat auditory cortex. Journal of Neurophysiology (1995)
  • G.D. Chen et al. Responses of single neurons in the chinchilla inferior colliculus to consonant–vowel syllables differing in voice onset time. Auditory Neuroscience (1996)
  • B. Delgutte et al. Speech coding in the auditory nerve: I. Vowel-like sounds. Journal of the Acoustical Society of America (1984)
  • J.J. Eggermont. Representation of a voice onset time continuum in primary auditory cortex of the cat. Journal of the Acoustical Society of America (1995)
  • N. Kowalski et al. Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. Journal of Neurophysiology (1996)
  • P.K. Kuhl. Discrimination of speech by nonhuman animals: basic auditory sensitivities conducive to the perception of speech-sound categories. Journal of the Acoustical Society of America (1981)