Elsevier

NeuroImage

Volume 70, 15 April 2013, Pages 89-100

Predictions in speech comprehension: fMRI evidence on the meter–semantic interface

https://doi.org/10.1016/j.neuroimage.2012.12.013

Abstract

When listening to speech we not only form predictions about what is coming next, but also when something is coming. For example, metric stress may be utilized to predict the next salient speech event (i.e. the next stressed syllable) and in turn facilitate speech comprehension. However, speech comprehension can also be facilitated by semantic context, that is, which content word is likely to appear next. In the current fMRI experiment we investigated (1) the brain networks that underlie metric and semantic predictions by means of prediction errors, (2) how semantic processing is influenced by a metrically regular or irregular sentence context, and (3) whether task demands influence both processes. The results are threefold: First, while metrically incongruent sentences activated a bilateral fronto-striatal network, semantically incongruent trials led to activation of fronto-temporal areas. Second, metrically regular context facilitated speech comprehension in the left fronto-temporal language network. Third, attention directed to metric or semantic aspects in speech engaged different subcomponents of the left inferior frontal gyrus (IFG). The current results suggest that speech comprehension relies on different forms of prediction, and extend known speech comprehension networks to subcortical sensorimotor areas.

Highlights

► Metric and semantic predictions in speech activate different brain networks.
► A regular metric structure reduces processing costs in critical brain regions.
► Metrically unpredicted words activate sensori-motor areas.
► A neural distinction of metric and semantic predictions can be found in the left IFG.

Introduction

One of the most fascinating features of the human brain is its ability to recognize regular patterns in a dynamically changing environment. For example, speech and music rhythms form recurring temporal patterns of energy fluctuations that optimize their perception. Likewise, perceptual regularity established by a regular rhythm leads to the prediction of upcoming salient auditory events (i.e. sounds, tones, speech units) and to the direction of attention to predicted points in time (Large and Jones, 1999). Ultimately, it has been proposed that prediction, realized in speech by repetitive prosodic information such as a regular rhythm, allows for an economy of resources, leads to faster recognition and facilitates the integration of information (Bubic et al., 2010, Cason and Schön, 2012, Martin, 1972).

In speech comprehension research, the role of rhythm as a source of predictions has only recently benefitted from renewed scientific interest. Traditionally, rhythmic and metric patterns are considered to be phonological properties of speech that shape speech segmentation (Endress and Hauser, 2010, Lee and Todd, 2004), word recognition (e.g. Cutler and Norris, 1988), and language acquisition (e.g. Jusczyk et al., 1999). In the present fMRI study, we aimed to determine whether speech rhythm can also (i) influence the integration of words into a sentence context, and (ii) facilitate integration by utilizing metric and semantic predictions.

Previous research suggests that speech contains regularities encoded in several acoustic features such as pitch, duration, and amplitude that allow speech to be grouped into smaller segments (Dilley and Pitt, 2010, Yoshida et al., 2010). Languages differ in how acoustic regularities are composed, but a ‘phonological grammar’ may be built upon a dynamic temporal system that adapts to acoustic regularities by means of periodic oscillations in speech (Port, 2003) or in communicative interactions (Hasson et al., 2012). Importantly, in stress-timed languages such as German, regularities comprise the alternation of strong and weak syllables (i.e. the metric foot) that form the metric structure in speech. Syllable alternations are very prominent and enhance the perception of regularity in speech, thereby forming temporal predictions about when the next prominent syllable will occur (Domahs et al., 2008, Magne et al., 2010, Rothermich et al., 2010). These temporal predictions may arise even if the distribution of stressed syllables does not include precise temporal information (Lidji et al., 2011, Rothermich et al., 2012, Schmidt-Kassow and Kotz, 2009).

In addition to temporal predictions, semantic predictions, derived from knowledge and context, also impact word integration. Some psycholinguistic research suggests that all available linguistic information (e.g. semantic, syntactic, and prosodic) is used to anticipate upcoming events during speech comprehension (e.g., McClelland and Elman, 1986). Consequently, processing ease, even prior to lexical access or selection, will be influenced by information that has been pre-activated in a given context (Federmeier, 2007, Kutas and Federmeier, 2011). Similarly, DeLong et al. (2005) proposed that words and phrases differentially trigger future expectancies via rich semantic representations. Taken together, both metric (“when”) and semantic (“what”) predictions influence and facilitate dynamic speech comprehension.

The goal of the current experiment was to investigate how manipulations of metric and semantic predictions impact speech comprehension at the neuronal level. We therefore manipulated the level of metric and semantic expectancy in an auditory sentence context by presenting words with an unexpected stress pattern (metrically incongruent condition) and semantically unexpected words that do not fit in the semantic context (semantically incongruent condition). Additionally, to test the effect of regular speech rhythm on semantic and metric integration, we created two different kinds of sentence contexts (metrically regular and metrically irregular) by varying the alternation of stressed and unstressed syllables in words preceding the critical items. If a regular metric structure allows future items to be temporally predicted, this should lead to reduced activation to semantically or metrically unexpected items, particularly in frontal and temporal brain regions (commonly interpreted as facilitation; Obleser and Kotz, 2010, Rossell et al., 2003, Wheatley et al., 2005).

First, for metrically incongruent compared to congruent trials we expected to see increased activation in an extended fronto-striatal and an extended cortico-cerebello-cortical network (e.g. for speech: Aleman et al., 2005, Geiser et al., 2008, Geiser et al., 2012, Klein et al., 2011, Kotz and Schwartze, 2010; for music: Chen et al., 2008, Grahn and Brett, 2007). This network consists of classical language areas such as the bilateral posterior inferior frontal gyrus (IFG), as well as the bilateral superior temporal gyrus. In addition to fronto-temporal regions, regions associated with motor behavior and timing such as the supplementary motor area (SMA), the basal ganglia, the cerebellum, the anterior insula, and the thalamus are implicated in the perception and evaluation of rhythm. Second, semantically incongruent trials should increase activation in a left fronto-temporal network (e.g. Bookheimer, 2002, Friederici et al., 2003, Kuperberg et al., 2000, Lau et al., 2008, Newman et al., 2001, Ni et al., 2000, Rissman et al., 2003), including the left IFG (notably in BA 47/45) as well as the middle temporal gyrus (MTG). More specifically, we hypothesized that the IFG together with the middle temporal cortex subserves top-down processes drawing predictions about incoming information, thereby easing its integration (Friederici, 2011). As already mentioned, by varying the rhythmic properties of sentence context (metrically regular vs. irregular context) we further aimed to test whether regular speech rhythm leads to facilitation effects in classical speech comprehension areas and beyond (see Pickering and Garrod, 2007 for a similar argument on context manipulation) and whether such facilitation varies as a function of attentional task demands.

In summary, we aimed to elucidate how metric and semantic predictions modulate neuronal activation within the speech comprehension network, and how this network may extend to sensorimotor circuitries when focusing on the analysis of the metric/temporal structure of speech. Thus, we hypothesized that a cortico-subcortico-cortical network is involved in the processing of metric and rhythmic characteristics of the auditory speech signal. Manipulation of semantic and metric predictions could also induce common changes in left frontal and temporal cortices, revealing areas of integration for both information types (e.g. Geiser et al., 2008, Lau et al., 2008).

Section snippets

Participants

Sixteen right-handed participants (native speakers of German, 8 female, mean age 26 years, SD 3.8) with no neurological history took part in the study. They received eight Euros per hour as compensation. After being informed about potential risks and screened by a physician of the Max Planck Institute for Cognitive and Brain Sciences (Leipzig, Germany), participants gave written informed consent. The experimental procedures were approved by the local ethics committee of the University

Behavioral results

Initial inspection of the behavioral data revealed that one participant showed atypical performance (overall below 40% correct; negative d-prime score), thus not meeting our behavioral threshold (at least 40% correct on average in every condition). This participant was excluded from further analysis.
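The exclusion rule described above can be illustrated with a minimal sketch. It assumes the standard signal-detection definition d′ = z(hit rate) − z(false-alarm rate) and reads the threshold as "at least 40% correct in each condition"; the log-linear correction, the function names, and the condition labels are illustrative assumptions and are not taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Log-linear correction (an assumption of this sketch) keeps both
    # rates strictly between 0 and 1 so the z-transform stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def meets_threshold(accuracy_by_condition, minimum=0.40):
    """True if the proportion correct reaches `minimum` in every condition."""
    return all(acc >= minimum for acc in accuracy_by_condition.values())


# Hypothetical participant: more false alarms than hits gives a negative d'.
print(d_prime(hits=5, misses=15, false_alarms=15, correct_rejections=5) < 0)
print(meets_threshold({"metric_regular": 0.35, "semantic_regular": 0.92}))
```

A negative d′ means that "incongruent" responses were given more often to congruent than to incongruent items, which is why it signals atypical performance rather than merely low sensitivity.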

Overall performance in the semantic task was above 90% correct; in the metric task, however, performance was rather poor (metrically incongruent, regular context: 57%; metrically incongruent,

Discussion

The current fMRI study aimed to identify brain areas that respond to semantic and metric prediction errors (i) in general, (ii) in both rhythmically regular and irregular sentence contexts, and (iii) under different attentional (implicit vs. explicit) task demands.

Conclusions

Utilizing metric and semantic prediction errors, the present study investigated the neural correlates of specific predictions in speech comprehension. We report three key findings. First, metric and semantic prediction errors lead to distinct brain activation patterns. While semantic prediction errors activated the left IFG, the left STG as well as the left anterior cingulate, metric prediction errors led to increased activation in the bilateral IFG, the bilateral anterior insula, the bilateral

Acknowledgments

We would like to thank Kerstin Flake for graphics support, as well as Annett Wiedemann, Anke Kummer, and Simone Wipper for their help in data acquisition. We also thank Patrice Voss as well as three anonymous reviewers for their helpful comments on a previous version of the manuscript.

References (118)

  • P. Gagnepain et al.

    Temporal predictive codes for spoken words in auditory cortex

    Curr. Biol.

    (2012)
  • P. Hagoort

    On Broca, brain, and binding: a new framework

    Trends Cogn. Sci.

    (2005)
  • S. Heim et al.

    Phonological processing during language production: fMRI evidence for a shared production–comprehension network

    Brain Res.

    (2003)
  • G. Hickok et al.

    Towards a functional neuroanatomy of speech perception

    Trends Cogn. Sci.

    (2000)
  • M. Inase et al.

    Projections from the globus pallidus to the thalamic areas projecting to the dorsal area 6 of the macaque monkey: a multiple tracing study

    Neurosci. Lett.

    (1994)
  • P.W. Jusczyk et al.

    The beginnings of word segmentation in English-learning infants

    Cogn. Psychol.

    (1999)
  • D. Ketteler et al.

    The subcortical role of language processing. High level linguistic features such as ambiguity-resolution and the human brain; an fMRI study

    Neuroimage

    (2008)
  • S.A. Kotz et al.

    Cortical speech processing unplugged: a timely subcortico-cortical framework

    Trends Cogn. Sci.

    (2010)
  • S.A. Kotz et al.

    The basal ganglia are receptive to rhythmic compensation during auditory syntactic processing: ERP patient data

    Brain Lang.

    (2005)
  • S.A. Kotz et al.

    Non-motor basal ganglia functions: a review and proposal for a model of sensory predictability in auditory language perception

    Cortex

    (2009)
  • C. Lee et al.

    Towards an auditory account of speech rhythm: application of a model of the auditory ‘primal sketch’ to two multi-language corpora

    Cognition

    (2004)
  • B.N. Lundstrom et al.

    The role of precuneus and left inferior frontal cortex during source memory episodic retrieval

    Neuroimage

    (2005)
  • F.M. Miezin et al.

    Characterizing the hemodynamic response: effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing

    Neuroimage

    (2000)
  • I. Mutschler et al.

    Functional organization of the human anterior insular cortex

    Neurosci. Lett.

    (2009)
  • S.E. Nadeau et al.

    Subcortical aphasia

    Brain Lang.

    (1997)
  • M.J. Pickering et al.

    Do people use language production to make predictions during comprehension?

    Trends Cogn. Sci.

    (2007)
  • R.F. Port

    Meter and speech

    J. Phon.

    (2003)
  • S.L. Rossell et al.

    The anatomy and time course of semantic priming investigated by fMRI and ERPs

    Neuropsychologia

    (2003)
  • K. Rothermich et al.

    Rhythm's gonna get you: regular meter facilitates semantic sentence processing

    Neuropsychologia

    (2012)
  • N. Cason et al.

    Rhythmic priming enhances the phonological processing of speech

    Neuropsychologia

    (2012)
  • M. Schmidt-Kassow et al.

    Entrainment of syntactic processing? ERP-responses to predictable time intervals during syntactic reanalysis

    Brain Res.

    (2008)
  • H. Ackermann et al.

    Rate-dependent activation of a prefrontal–insular–cerebellar network during passive listening to trains of click stimuli: an fMRI study

    Neuroreport

    (2001)
  • H. Ackermann et al.

    The contribution of the cerebellum to speech production and speech perception: clinical and functional imaging data

    Cerebellum

    (2007)
  • D. Akkal et al.

    Supplementary motor area and presupplementary motor area: targets of basal ganglia and cerebellar output

    J. Neurosci.

    (2007)
  • A. Aleman et al.

    The functional neuroanatomy of metrical stress evaluation of perceived and imagined words

    Cereb. Cortex

    (2005)
  • G. Alexander et al.

    Parallel organization of functionally segregated circuits linking basal ganglia and cortex

    Annu. Rev. Neurosci.

    (1986)
  • D. Badre et al.

    Is the rostro-caudal axis of the frontal lobe hierarchical?

    Nat. Rev. Neurosci.

    (2009)
  • D.E. Bamiou et al.

    Auditory temporal processing deficits in patients with insular stroke

    Neurology

    (2006)
  • S. Bookheimer

    Functional MRI of language: new approaches to understanding the cortical organization of semantic processing

    Annu. Rev. Neurosci.

    (2002)
  • A. Bubic et al.

    Prediction, cognition and the brain

    Front. Hum. Neurosci.

    (2010)
  • C.V. Buhusi et al.

    What makes us tick? Functional and neural mechanisms of interval timing

    Nat. Rev. Neurosci.

    (2005)
  • H. Burton et al.

    Dissociating cortical regions activated by semantic and phonological tasks: an fMRI study in blind and sighted people

    J. Neurophysiol.

    (2003)
  • H.L. Chapin et al.

    Neural responses to complex auditory rhythms: the role of attending

    Front. Psychol.

    (2010)
  • J.L. Chen et al.

    Listening to musical rhythms recruits motor regions of the brain

    Cereb. Cortex

    (2008)
  • T.A. Christensen et al.

    Cortical and subcortical contributions to the attentive processing of speech

    Neuroreport

    (2008)
  • A.D.B. Craig

    Emotional moments across time: a possible neural basis for time perception in the anterior insula

    Philos. Trans. R. Soc. Lond. B Biol. Sci.

    (2009)
  • A. Cutler et al.

    The role of strong syllables in segmentation for lexical access

    J. Exp. Psychol. Hum. Percept. Perform.

    (1988)
  • K.A. DeLong et al.

    Probabilistic word pre-activation during language comprehension inferred from electrical brain activity

    Nat. Neurosci.

    (2005)
  • L.C. Dilley et al.

    Altering context speech rate can cause words to appear or disappear

    Psychol. Sci.

    (2010)
  • U. Domahs et al.

    The processing of German word stress: evidence for the prosodic hierarchy

    Phonology

    (2008)