A Typology of Spontaneous Speech

Beckman, Mary E.

doi:10.1007/978-1-4612-2258-3_2

Abstract

Building accurate computational models of the prosody of spontaneous speech is a daunting enterprise because speech produced without a carefully devised written script does not readily allow the explicit control and repeated observation that read “lab speech” corpora are designed to provide. The prosody of spontaneous speech is affected profoundly by the social and rhetorical context of the recording, and these contextual factors can themselves vary widely in ways beyond our current understanding and control, so that there are many types of spontaneous speech which differ substantially not just from lab speech but also from each other. This paper motivates the study of spontaneous speech by describing several important aspects of prosody and its function that cannot be studied fully in lab speech, either because the relevant phenomena do not occur at all in lab speech or occur in a limited range of types. It then lists and characterizes some kinds of spontaneous speech that have been successfully recorded and analysed by scientists working on some of these aspects of prosody or on related discourse phenomena.

Editor information

Editors and Affiliations

ATR Interpreting Telecommunications Research Labs, 2-2, Hikaridai, Seika-cho, Soraku-gun, 619-02, Kyoto, Japan
Yoshinori Sagisaka, Nick Campbell & Norio Higuchi

About this chapter

Cite this chapter

Beckman, M.E. (1997). A Typology of Spontaneous Speech. In: Sagisaka, Y., Campbell, N., Higuchi, N. (eds) Computing Prosody. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2258-3_2

DOI: https://doi.org/10.1007/978-1-4612-2258-3_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-7476-6
Online ISBN: 978-1-4612-2258-3
eBook Packages: Springer Book Archive
