Abstract
This paper describes an implemented computational model that generates intonation contours for dialogue systems. We concentrate on the relationship between pragmatics and two aspects of intonation: pitch range and pitch accent placement. Pitch range is computed based on the position of an utterance in the discourse structure: utterances that introduce a new topic have an expanded register compared to utterances that continue a topic. Pitch accent placement is based on two pragmatic factors: cognitive status (what the speaker assumes the hearer is attending to) and informativeness (what the speaker assumes to be the interesting or informative component of a phrase). This work suggests that even simple models of discourse topic structure, cognitive status, and informativeness will lead to improved register determination and pitch accent placement in practical conversational systems.
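The two decisions described above can be illustrated with a minimal sketch. It is not the implemented model: the class names, the boolean features, the register values, and the rule that combines cognitive status with informativeness are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List

# Illustrative scaling factors only; the abstract gives no numeric values.
EXPANDED_REGISTER = 1.3   # assumed value for topic-introducing utterances
DEFAULT_REGISTER = 1.0    # assumed value for topic-continuing utterances


@dataclass
class Word:
    text: str
    in_focus: bool     # assumption: speaker takes the hearer to be attending to this item
    informative: bool  # assumption: speaker treats this as the informative part of its phrase


@dataclass
class Utterance:
    words: List[Word]
    starts_new_topic: bool  # position in a (very simple) discourse topic structure


def pitch_range(utt: Utterance) -> float:
    """Topic-introducing utterances get an expanded register."""
    return EXPANDED_REGISTER if utt.starts_new_topic else DEFAULT_REGISTER


def accented_words(utt: Utterance) -> List[str]:
    """Hypothetical combination rule: accent items that are informative
    and not already assumed to be in the hearer's focus of attention."""
    return [w.text for w in utt.words if w.informative and not w.in_focus]


opening = Utterance(
    words=[Word("go", False, True), Word("to", False, False),
           Word("the", False, False), Word("mill", False, True)],
    starts_new_topic=True,
)
follow_up = Utterance(
    words=[Word("the", False, False), Word("mill", True, True),
           Word("is", False, False), Word("abandoned", False, True)],
    starts_new_topic=False,
)
```

In this sketch, `opening` receives the expanded register and accents on "go" and "mill", while `follow_up` keeps the default register and leaves "mill" unaccented because it is already assumed to be in focus.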
Cite this article
Delin, J., Zacharski, R. Pragmatic determinants of intonation contours for dialogue systems. Int J Speech Technol 1, 109–120 (1997). https://doi.org/10.1007/BF02277192