Abstract
We consider the use of natural pauses to aid analysis of spontaneous speech, studying four Japanese dialogues concerning a simulated direction-finding task. Using new techniques, we added to existing transcripts information on the placement and length of significant pauses within turns (breathing intervals of any length or silences longer than approximately 400 milliseconds). We then addressed four questions: (1) Are “pause units” (segments bounded by natural pauses) reliably shorter than utterances? The answer was Yes: pause units in our corpus averaged 5.89 Japanese morphemes in length, 60% of the length of whole utterances, with much less variation. (2) Would hesitation expressions yield shorter units if used as alternate or additional boundaries? The answer was Not much, apparently because pauses and hesitation expressions often coincide. We found no combination of expressions that gave segments as much as one morpheme shorter than pause units on average. (3) How well-formed are pause units from a syntactic viewpoint? We manually judged that 90% of the pause units in our corpus could be parsed with standard Japanese grammars once hesitation expressions had been filtered from them. (4) Does translation by pause unit deserve further study? The answer was Yes, in that a majority of the pause units in the four dialogues gave understandable translations into English when translated by hand. We are thus encouraged to study further a “divide and conquer” analysis strategy, in which parsing and perhaps translation of pause units is carried out before, or even without, attempts to create coherent analyses of entire utterances.
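The pause-unit definition above can be illustrated with a minimal Python sketch. It is not the authors' tooling: the `Token` type, its field names, and the helper functions are hypothetical, and the only value taken from the abstract is the approximate 400-millisecond silence threshold. The sketch closes a unit after any morpheme followed by a breathing interval or by a silence longer than the threshold, then reports the mean unit length in morphemes.

```python
from dataclasses import dataclass
from statistics import mean

# Approximate silence threshold stated in the abstract.
PAUSE_THRESHOLD_MS = 400


@dataclass
class Token:
    """One morpheme of a turn plus what follows it (hypothetical layout)."""
    morpheme: str
    pause_after_ms: int = 0      # length of the silence after this morpheme
    breath_after: bool = False   # True if a breathing interval follows


def split_into_pause_units(turn):
    """Split a turn (list of Token) into lists of morphemes bounded by pauses."""
    units, current = [], []
    for tok in turn:
        current.append(tok.morpheme)
        # A breathing interval of any length, or a long enough silence,
        # counts as a significant pause and closes the current unit.
        if tok.breath_after or tok.pause_after_ms > PAUSE_THRESHOLD_MS:
            units.append(current)
            current = []
    if current:  # the end of the turn closes the last unit
        units.append(current)
    return units


def mean_unit_length(units):
    """Mean length of the given units, in morphemes."""
    return mean(len(unit) for unit in units)
```

Applied to a whole corpus, the same two functions would give the kind of average reported in the abstract, provided the transcripts carry pause and breath annotations in some comparable form, which is an assumption here.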
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Seligman, M., Hosaka, J., Singer, H. (1997). “Pause units” and analysis of spontaneous Japanese dialogues: Preliminary studies. In: Maier, E., Mast, M., LuperFoy, S. (eds) Dialogue Processing in Spoken Language Systems. DPSLS 1996. Lecture Notes in Computer Science, vol 1236. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63175-5_40
DOI: https://doi.org/10.1007/3-540-63175-5_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63175-0
Online ISBN: 978-3-540-69206-5
eBook Packages: Springer Book Archive