Overlap in Meetings: ASR Effects and Analysis by Dialog Factors, Speakers, and Collection Site

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4299)

Abstract

We analyze speaker overlap in multiparty meetings both in terms of automatic speech recognition (ASR) performance and in terms of the distribution of overlap with respect to various factors (collection site, speakers, dialog acts, and hot spots). Unlike most previous work on overlap or crosstalk, our ASR error analysis uses an approach that allows comparison of the same foreground speech with and without naturally occurring overlap, using a state-of-the-art meeting recognition system. We examine a total of 101 meetings. For analysis of ASR, we use 26 meetings from the NIST meeting transcription evaluations, and discover a number of interesting phenomena. First, overlaps tend to occur at high-perplexity regions in the foreground talker’s speech. Second, overlap regions tend to have higher perplexity than nonoverlap regions if trigrams or 4-grams are used, but unigram perplexity within overlaps is considerably lower than that of nonoverlaps. Third, word error rate (WER) after an overlap is consistently lower than that before the overlap, apparently because the foreground speaker reduces perplexity shortly after being overlapped. These appear to be robust findings, because they hold in general across meetings from different collection sites, even though meeting style and absolute rates of overlap vary by site. Further analyses of overlap with respect to speakers and meeting content were conducted on a set of 75 additional meetings collected and annotated at ICSI. These analyses reveal interesting relationships between overlap and dialog acts, as well as between overlap and “hot spots” (points of increased participant involvement). Finally, results from this larger data set show that individual speakers have widely varying rates of being overlapped.
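
The two quantities the abstract compares, word error rate and perplexity, have standard textbook definitions, and a minimal sketch of them may help readers unfamiliar with the metrics. The function names `word_error_rate` and `perplexity` are illustrative assumptions; this is not the scoring pipeline used in the paper, which relied on NIST evaluation tooling and n-gram language models not reproduced here.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Computed as the word-level Levenshtein distance divided by reference length.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def perplexity(word_probs: list[float]) -> float:
    """Perplexity = exp of the negative mean log-probability per word.

    `word_probs` holds the probability a language model assigned to each word
    (hypothetical input; any n-gram order could have produced it).
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))
```

With these definitions, the abstract's third finding would correspond to `word_error_rate` being lower on the words following an overlap than on the words preceding it, and the second finding to `perplexity` differing between overlap and nonoverlap regions depending on the n-gram order of the model supplying the probabilities.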

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Çetin, Ö., Shriberg, E. (2006). Overlap in Meetings: ASR Effects and Analysis by Dialog Factors, Speakers, and Collection Site. In: Renals, S., Bengio, S., Fiscus, J.G. (eds) Machine Learning for Multimodal Interaction. MLMI 2006. Lecture Notes in Computer Science, vol 4299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11965152_19

  • DOI: https://doi.org/10.1007/11965152_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69267-6

  • Online ISBN: 978-3-540-69268-3

  • eBook Packages: Computer Science (R0)
