Hard vs. Fuzzy Clustering for Speech Utterance Categorization

  • Conference paper
Perception in Multimodal Dialogue Systems (PIT 2008)

Abstract

To detect and describe categories in a given set of utterances without supervision, one may apply clustering to a space in which the utterances are represented as vectors. This paper compares hard and fuzzy word clustering approaches applied to ‘almost’ unsupervised utterance categorization for a technical support dialog system. Here, ‘almost’ means that only one sample utterance is given per category, which allows the performance of the clustering techniques to be evaluated objectively. For this purpose, the categorization accuracy of the respective techniques is measured against a manually annotated test corpus of more than 3000 utterances.
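The distinction between hard and fuzzy assignment can be illustrated with a minimal sketch. It is not the method evaluated in the paper, whose details the abstract does not give. It assumes that distances from one item to each cluster centre are already available, and it borrows the standard fuzzy c-means membership formula as a stand-in for a fuzzy scheme. The function names `hard_assign` and `fuzzy_memberships` and the fuzzifier `m` are hypothetical choices made for illustration.

```python
def hard_assign(distances):
    """Hard clustering: the item belongs to exactly one cluster, the closest."""
    return min(range(len(distances)), key=lambda j: distances[j])


def fuzzy_memberships(distances, m=2.0):
    """Fuzzy clustering: degrees of membership in every cluster, summing to 1.

    Uses the fuzzy c-means membership formula (an assumption, not taken
    from the paper): u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)).
    """
    # An item lying exactly on one or more centres belongs only to those.
    zeros = [j for j, d in enumerate(distances) if d == 0.0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if j in zeros else 0.0 for j in range(len(distances))]

    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in distances)
        for d_j in distances
    ]


# Hypothetical distances from one item to three cluster centres.
distances = [1.0, 2.0, 4.0]
print(hard_assign(distances))        # index of the single closest cluster
print(fuzzy_memberships(distances))  # one membership degree per cluster
```

In the hard case the item is committed to a single cluster, whereas the fuzzy memberships keep the second and third clusters as weaker alternatives, which is the property a fuzzy approach can exploit for ambiguous items.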

Editor information

Elisabeth André, Laila Dybkjær, Wolfgang Minker, Heiko Neumann, Roberto Pieraccini, Michael Weber

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Albalate, A., Suendermann, D. (2008). Hard vs. Fuzzy Clustering for Speech Utterance Categorization. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_11

  • DOI: https://doi.org/10.1007/978-3-540-69369-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69368-0

  • Online ISBN: 978-3-540-69369-7

  • eBook Packages: Computer Science, Computer Science (R0)
