Intelligibility Is More Than a Single Word: Quantification of Speech Intelligibility by ASR and Prosody

  • Conference paper
Text, Speech and Dialogue (TSD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4629)

Abstract

In this paper we examine how well the intelligibility scores assigned by human experts can be predicted automatically. Furthermore, we investigate the differences between subjective expert raters who evaluated the speech disorders of laryngectomees and of children with cleft lip and palate. We use the recognition rate of a word recognizer and prosodic features to predict the intelligibility score of each individual expert. For each expert and for the mean opinion of all experts, we present the features that best model their scoring behavior according to the mean rank obtained during a 10-fold cross-validation. In this manner, all individual speech experts were modeled with a correlation coefficient of r > .75. The mean opinion of all raters is predicted with a correlation of r = .90 for the laryngectomees and r = .86 for the children.
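
The evaluation scheme named in the abstract (predict a score under 10-fold cross-validation, then correlate the predictions with the expert ratings) can be illustrated with a minimal sketch. It is not the authors' implementation: the single feature, the ordinary least-squares line used as a stand-in regressor, the synthetic data, and all function names are assumptions made purely for illustration.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept (stand-in regressor)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def cross_validated_predictions(x, y, k=10):
    """Predict every sample from a model trained on the other k-1 folds."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        # Sample i belongs to fold i % k; all other samples form the training set.
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        for i in range(fold, n, k):
            preds[i] = slope * x[i] + intercept
    return preds


# Synthetic, made-up data: one hypothetical feature (word recognition rate in
# percent) and a noisy score that worsens as the recognition rate drops.
rng = random.Random(0)
word_accuracy = [rng.uniform(20, 90) for _ in range(40)]
expert_score = [5.0 - 0.04 * wa + rng.gauss(0, 0.3) for wa in word_accuracy]

predicted = cross_validated_predictions(word_accuracy, expert_score, k=10)
r = pearson_r(predicted, expert_score)
print(f"cross-validated r = {r:.2f}")
```

Because every prediction comes from a model that never saw the corresponding speaker, the resulting r measures how well the rating behavior generalizes rather than how well it can be fitted. The same loop could be run once per expert and once for the mean of all experts.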

This work was supported by the Johannes-und-Frieda-Marohn Stiftung and the Deutsche Forschungsgemeinschaft (German Research Foundation) under grant SCHU2320/1-1.

Editor information

Václav Matoušek, Pavel Mautner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maier, A., Haderlein, T., Schuster, M., Nkenke, E., Nöth, E. (2007). Intelligibility Is More Than a Single Word: Quantification of Speech Intelligibility by ASR and Prosody. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_37

  • DOI: https://doi.org/10.1007/978-3-540-74628-7_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74627-0

  • Online ISBN: 978-3-540-74628-7

  • eBook Packages: Computer Science, Computer Science (R0)
