Subtext Word Accuracy and Prosodic Features for Automatic Intelligibility Assessment

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11107)

Abstract

Speech intelligibility in voice rehabilitation can be evaluated successfully by automatic prosodic analysis. This paper examines how reading errors and the restriction to certain words (nouns only, nouns and verbs, the beginning of each sentence, the beginnings of sentences and subclauses) affect the computation of the word accuracy (WA) and of prosodic features. Seventy-three hoarse patients read the German version of the text “The North Wind and the Sun”. Their intelligibility was rated perceptually by 5 trained experts on a 5-point scale. Combining prosodic features and WA by Support Vector Regression yielded human-machine correlations of up to \(r=0.86\). The correlations drop for files with few reading errors, but this can largely be compensated for by adjusting the feature set. WA should be computed on the whole text, but for some prosodic features, a subset of words may be sufficient.
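
As a rough illustration of the quantity discussed above, the following sketch computes word accuracy in its common form, WA = 1 - (substitutions + deletions + insertions) / (number of reference words), using a word-level edit distance. The abstract does not say how the word subsets are applied, so `subset_word_accuracy` is a hypothetical helper that simply filters both word sequences to a chosen word set. The function names and the English example sentence are likewise assumptions made for illustration, not the authors' implementation.

```python
def word_edit_distance(ref, hyp):
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(ref, hyp):
    """WA = 1 - edit_distance / len(ref); can be negative if there are many insertions."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)


def subset_word_accuracy(ref, hyp, keep):
    """Hypothetical subset WA: restrict both sequences to the words in `keep`."""
    ref_sub = [w for w in ref if w in keep]
    hyp_sub = [w for w in hyp if w in keep]
    return word_accuracy(ref_sub, hyp_sub)


# Illustrative data only: a reference text and a recognizer output with two substitutions.
ref = "the north wind and the sun were disputing".split()
hyp = "the north wind and a sun was disputing".split()

wa_full = word_accuracy(ref, hyp)                          # 1 - 2/8 = 0.75
wa_nouns = subset_word_accuracy(ref, hyp, {"wind", "sun"}) # both nouns recognized: 1.0
```

In this toy example the nouns-only value is higher than the whole-text value because both errors fall on function words; it is only meant to show that the choice of word subset changes the measure.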

Acknowledgments

Dr. Döllinger’s contribution was supported by the German Research Foundation (DFG), grant no. DO1247/8-1 (no. 323308998).

Author information

Corresponding author

Correspondence to Tino Haderlein.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Haderlein, T., Schützenberger, A., Döllinger, M., Nöth, E. (2018). Subtext Word Accuracy and Prosodic Features for Automatic Intelligibility Assessment. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science (LNAI), vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_51

  • DOI: https://doi.org/10.1007/978-3-030-00794-2_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00793-5

  • Online ISBN: 978-3-030-00794-2

  • eBook Packages: Computer Science, Computer Science (R0)
