Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems

  • Conference paper
Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5042)

Abstract

Speech intelligibility is the most important parameter in the evaluation of speech quality. This contribution proposes a new objective method for assessing the intelligibility of general speech processing algorithms. It is based on automatic recognition methods developed for discrete and fluent speech processing. The idea is illustrated in two case studies: a) a comparison of listening evaluation of Czech rhyme tests with automatic discrete speech recognition, and b) automatic continuous speech recognition of general-topic Czech texts read by professional and nonprofessional speakers vs. the same texts generated by several Czech Text-to-Speech systems. The aim of the proposed approach is fast and objective intelligibility assessment of Czech Text-to-Speech systems, which include male and female voices and a voice conversion module.
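The abstract does not say how recognition output is turned into an intelligibility figure. One common choice, assumed here purely for illustration, is word accuracy: the share of reference words the recognizer gets right, computed from the word-level edit distance between the text that was read or synthesized and the recognizer's transcript. The sketch below shows that computation; the function names and the example strings are hypothetical and are not taken from the paper.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed intelligibility proxy: 1 - (word errors / number of reference words)."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference text must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_ref


if __name__ == "__main__":
    # Hypothetical transcripts; a real evaluation would use recognizer output
    # for each speaker or TTS system reading the same reference text.
    reference = "the weather will be sunny tomorrow"
    transcripts = {
        "speaker_a": "the weather will be sunny tomorrow",
        "tts_x": "the weather will be funny tomorrow",
    }
    for source, transcript in transcripts.items():
        print(source, round(word_accuracy(reference, transcript), 3))
```

Scoring every speaker and synthesizer against the same reference text keeps the comparison fair, since differences in the score then reflect the signal rather than the vocabulary. Note that word accuracy can fall below zero when the recognizer inserts many extra words.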

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vích, R., Nouza, J., Vondra, M. (2008). Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems. In: Esposito, A., Bourbakis, N.G., Avouris, N., Hatzilygeroudis, I. (eds) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction. Lecture Notes in Computer Science, vol 5042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70872-8_10

  • DOI: https://doi.org/10.1007/978-3-540-70872-8_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70871-1

  • Online ISBN: 978-3-540-70872-8

  • eBook Packages: Computer Science, Computer Science (R0)
