Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems

Vích, Robert; Nouza, Jan; Vondra, Martin

doi:10.1007/978-3-540-70872-8_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5042)

Abstract

Speech intelligibility is the most important parameter in the evaluation of speech quality. This contribution proposes a new objective method for assessing the intelligibility of general speech processing algorithms. It is based on automatic recognition methods developed for discrete and fluent speech processing. The idea is illustrated by two case studies: (a) a comparison of listening evaluations of Czech rhyme tests with automatic discrete speech recognition, and (b) automatic continuous speech recognition of general-topic Czech texts read by professional and nonprofessional speakers versus the same texts generated by several Czech Text-to-Speech systems. The aim of the proposed approach is fast and objective intelligibility assessment of Czech Text-to-Speech systems, which include male and female voices and a voice conversion module.
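
As an illustration of how recognizer output might be turned into an intelligibility score, the following minimal sketch computes word accuracy between a reference text and a recognized transcript using Levenshtein alignment. The abstract does not state which scoring metric the authors use, so the metric, the function names `word_errors` and `word_accuracy`, and the whitespace tokenization are all assumptions made for this example.

```python
def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance between two word sequences (assumed metric)."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word errors / reference length); can be negative."""
    # Hypothetical normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(ref, hyp) / len(ref)
```

Under these assumptions, the same reference text would be scored once against the transcript of a human reading and once against the transcript of each synthesized version, and the resulting accuracies compared.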

Author information

Authors and Affiliations

Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Chaberská 57, CZ 18251, Prague 8, Czech Republic
Robert Vích & Martin Vondra
Institute of Information Technology and Electronics, Technical University of Liberec, Hálkova 6, CZ 46117, Liberec, Czech Republic
Jan Nouza

Editor information

Editors and Affiliations

Department of Psychology, Second University of Naples, and IIASS, Via Pellegrino 19, 84019, Vietri sul Mare (SA), Italy
Anna Esposito
ATRC Center, Wright State University, Dayton, OH, USA
Nikolaos G. Bourbakis
Human Computer Interaction Group, University of Patras, Rio Patras, Greece
Nikolaos Avouris
Department of Computer Engineering, University of Patras, Patras, Greece
Ioannis Hatzilygeroudis

Cite this paper

Vích, R., Nouza, J., Vondra, M. (2008). Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems. In: Esposito, A., Bourbakis, N.G., Avouris, N., Hatzilygeroudis, I. (eds) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction. Lecture Notes in Computer Science, vol 5042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70872-8_10

DOI: https://doi.org/10.1007/978-3-540-70872-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70871-1
Online ISBN: 978-3-540-70872-8
eBook Packages: Computer Science, Computer Science (R0)
