
Difficulties with Wh-Questions in Czech TTS System

  • Conference paper
Text, Speech, and Dialogue (TSD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9924)

Abstract

Sentence intonation is essential for distinguishing sentence types (declarative sentences, questions, etc.), especially in languages without fixed word order, so text-to-speech systems need to handle it as well. This paper addresses wh-questions, whose intonation differs from that of the other basic question type, the yes/no question. We discuss the possibility of using wh-questions recorded during speech corpus preparation in speech synthesis. The inclusion and appropriate use of these recordings is tested in a real text-to-speech system and evaluated by listening tests. Furthermore, we examine how listeners perceive wh-questions, with the aim of revealing whether they really prefer the phonologically correct (falling) intonation in this type of question.

M. Jůzová: This research was supported by the Ministry of Education, Youth and Sports of the Czech Republic, project No. LO1506, and by the grant of the University of West Bohemia, project No. SGS-2016-039.


Notes

  1. Note that only neutral speech is taken into account in this paper, since our TTS system does not currently involve emotions.

  2. The important communication function in (complex/compound) declarative sentences is manifested with falling intonation only in the last phrase. Therefore, we sometimes use the term declarative phrase in place of the term declarative sentence.


Corresponding author

Correspondence to Markéta Jůzová.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Jůzová, M., Tihelka, D. (2016). Difficulties with Wh-Questions in Czech TTS System. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_41

  • DOI: https://doi.org/10.1007/978-3-319-45510-5_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45509-9

  • Online ISBN: 978-3-319-45510-5

  • eBook Packages: Computer Science, Computer Science (R0)
