Correction of Formal Prosodic Structures in Czech Corpora Using Legendre Polynomials

  • Conference paper
Speech and Computer (SPECOM 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11096)

Abstract

Naturalness is an important aspect of speech synthesis: it is necessary for pleasant, undemanding listening to and understanding of synthesized speech. In unit selection synthesis, however, unexpected changes in \(F_0\) caused by unit transitions can lead to inconsistent prosody. This paper proposes a two-phase, classification-based method that improves overall prosody by correcting the formal prosodic description of speech corpora. The speech data are represented using Legendre polynomials.
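To illustrate what a Legendre-polynomial representation of speech data can look like, the following is a minimal sketch that projects a uniformly sampled \(F_0\) contour onto the first few Legendre polynomials. The abstract does not state the polynomial order, the normalization, or the fitting procedure, so the degree of 3, the trapezoidal integration, and the function names `legendre_basis`, `legendre_coefficients`, and `reconstruct` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def legendre_basis(x: float, degree: int) -> List[float]:
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def legendre_coefficients(f0: Sequence[float], degree: int = 3) -> List[float]:
    """Project a uniformly sampled contour onto P_0..P_degree.

    Time is rescaled to [-1, 1] and each coefficient is
    c_n = (2n + 1) / 2 * integral(f0 * P_n), with the integral
    approximated by the trapezoidal rule (an assumed choice).
    """
    m = len(f0)
    if m < 2:
        raise ValueError("need at least two F0 samples")
    step = 2.0 / (m - 1)
    coeffs = [0.0] * (degree + 1)
    for i, value in enumerate(f0):
        x = -1.0 + i * step
        # Trapezoidal rule: the two end points get half weight.
        weight = step / 2 if i in (0, m - 1) else step
        for n, p in enumerate(legendre_basis(x, degree)):
            coeffs[n] += (2 * n + 1) / 2 * weight * value * p
    return coeffs


def reconstruct(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted contour at normalized time x in [-1, 1]."""
    basis = legendre_basis(x, len(coeffs) - 1)
    return sum(c * p for c, p in zip(coeffs, basis))


# Synthetic contour in Hz (made-up values, not data from the paper).
times = [-1.0 + 2.0 * i / 200 for i in range(201)]
demo_f0 = [120.0 + 15.0 * t - 10.0 * (3 * t * t - 1) / 2 for t in times]
print([round(c, 2) for c in legendre_coefficients(demo_f0)])
```

In a representation of this kind, the low-order coefficients have a direct reading: the coefficient of \(P_0\) is the mean level of the contour, that of \(P_1\) its overall slope, and that of \(P_2\) its curvature. A fixed-length coefficient vector of this sort is one plausible input for a classifier, whatever the length of the original unit.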

Acknowledgements

This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic, project No. LO1506, and by a grant of the University of West Bohemia, project No. SGS-2016-039. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated.

Author information

Corresponding author

Correspondence to Martin Matura.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Matura, M., Jůzová, M. (2018). Correction of Formal Prosodic Structures in Czech Corpora Using Legendre Polynomials. In: Karpov, A., Jokisch, O., Potapova, R. (eds) Speech and Computer. SPECOM 2018. Lecture Notes in Computer Science, vol 11096. Springer, Cham. https://doi.org/10.1007/978-3-319-99579-3_41

  • DOI: https://doi.org/10.1007/978-3-319-99579-3_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99578-6

  • Online ISBN: 978-3-319-99579-3

  • eBook Packages: Computer Science, Computer Science (R0)
