Tuning Limited Domain Speech Synthesis Using General Text-to-Speech System

Jůzová, Markéta; Tihelka, Daniel

doi:10.1007/978-3-319-10816-2_49


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8655)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

This paper addresses the building of a limited domain speech synthesis system, in which longer units such as words and phrases can naturally be concatenated. Instead of building a single-purpose, domain-oriented engine that works with longer units, we show that a general-purpose TTS system can serve as a good emulation tool for verifying that a real domain-oriented engine will work correctly. Since the current general speech synthesis system, which uses the unit selection method, concatenates short speech units (diphones), the selection algorithm has been modified to emulate the concatenation of words or even whole phrases while still concatenating diphones internally. The behaviour of the system is tested on two limited domains, and its output is compared with the output of the general (unmodified) version of the same TTS system. The results clearly encourage building the “real” domain-oriented engine.
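
The idea of emulating word-level concatenation on top of diphone selection can be illustrated with a minimal sketch. The abstract does not give the authors' actual cost functions or data structures, so everything below is an assumption made for illustration: the `Candidate` record, the `join_cost` rule (free for diphones that are adjacent in the corpus, a fixed cost at word boundaries, forbidden inside a word), and the plain Viterbi search in `select`. It is not the ARTIC implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One diphone instance in the speech corpus (hypothetical record)."""
    diphone: str  # diphone label, e.g. "a-b"
    utt: int      # id of the source utterance
    pos: int      # index of the diphone within that utterance


def join_cost(prev, cur, at_word_boundary, boundary_cost=1.0):
    """Assumed join cost that makes diphone selection behave like word selection."""
    if prev.utt == cur.utt and cur.pos == prev.pos + 1:
        return 0.0            # adjacent in the corpus: no real concatenation
    if at_word_boundary:
        return boundary_cost  # a true join, allowed only between words
    return math.inf           # a join inside a word is forbidden


def select(targets, boundaries, inventory):
    """Viterbi search over diphone candidates.

    targets:    list of diphone labels to synthesise
    boundaries: set of indices i such that the join between targets[i-1]
                and targets[i] lies on a word boundary
    inventory:  dict mapping a diphone label to a non-empty list of Candidates
    Returns (best candidate sequence, total join cost).
    """
    # Each layer maps a candidate to (accumulated cost, best predecessor).
    layers = [{c: (0.0, None) for c in inventory[targets[0]]}]
    for i in range(1, len(targets)):
        layer = {}
        for cur in inventory[targets[i]]:
            cost, back = min(
                ((acc + join_cost(prev, cur, i in boundaries), prev)
                 for prev, (acc, _) in layers[-1].items()),
                key=lambda pair: pair[0],
            )
            layer[cur] = (cost, back)
        layers.append(layer)

    # Backtrack from the cheapest final candidate.
    last = min(layers[-1], key=lambda c: layers[-1][c][0])
    total = layers[-1][last][0]
    path = [last]
    for i in range(len(targets) - 1, 0, -1):
        path.append(layers[i][path[-1]][1])
    path.reverse()
    return path, total
```

Under this assumed cost, every selected word comes from a single contiguous stretch of one recorded utterance, so the output sounds as if whole words had been concatenated, although the engine still operates on diphones. A total cost of `math.inf` signals that some word cannot be covered by any contiguous stretch of the corpus.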

This work was supported by the European Regional Development Fund (ERDF), project “New Technologies for Information Society” (NTIS), European Centre of Excellence, CZ.1.05/1.1.00/02.0090, the Technology Agency of the Czech Republic, project No. TA01030476 and SGS-2013-032.

Author information

Authors and Affiliations

University of West Bohemia, Univerzitní 8, Plzeň, Czech Republic
Markéta Jůzová & Daniel Tihelka

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Botanická 6a, 60200, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

Cite this paper

Jůzová, M., Tihelka, D. (2014). Tuning Limited Domain Speech Synthesis Using General Text-to-Speech System. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_49

DOI: https://doi.org/10.1007/978-3-319-10816-2_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science, Computer Science (R0)
