Abstract
This paper presents a method for selecting speech units in polyphone concatenative speech synthesis. Simplifying the procedures for searching paths in a graph speeds up unit selection with minimal effect on speech quality. The speech units selected are still optimal; only the costs of merging the units, on which the selection is based, are determined less accurately. Owing to its low processing-power and memory-footprint requirements, the method is suitable for embedded speech synthesizers.
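The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a textbook Viterbi search over a lattice of candidate units and a join (merging) cost that is looked up per unit class rather than computed per unit pair. All names (`select_units`, `make_coarse_join_cost`, the unit labels and cost values) are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_units(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    join_cost: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Return (total cost, path) of the cheapest unit sequence.

    Assumes every position has at least one candidate unit.
    """
    # best[u] = (cost of the cheapest path ending in unit u, that path)
    best = {u: (target_cost(0, u), [u]) for u in candidates[0]}
    for pos in range(1, len(candidates)):
        nxt = {}
        for u in candidates[pos]:
            # Cheapest predecessor once the join cost to u is included.
            prev_cost, prev_path = min(
                ((c + join_cost(p, u), path) for p, (c, path) in best.items()),
                key=lambda item: item[0],
            )
            nxt[u] = (prev_cost + target_cost(pos, u), prev_path + [u])
        best = nxt
    return min(best.values(), key=lambda item: item[0])


def make_coarse_join_cost(
    unit_class: Dict[str, str],
    class_cost: Dict[Tuple[str, str], float],
) -> Callable[[str, str], float]:
    """Approximate join cost: one table lookup per pair of unit classes."""
    def join_cost(left: str, right: str) -> float:
        return class_cost[(unit_class[left], unit_class[right])]
    return join_cost


# Hypothetical toy lattice: two positions, two candidate units each.
candidates = [["a1", "a2"], ["b1", "b2"]]
unit_class = {"a1": "A", "a2": "B", "b1": "A", "b2": "B"}
class_cost = {("A", "A"): 0.0, ("A", "B"): 2.0, ("B", "A"): 2.0, ("B", "B"): 0.0}
targets = {"a1": 0.5, "a2": 0.0, "b1": 0.0, "b2": 0.4}

total, path = select_units(
    candidates,
    lambda pos, unit: targets[unit],
    make_coarse_join_cost(unit_class, class_cost),
)
```

In this sketch the search remains exact with respect to the costs it is given; the approximation lies only in how coarsely the join costs are estimated, which mirrors the trade-off the abstract describes.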
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mihelič, A., Gros, J.Ž. (2008). Efficient Unit-Selection in Text-to-Speech Synthesis. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_53
DOI: https://doi.org/10.1007/978-3-540-87391-4_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science, Computer Science (R0)