F0 Prediction Model of Speech Synthesis Based on Template and Statistical Method

Tao, Jianhua

doi:10.1007/978-3-540-30120-2_63

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper describes an F0 model for speech synthesis that combines templates with a statistical method. Focusing on the notion of templates, we confirm that F0 patterns for a speech unit can be extracted from the various deformations of F0 contours found in spontaneous speech. A prosody cost function and a statistical training method are then used to assign and adapt the weights for template selection in real applications. Unlike other methods, this approach can give feedback on exactly which parameters are crucial to the successful choice of patterns. A final test shows that the method generates synthesized speech with high naturalness and is also well suited to multilingual prosody processing.
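The abstract gives no formulas, so the following is only a minimal sketch of what weighted template selection could look like, not the paper's actual model. The context features (`tone`, `prev_tone`, `phrase_position`), the weight values, the mismatch-based cost and all function names are assumptions made for illustration. The statistical training that would adapt the weights is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Template:
    """An F0 pattern for one speech unit plus the context it came from."""
    context: Dict[str, str]      # hypothetical features, not taken from the paper
    f0_contour: Sequence[float]  # F0 values in Hz sampled across the unit


def prosody_cost(target: Dict[str, str], template: Template,
                 weights: Dict[str, float]) -> float:
    """Sum the weights of every context feature on which the two disagree."""
    return sum(w for name, w in weights.items()
               if target.get(name) != template.context.get(name))


def select_template(target: Dict[str, str], templates: List[Template],
                    weights: Dict[str, float]) -> Template:
    """Return the template with the lowest prosody cost for the target."""
    return min(templates, key=lambda t: prosody_cost(target, t, weights))


# Illustrative weights only; in the setting the abstract describes they
# would be assigned and adapted by statistical training.
WEIGHTS = {"tone": 3.0, "prev_tone": 1.0, "phrase_position": 2.0}

TEMPLATES = [
    Template({"tone": "T1", "prev_tone": "T4", "phrase_position": "initial"},
             [220.0, 222.0, 221.0]),
    Template({"tone": "T1", "prev_tone": "T2", "phrase_position": "final"},
             [190.0, 188.0, 185.0]),
    Template({"tone": "T4", "prev_tone": "T2", "phrase_position": "final"},
             [230.0, 190.0, 150.0]),
]

target = {"tone": "T1", "prev_tone": "T2", "phrase_position": "final"}
best = select_template(target, TEMPLATES, WEIGHTS)
```

In a sketch like this, the trained weights are what would provide the feedback the abstract mentions: a feature with a large weight has a strong influence on which pattern is chosen.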

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Beijing, China, 100080
Jianhua Tao

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Tao, J. (2004). F0 Prediction Model of Speech Synthesis Based on Template and Statistical Method. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_63

DOI: https://doi.org/10.1007/978-3-540-30120-2_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive
