A Tree-Based Model of Prosodic Phrasing for Chinese Text-to-Speech Systems

  • Conference paper
Advances in Multimedia Information Processing — PCM 2001 (PCM 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2195)

Abstract

This paper describes a tree-based model of prosodic phrasing for Chinese text-to-speech (TTS) systems. The model uses classification and regression tree (CART) techniques to generate the decision tree automatically. We collected 559 sentences from a CCTV news program and built a corresponding speech corpus read by a professional male announcer. Prosodic boundaries were manually marked on the recorded speech, and the text was processed with word identification, part-of-speech tagging, and syntactic analysis. A decision tree was then trained on 371 sentences (approximately 50 min of speech) and tested on 188 sentences (approximately 28 min of speech). Features for modeling prosody are proposed, and their effectiveness is measured by interpreting the resulting tree. We achieved a success rate of about 93%.
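
As a rough illustration of the CART idea described in the abstract, the following minimal sketch grows a small classification tree by Gini impurity and uses it to label word junctures as boundary or non-boundary. Everything specific in it is an assumption made for illustration: the three features (left part-of-speech tag, right part-of-speech tag, number of words since the last boundary), the tag names, the labels `B`/`N`, and the toy data are hypothetical and are not taken from the paper, whose actual feature set, corpus, and tree are not reproduced here.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def matches(row, feature, value):
    """Binary question at a node: numeric features use <=, categorical use ==."""
    if isinstance(value, (int, float)):
        return row[feature] <= value
    return row[feature] == value

def best_split(rows, labels):
    """Return (gain, feature, value) for the question that most reduces impurity."""
    base, n = gini(labels), len(rows)
    best = (1e-12, None, None)  # require a strictly positive gain
    for feature in range(len(rows[0])):
        for value in sorted({row[feature] for row in rows}):
            side = [matches(row, feature, value) for row in rows]
            left = [y for y, s in zip(labels, side) if s]
            right = [y for y, s in zip(labels, side) if not s]
            if not left or not right:
                continue
            gain = base - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best[0]:
                best = (gain, feature, value)
    return best

def build_tree(rows, labels, max_depth=3):
    """Grow a tree; a leaf is a label, an inner node is (feature, value, yes, no)."""
    _, feature, value = best_split(rows, labels)
    if max_depth == 0 or feature is None:
        return Counter(labels).most_common(1)[0][0]
    yes = [(r, y) for r, y in zip(rows, labels) if matches(r, feature, value)]
    no = [(r, y) for r, y in zip(rows, labels) if not matches(r, feature, value)]
    return (feature, value,
            build_tree([r for r, _ in yes], [y for _, y in yes], max_depth - 1),
            build_tree([r for r, _ in no], [y for _, y in no], max_depth - 1))

def predict(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        feature, value, yes, no = node
        node = yes if matches(row, feature, value) else no
    return node

def accuracy(tree, rows, labels):
    """Fraction of junctures whose predicted label equals the reference label."""
    hits = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

# Hypothetical toy junctures: (left POS, right POS, words since last boundary).
# "B" marks a prosodic boundary, "N" marks no boundary. Not data from the paper.
train_rows = [("n", "v", 2), ("v", "n", 1), ("n", "p", 4), ("n", "c", 5), ("a", "n", 1),
              ("v", "p", 3), ("n", "v", 5), ("d", "v", 1), ("u", "n", 2), ("n", "c", 3)]
train_labels = ["N", "N", "B", "B", "N", "B", "B", "N", "N", "B"]
test_rows = [("n", "v", 1), ("v", "c", 6), ("a", "n", 2), ("n", "p", 3), ("n", "u", 4)]
test_labels = ["N", "B", "N", "B", "N"]

tree = build_tree(train_rows, train_labels)
print("tree:", tree)
print("held-out accuracy:", accuracy(tree, test_rows, test_labels))
```

Reading the printed tree from the root down shows which questions were chosen first, which is the sense in which a tree can be interpreted to judge feature effectiveness. The accuracy printed here reflects only the toy data and says nothing about the result reported in the abstract.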


Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, W., Lin, F., Li, J., Zhang, B. (2001). A Tree-Based Model of Prosodic Phrasing for Chinese Text-to-Speech Systems. In: Shum, HY., Liao, M., Chang, SF. (eds) Advances in Multimedia Information Processing — PCM 2001. PCM 2001. Lecture Notes in Computer Science, vol 2195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45453-5_143

  • DOI: https://doi.org/10.1007/3-540-45453-5_143

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42680-6

  • Online ISBN: 978-3-540-45453-3

  • eBook Packages: Springer Book Archive
