An Improvement of Prosodic Characteristics in Vietnamese Text to Speech System

  • Conference paper
Knowledge and Systems Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 244)

Abstract

One important goal of a text-to-speech (TTS) system is to generate natural-sounding synthesized speech. To meet this goal, a variety of tasks are performed to model the prosodic aspects of the TTS voice. The task discussed here is part-of-speech (POS) and intonation tagging. The paper examines the effects of POS and intonation information on the naturalness of hidden Markov model (HMM) based synthesized speech when other resources are not available. It finds that, when a limited feature set is used for the HMM context labels, the POS and intonation tags improve the naturalness of the synthesized voice.
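As a rough illustration of what adding POS and intonation tags to HMM context labels can look like, the sketch below builds one label string per syllable, with and without the two extra tags. Everything in it is an assumption made for illustration and is not taken from the paper: the `Syllable` fields, the label layout (`prev-cur+next/T:…/POS:…/INT:…`), the tone numbering, and the tag names `P`, `V`, and `DECL` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Syllable:
    text: str        # base syllable, written without diacritics here
    tone: int        # tone index (hypothetical numbering, 1..6)
    pos: str         # POS tag of the word containing the syllable
    intonation: str  # phrase-level intonation tag


def context_labels(sylls: List[Syllable], use_prosody_tags: bool = True) -> List[str]:
    """Build one context label per syllable.

    The limited feature set is: neighbouring syllables and tone.
    With use_prosody_tags=True, POS and intonation fields are appended.
    """
    labels = []
    for i, s in enumerate(sylls):
        # "sil" marks the utterance boundary (assumed convention)
        prev_text = sylls[i - 1].text if i > 0 else "sil"
        next_text = sylls[i + 1].text if i < len(sylls) - 1 else "sil"
        label = f"{prev_text}-{s.text}+{next_text}/T:{s.tone}"
        if use_prosody_tags:
            label += f"/POS:{s.pos}/INT:{s.intonation}"
        labels.append(label)
    return labels


# Hypothetical three-syllable declarative utterance
utterance = [
    Syllable("toi", 1, "P", "DECL"),
    Syllable("di", 1, "V", "DECL"),
    Syllable("hoc", 6, "V", "DECL"),
]

baseline = context_labels(utterance, use_prosody_tags=False)
tagged = context_labels(utterance)
```

With the tags enabled, two syllables that share the same neighbours and tone but differ in POS or intonation receive different labels, so context-dependent models can be trained or clustered separately for them.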



Author information

Corresponding author

Correspondence to Thanh Son Phan.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Phan, T.S., Dinh, A.T., Vu, T.T., Luong, C.M. (2014). An Improvement of Prosodic Characteristics in Vietnamese Text to Speech System. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 244. Springer, Cham. https://doi.org/10.1007/978-3-319-02741-8_10

  • DOI: https://doi.org/10.1007/978-3-319-02741-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02740-1

  • Online ISBN: 978-3-319-02741-8

  • eBook Packages: Engineering, Engineering (R0)
