Multilingual Speech Corpora for TTS System Development

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Abstract

In this paper, four speech corpora collected in the Speech Lab of NCTU in recent years are discussed. They include a Mandarin tree-bank speech corpus, a Min-Nan speech corpus, a Hakka speech corpus, and a Chinese-English mixed speech corpus. Currently, they are used separately to develop a corpus-based Mandarin TTS system, a Min-Nan TTS system, a Hakka TTS system, and a Chinese-English bilingual TTS system. These systems will be integrated in the future to construct a multilingual TTS system covering the four primary languages used in Taiwan.




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hsiao, HC., Yu, HM., Wang, YR., Chen, SH. (2006). Multilingual Speech Corpora for TTS System Development. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_75

  • DOI: https://doi.org/10.1007/11939993_75

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49665-6

  • Online ISBN: 978-3-540-49666-3

  • eBook Packages: Computer Science (R0)
