A New Spectral Smoothing Algorithm for Unit Concatenating Speech Synthesis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3809)

Abstract

Unit concatenation from a large speech database is presently the most popular method for speech synthesis. In this approach, mismatches at the unit boundaries are unavoidable and are one cause of quality degradation. This paper proposes an algorithm that reduces undesired discontinuities between consecutive units. Optimal matching points are calculated in two steps: first, the Kullback-Leibler distance is used for spectral matching; then unit sliding and overlap windowing are used for waveform matching. The proposed algorithm is implemented in a corpus-based, unit-concatenating Korean text-to-speech system that has an automatically labeled database. Experimental results show that the algorithm performs better than both raw concatenation and the overlap smoothing method.
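The two-step idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything specific here is an assumption made for illustration: the symmetric form of the Kullback-Leibler distance over normalized power spectra, the frame length, hop and search range, the normalized-correlation criterion for sliding, the raised-cosine crossfade, and all function names.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL distance between two power spectra, each normalized
    to sum to 1 so it can be treated as a distribution (assumed form)."""
    p = p / (p.sum() + eps) + eps
    q = q / (q.sum() + eps) + eps
    return float(np.sum((p - q) * np.log(p / q)))

def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame."""
    return np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2

def best_spectral_match(left, right, frame_len=256, hop=64, search=4):
    """Step 1 (spectral matching): compare the last `search` frames of
    `left` with the first `search` frames of `right` and return the
    start indices (a, b) of the pair with the smallest distance."""
    best = (np.inf, 0, 0)
    for i in range(search):
        a = len(left) - frame_len - i * hop
        spec_a = power_spectrum(left[a:a + frame_len])
        for j in range(search):
            b = j * hop
            d = symmetric_kl(spec_a, power_spectrum(right[b:b + frame_len]))
            if d < best[0]:
                best = (d, a, b)
    return best[1], best[2]

def slide_and_overlap(left, right, a, b, overlap=128, max_shift=40):
    """Step 2 (waveform matching): slide the right unit by up to
    `max_shift` samples to maximize normalized correlation with the
    left unit, then crossfade the two over `overlap` samples."""
    tail = left[a:a + overlap]
    best_shift, best_score = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        start = b + s
        if start < 0 or start + overlap > len(right):
            continue  # shift would run outside the right unit
        seg = right[start:start + overlap]
        score = np.dot(tail, seg) / (np.linalg.norm(tail) * np.linalg.norm(seg) + 1e-12)
        if score > best_score:
            best_shift, best_score = s, score
    start = b + best_shift
    # Raised-cosine crossfade; the two weights sum to 1 at every sample.
    fade_in = 0.5 - 0.5 * np.cos(np.pi * np.arange(overlap) / overlap)
    mixed = tail * (1.0 - fade_in) + right[start:start + overlap] * fade_in
    return np.concatenate([left[:a], mixed, right[start + overlap:]])

def smooth_join(left, right):
    """Run both steps and return the concatenated waveform."""
    a, b = best_spectral_match(left, right)
    return slide_and_overlap(left, right, a, b)
```

Called as `smooth_join(left, right)` on two one-dimensional float arrays of at least a few hundred samples each, the sketch returns a single joined waveform. Raw concatenation corresponds to `np.concatenate([left, right])`, and plain overlap smoothing corresponds to skipping the spectral search and the sliding step.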



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, SJ., Jang, K.A., Han, H.B., Hahn, M. (2005). A New Spectral Smoothing Algorithm for Unit Concatenating Speech Synthesis. In: Zhang, S., Jarvis, R. (eds) AI 2005: Advances in Artificial Intelligence. AI 2005. Lecture Notes in Computer Science, vol 3809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11589990_57

  • DOI: https://doi.org/10.1007/11589990_57

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30462-3

  • Online ISBN: 978-3-540-31652-7

  • eBook Packages: Computer Science, Computer Science (R0)